Pose-invariant face recognition system and process
4 Assignments
0 Petitions
Abstract
A face recognition system and process for identifying a person depicted in an input image and their face pose. This system and process entails locating and extracting face regions belonging to known people from a set of model images, and determining the face pose for each of the face regions extracted. All the extracted face regions are preprocessed by normalizing, cropping, categorizing and finally abstracting them. More specifically, the images are normalized and cropped to show only a person's face, categorized according to the face pose of the depicted person's face by assigning them to one of a series of face pose ranges, and abstracted preferably via an eigenface approach. The preprocessed face images are preferably used to train a neural network ensemble having a first stage made up of a bank of face recognition neural networks each of which is dedicated to a particular pose range, and a second stage constituting a single fusing neural network that is used to combine the outputs from each of the first stage neural networks. Once trained, the input of a face region which has been extracted from an input image and preprocessed (i.e., normalized, cropped and abstracted) will cause just one of the output units of the fusing portion of the neural network ensemble to become active. The active output unit indicates either the identity of the person whose face was extracted from the input image and the associated face pose, or that the identity of the person is unknown to the system.
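The preprocessing pipeline the abstract describes (normalize and crop each extracted face region, then categorize it by pose range) can be sketched as follows. The pose-range boundaries, target scale, and nearest-neighbor resampling are illustrative assumptions; the patent leaves these as "prescribed" design choices, and the eye-alignment step is omitted for brevity:

```python
import numpy as np

# Hypothetical pose ranges (degrees of head yaw); the patent does not
# fix the number or width of the ranges.
POSE_RANGES = [(-90, -30), (-30, -10), (-10, 10), (10, 30), (30, 90)]

def categorize_pose(yaw_degrees):
    """Assign a face pose to one of the prescribed pose ranges."""
    for idx, (lo, hi) in enumerate(POSE_RANGES):
        if lo <= yaw_degrees < hi:
            return idx
    return len(POSE_RANGES) - 1  # clamp extreme poses to the last range

def normalize_and_crop(face_region, target_shape=(32, 32)):
    """Resize an extracted face region to the prescribed scale via
    nearest-neighbor sampling, standing in for the patent's
    normalize-and-crop step."""
    h, w = face_region.shape
    th, tw = target_shape
    rows = np.arange(th) * h // th
    cols = np.arange(tw) * w // tw
    return face_region[np.ix_(rows, cols)]
```

A frontal face (yaw near zero) lands in the middle pose range, and every normalized region comes out at the same prescribed scale, which is what allows the later concatenation into fixed-length column vectors.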
142 Citations
9 Claims
1. A computer-implemented face recognition process for identifying a person depicted in an input image, comprising using a computer to perform the following process actions:
creating a database of a plurality of model image characterizations, each of which represents the face of a known person that it is desired to identify in the input image as well as the person's face pose;
training a neural network ensemble to identify a person and their face pose from a region which has been extracted from said input image and characterized in a manner similar to the plurality of model images, wherein the network ensemble comprises, a first stage having a plurality of classifiers each of which has input and output units and is dedicated to a particular pose range and outputs a measure indicative of the similarity between said characterized input image region and each of said model image characterizations associated with the particular pose range of the classifier, and a fusing neural network as its second stage which combines the outputs of the classifiers to generate an output indicative of the person associated with the characterized input image region and the face pose of that person and which has at least enough output units to allow a different output to represent each person it is desired to identify at each of the pose ranges, and wherein training the neural network ensemble comprises, preparing each model image characterization from a model image depicting the face of a known person that it is desired to identify in the input image by, extracting the portion of the model image depicting said face, normalizing the extracted portion of the model image by resizing it to a prescribed scale if not already at the prescribed scale and adjusting the region so that the eye locations of the depicted subject fall within a prescribed area, and cropping the extracted portion of the model image by eliminating unneeded portions of the image not specifically depicting part of the face of the subject to create a model face image, categorizing the model face images by assigning each to one of a set of pose ranges into which its associated face pose falls, and for each pose range, choosing a prescribed number of the model face images of each person being modeled which have been assigned to the selected pose range, concatenating each of the chosen model face images to create a respective dimensional column vector (DCV) for each, computing a covariance matrix from the DCVs, calculating eigenvectors and corresponding eigenvalues from the covariance matrix, ranking the eigenvalues in descending order, identifying a prescribed number of the top eigenvalues, using the eigenvectors corresponding to the identified eigenvalues to form the rows of a basis vector matrix (BVM) for the pose range, and multiplying each DCV by each BVM to produce a set of principal components analysis (PCA) coefficient vectors for each model face image, and for each face recognition neural network, inputting, one at a time, each of the PCA coefficient vectors associated with the pose range of the face recognition neural network into the inputs of the network until the outputs of the network stabilize, initializing the fusing neural network for training, for each DCV, simultaneously inputting the PCA coefficient vectors generated from the DCV into the respective face recognition neural network associated with the vector's particular pose range group until all the PCA coefficient vectors of every DCV have been input, and repeating until the outputs of the fusing neural network stabilize, and for each DCV, simultaneously inputting the PCA coefficient vectors generated from the DCV into the respective face recognition neural network associated with the vector's particular pose range group and assigning the active output of the fusing neural network as corresponding to the particular person and pose associated with the model image used to create the set of PCA coefficient vectors; and
employing the network ensemble to identify the person associated with the characterized input image region and the face pose of that person.
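The eigenface steps recited in claim 1 (concatenate chosen face images into DCVs, compute a covariance matrix, rank the eigenvalues, take the eigenvectors of the top ones as BVM rows, multiply DCVs by the BVM) map directly onto standard linear algebra. A minimal NumPy sketch, with the mean-centering step made explicit (the claim does not spell it out) and `num_components` standing in for the "prescribed number" of top eigenvalues:

```python
import numpy as np

def build_bvm(dcvs, num_components):
    """Build the basis vector matrix (BVM) for one pose range.
    dcvs: (d, n) array with one dimensional column vector (DCV)
    per chosen model face image assigned to that pose range."""
    centered = dcvs - dcvs.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / dcvs.shape[1]  # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # rank eigenvalues descending
    top = order[:num_components]                 # prescribed number of top eigenvalues
    return eigvecs[:, top].T                     # eigenvectors form the BVM rows

def pca_coefficients(dcv, bvm):
    """Multiply a DCV by a BVM to get one PCA coefficient vector."""
    return bvm @ dcv
```

Per the claim, each DCV is multiplied by every pose range's BVM, so each model face image ends up with a set of PCA coefficient vectors, one per pose range.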
3. A face recognition system for identifying a person depicted in an input image, comprising:
a general purpose computing device; and
a computer program comprising program modules executable by the computing device, wherein the computing device is directed by the program modules of the computer program to, capture model images, each of which depicts at least one person of known identity, locate and extract regions within the model images, each of which depicts the face of a known person that it is desired to identify in the input image, determine the face pose for each of the face regions extracted from the model images, categorize each face region by assigning each to one of a set of pose ranges into which its associated face pose falls, train a neural network ensemble to identify a person and their face pose from a region that depicts the face of a person which has been extracted from said input image, wherein the network ensemble comprises, a first stage having a plurality of classifiers each of which has input and output units and is dedicated to a particular pose range and outputs a measure indicative of the similarity between said input image region and each of said model image regions associated with the particular pose range of the classifier, and a fusing neural network as its second stage which combines the outputs of the classifiers to generate an output indicative of the person associated with the characterized input image region and the face pose of that person and which has at least enough output units to allow a different output to represent each person it is desired to identify at each of the pose ranges, and wherein training the neural network ensemble comprises, (a) preparing each face region extracted from said model images by normalizing and cropping the extracted regions, wherein said normalizing comprises resizing each extracted face region to the same prescribed scale if not already at the prescribed scale and adjusting each region so that the eye locations of the depicted subject fall within the same prescribed area, and wherein said cropping comprises eliminating unneeded portions of the image not specifically depicting part of the face of the subject, (b) selecting a previously unselected one of the set of pose ranges, (c) choosing a prescribed number of the prepared face images of each person being modeled which have been assigned to the selected pose range, (d) concatenating each of the chosen prepared face images to create a respective dimensional column vector (DCV) for each, (e) computing a covariance matrix from the DCVs, (f) calculating eigenvectors and corresponding eigenvalues from the covariance matrix, (g) ranking the eigenvalues in descending order, (h) identifying a prescribed number of the top eigenvalues, (i) using the eigenvectors corresponding to the identified eigenvalues to form the rows of a basis vector matrix (BVM) for the selected pose range, (j) repeating actions (b) through (i) for each remaining pose range, (k) multiplying each DCV by each BVM to produce a set of principal components analysis (PCA) coefficient vectors for each face image, (l) for each face recognition neural network, inputting, one at a time, each of the PCA coefficient vectors associated with the pose range of the face recognition neural network into the inputs of the network until the outputs of the network stabilize, (m) initializing the fusing neural network for training, (n) for each DCV, simultaneously inputting the PCA coefficient vectors generated from the DCV into the respective face recognition neural network associated with the vector's particular pose range group until all the PCA coefficient vectors of every DCV have been input, and repeating until the outputs of the fusing neural network stabilize, and (o) for each DCV, simultaneously inputting the PCA coefficient vectors generated from the DCV into the respective face recognition neural network associated with the vector's particular pose range group and assigning the active output of the fusing neural network as corresponding to the particular person and pose associated with the model image used to create the set of PCA coefficient vectors, and employ the network ensemble to identify the person associated with the characterized input image region and their face pose.
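The two-stage ensemble the claims describe can be sketched structurally: each first-stage classifier scores the PCA coefficient vector for its own pose range, and the fusing network combines all first-stage outputs so that one output unit, encoding a (person, pose range) pair, becomes active. The patent fixes no architecture, so the tiny tanh/softmax networks below (`TwoLayerNet`, `ensemble_forward`) are illustrative stand-ins only:

```python
import numpy as np

rng = np.random.default_rng(0)

class TwoLayerNet:
    """Tiny feedforward stand-in for one first-stage face recognition
    classifier or the second-stage fusing network (assumed shape; the
    patent does not prescribe an architecture)."""
    def __init__(self, n_in, n_hidden, n_out):
        self.w1 = rng.normal(scale=0.1, size=(n_hidden, n_in))
        self.w2 = rng.normal(scale=0.1, size=(n_out, n_hidden))

    def forward(self, x):
        h = np.tanh(self.w1 @ x)
        z = self.w2 @ h
        e = np.exp(z - z.max())
        return e / e.sum()  # one similarity score per known person

def ensemble_forward(classifiers, fusing_net, pca_vectors):
    """First stage: each pose-range classifier scores the PCA coefficient
    vector for its pose range. Second stage: the fusing network combines
    all first-stage outputs; the most active unit is the answer."""
    stage1 = np.concatenate([clf.forward(v)
                             for clf, v in zip(classifiers, pca_vectors)])
    fused = fusing_net.forward(stage1)
    return int(np.argmax(fused))  # index of the active output unit
```

With, say, 3 pose ranges and 4 known people, the fusing network needs at least 12 output units so that a distinct unit can represent each person at each pose range, as the claims require.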
5. A computer-readable memory for use in identifying a person depicted in an input image, comprising:
a computer-readable storage medium; and
a computer program comprising program modules stored in the storage medium, wherein the storage medium is so configured by the computer program that it causes a computer to, input model images, each of which depicts at least one person of known identity, locate and extract regions within the model images, each of which depicts the face of a known person that it is desired to identify in the input image, determine the face pose for each of the face regions extracted from the model images, categorize each face region by assigning each to one of a set of pose ranges into which its associated face pose falls, train a neural network ensemble to identify a person and their face pose from a region that depicts the face of a person which has been extracted from said input image, wherein the network ensemble comprises, a first stage having a plurality of classifiers each of which has input and output units and is dedicated to a particular pose range and outputs a measure indicative of the similarity between said input image region and each of said model image regions associated with the particular pose range of the classifier, and a fusing neural network as its second stage which combines the outputs of the classifiers to generate an output indicative of the person associated with the characterized input image region and the face pose of that person and which has at least enough output units to allow a different output to represent each person it is desired to identify at each of the pose ranges, and wherein training the neural network ensemble comprises, (a) preparing each face region extracted from said model images by normalizing and cropping the extracted regions, wherein said normalizing comprises resizing each extracted face region to the same prescribed scale if not already at the prescribed scale and adjusting each region so that the eye locations of the depicted subject fall within the same prescribed area, and wherein said cropping comprises eliminating unneeded portions of the image not specifically depicting part of the face of the subject, (b) selecting a previously unselected one of the set of pose ranges, (c) choosing a prescribed number of the prepared face images of each person being modeled which have been assigned to the selected pose range, (d) concatenating each of the chosen prepared face images to create a respective dimensional column vector (DCV) for each, (e) computing a covariance matrix from the DCVs, (f) calculating eigenvectors and corresponding eigenvalues from the covariance matrix, (g) ranking the eigenvalues in descending order, (h) identifying a prescribed number of the top eigenvalues, (i) using the eigenvectors corresponding to the identified eigenvalues to form the rows of a basis vector matrix (BVM) for the selected pose range, (j) repeating actions (b) through (i) for each remaining pose range, (k) multiplying each DCV by each BVM to produce a set of principal components analysis (PCA) coefficient vectors for each face image, (l) for each face recognition neural network, inputting, one at a time, each of the PCA coefficient vectors associated with the pose range of the face recognition neural network into the inputs of the network until the outputs of the network stabilize, (m) initializing the fusing neural network for training, (n) for each DCV, simultaneously inputting the PCA coefficient vectors generated from the DCV into the respective face recognition neural network associated with the vector's particular pose range group until all the PCA coefficient vectors of every DCV have been input, and repeating until the outputs of the fusing neural network stabilize, and (o) for each DCV, simultaneously inputting the PCA coefficient vectors generated from the DCV into the respective face recognition neural network associated with the vector's particular pose range group and assigning the active output of the fusing neural network as corresponding to the particular person and pose associated with the model image used to create the set of PCA coefficient vectors, and employ the network ensemble to identify the person associated with the characterized input image region and their face pose.
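The claims train each network by repeatedly presenting its training vectors "until the outputs of the network stabilize", without fixing a learning rule. One simple reading, using a delta-rule update on a single linear layer (an illustrative choice, not the patent's), stops when the outputs change by less than a tolerance between passes:

```python
import numpy as np

def train_until_stable(n_in, n_out, inputs, targets,
                       lr=0.5, tol=1e-4, max_epochs=500):
    """Repeatedly present the training vectors and update a single
    linear layer with the delta rule, stopping once the outputs
    change by less than `tol` between epochs -- a stand-in for the
    claims' 'until the outputs of the network stabilize'.
    inputs: (n, n_in) rows; targets: (n, n_out) rows."""
    weights = np.zeros((n_out, n_in))
    prev = None
    for _ in range(max_epochs):
        outs = inputs @ weights.T                                # present every vector
        weights += lr * (targets - outs).T @ inputs / len(inputs)  # delta-rule update
        if prev is not None and np.abs(outs - prev).max() < tol:
            break  # outputs have stabilized
        prev = outs
    return weights
```

In the claimed ordering, the first-stage networks are trained this way one pose range at a time (step (l)), and only then is the fusing network initialized and trained on their combined outputs (steps (m)-(o)).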
7. A computer-implemented face recognition process for identifying a person depicted in an input image, comprising using a computer to perform the following process actions:
creating a database of a plurality of model image characterizations, each of which represents the face of a known person that it is desired to identify in the input image as well as the person's face pose;
training a neural network ensemble to identify a person and their face pose from a region which has been extracted from said input image and characterized in a manner similar to the plurality of model images, wherein the network ensemble comprises, a first stage having a plurality of classifiers each of which has input and output units and is dedicated to a particular pose range and outputs a measure indicative of the similarity between said characterized input image region and each of said model image characterizations associated with the particular pose range of the classifier, and a fusing neural network as its second stage which combines the outputs of the classifiers to generate an output indicative of the person associated with the characterized input image region and the face pose of that person, and wherein training the network ensemble comprises, deriving each model image characterization from a set of model images of people where each model image of the same person shows that person at a different face pose, said deriving comprising, extracting the portion of each model image depicting a face, normalizing the extracted portion of each model image by resizing it to a prescribed scale if not already at the prescribed scale and adjusting the region so that the eye locations of the depicted subject fall within a prescribed area, cropping the extracted portion of each model image by eliminating unneeded portions of the image not specifically depicting part of the face of the subject to create a model face image, concatenating each of the model face images to create a respective model dimensional column vector (DCV) for each, categorizing the model DCVs by assigning each to one of a set of pose ranges into which its associated face pose falls, and inputting the model DCV of each model face image falling in a particular pose range, one at a time, to a pre-selected classifier dedicated to the particular pose range, initializing the fusing neural network for training, simultaneously inputting the respective DCV of each model face image into all classifiers, until the DCV of every model image has been input, and repeating until the outputs of the fusing neural network stabilize, and simultaneously inputting the respective DCV of each model face image into all classifiers, and assigning the active output of the fusing neural network as corresponding to the particular person and pose associated with the model image used to create the DCV; and
employing the network ensemble to identify the person associated with the characterized input image region and the face pose of that person.
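Every independent claim requires the fusing network's active output unit to encode one (person, pose) pair. Under an assumed person-major layout of those units (the patent does not fix an ordering), decoding the active unit is a divmod:

```python
def decode_active_output(unit_index, num_pose_ranges):
    """Map the index of the fusing network's active output unit back
    to a (person, pose range) pair, assuming units are laid out
    person-major: unit = person * num_pose_ranges + pose_range.
    (The layout is a hypothetical convention for illustration.)"""
    return unit_index // num_pose_ranges, unit_index % num_pose_ranges
```

With 3 pose ranges, active unit 7 decodes to person 2 at pose range 1; any units beyond the (person, pose) grid can be reserved for the "identity unknown" outcome the abstract mentions.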
Specification