Method and device for Quasi-Gibbs structure sampling by deep permutation for person identity inference
First Claim
1. A method for visual appearance based person identity inference, comprising:
obtaining a plurality of input images, wherein the input images include a gallery set of images containing persons-of-interest and a probe set of images containing person detections, and one input image corresponds to one person;
extracting N feature maps from the input images using a Deep Neural Network (DNN), N being a natural number;
constructing N structure samples of the N feature maps using conditional random field (CRF) graphical models, comprising:
for a feature map, constructing an initial graph structure by K Nearest Neighbor (KNN) based on feature similarity in a feature space corresponding to the feature map, the graph model including nodes and edges, a node representing one person;
performing structure permutations by a plurality of iterations of KNN computation in N feature spaces with a Quasi-Gibbs Structure Sampling (QGSS) process;
assigning labels to the nodes that minimize a conditional random field (CRF) energy function over all possible labels, wherein the all possible labels represent all different persons-of-interest in the gallery set; and
deriving the N structure samples from the plurality of iterations and the assigned labels;
learning the N structure samples from an implicit common latent feature space embedded in the N structure samples; and
according to the learned structures, identifying one or more images from the probe set containing a same person-of-interest as an image in the gallery set.
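The initial graph construction step in the claim can be illustrated with a minimal sketch. It assumes each image has already been reduced to a fixed-length feature vector and that similarity is measured by Euclidean distance; the function name `knn_graph`, the 2-D toy features, and the choice of distance are illustrative assumptions, not details taken from the claim. The sketch builds a single initial graph and does not implement the QGSS permutation process.

```python
import numpy as np

def knn_graph(features: np.ndarray, k: int) -> set:
    """Return undirected edges linking each node to its k nearest neighbors.

    features: array of shape (num_nodes, dim), one row per person image.
    Euclidean distance is an assumed similarity measure.
    """
    # Pairwise squared Euclidean distances between all feature vectors.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a node is never its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    edges = set()
    for i, row in enumerate(neighbors):
        for j in row:
            # Store each undirected edge once, smaller index first.
            edges.add((min(i, int(j)), max(i, int(j))))
    return edges

# Four images (nodes) in a hypothetical 2-D feature space.
feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
edges = knn_graph(feats, k=1)
```

With N feature maps, the same function could be called once per feature space to obtain N candidate graph structures over the same set of nodes.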
Abstract
The present disclosure provides a method and device for visual appearance based person identity inference. The method may include obtaining a plurality of input images. The input images include a gallery set of images containing persons-of-interest and a probe set of images containing person detections, and one input image corresponds to one person. The method may further include extracting N feature maps from the input images using a Deep Neural Network, N being a natural number; constructing N structure samples of the N feature maps using conditional random field (CRF) graphical models; learning the N structure samples from an implicit common latent feature space embedded in the N structure samples; and according to the learned structures, identifying one or more images from the probe set containing a same person-of-interest as an image in the gallery set.
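The final identification step described in the abstract can be pictured as a simple grouping, assuming every probe image has already been assigned the index of a gallery identity. The function name `probes_per_identity` and the identity strings are hypothetical and serve only as illustration.

```python
from collections import defaultdict

def probes_per_identity(probe_labels, gallery_ids):
    """Group probe image indices by the gallery identity assigned to them.

    probe_labels[i] is the index into gallery_ids assigned to probe image i.
    """
    matches = defaultdict(list)
    for probe_idx, label in enumerate(probe_labels):
        matches[gallery_ids[label]].append(probe_idx)
    # Include identities with no matching probe image as empty lists.
    return {gid: matches.get(gid, []) for gid in gallery_ids}

result = probes_per_identity([0, 0, 1], ["person_A", "person_B", "person_C"])
```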
14 Claims
1. A method for visual appearance based person identity inference, recited in full under First Claim above. Dependent claims: 2, 3, 4, 5, 6, 7.
8. A device for visual appearance based person identity inference, comprising one or more processors configured to:
obtain a plurality of input images, wherein the input images include a gallery set of images containing persons-of-interest and a probe set of images containing person detections, and one input image corresponds to one person;
extract N feature maps from the input images using a Deep Neural Network (DNN), N being a natural number;
construct N structure samples of the N feature maps using conditional random field (CRF) graphical models, comprising:
for a feature map, constructing an initial graph structure by K Nearest Neighbor (KNN) based on feature similarity in a feature space corresponding to the feature map, the graph model including nodes and edges, a node representing one person;
performing structure permutations by a plurality of iterations of KNN computation in N feature spaces with a Quasi-Gibbs Structure Sampling (QGSS) process;
assigning labels to the nodes that minimize a conditional random field (CRF) energy function over all possible labels, wherein the all possible labels represent all different persons-of-interest in the gallery set; and
deriving the N structure samples from the plurality of iterations and the assigned labels;
learn the N structure samples from an implicit common latent feature space embedded in the N structure samples; and
according to the learned structures, identify one or more images from the probe set containing a same person-of-interest as an image in the gallery set.
Dependent claims: 9, 10, 11, 12, 13, 14.
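The label assignment step in the claims refers to minimizing a CRF energy function but does not give its form here. The sketch below assumes a common form, a per-node unary cost plus a Potts pairwise penalty on graph edges, and uses iterated conditional modes (ICM) as a stand-in optimizer. Both the energy form and ICM are assumptions for illustration; ICM reaches only a local minimum and is not claimed to be the optimizer used in the patent. The names `icm_labels`, `unary`, and `lam` are hypothetical.

```python
import numpy as np

def icm_labels(unary, edges, lam=1.0, iters=10):
    """Greedy label assignment for an assumed energy
    E(y) = sum_i unary[i, y_i] + lam * sum_{(i,j) in edges} [y_i != y_j].

    unary[i, l]: cost of giving node i the gallery identity l.
    edges: iterable of (i, j) node pairs, e.g. from a KNN graph.
    """
    unary = np.asarray(unary, dtype=float)
    n, num_labels = unary.shape
    labels = unary.argmin(axis=1)  # start from the unary-only optimum
    nbrs = [[] for _ in range(n)]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    all_labels = np.arange(num_labels)
    for _ in range(iters):
        changed = False
        for i in range(n):
            cost = unary[i].copy()
            for j in nbrs[i]:
                # Potts penalty for disagreeing with neighbor j's label.
                cost += lam * (all_labels != labels[j])
            best = int(cost.argmin())
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break  # no node changed its label: local minimum reached
    return labels

# Three nodes, two candidate identities; nodes 0 and 1 are graph neighbors.
unary = [[0.0, 1.0], [0.6, 0.4], [1.0, 0.0]]
labels = icm_labels(unary, edges=[(0, 1)], lam=0.5)
```

In this toy case node 1 slightly prefers identity 1 on its own, but the pairwise term pulls it toward the identity of its neighbor, node 0.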
Specification