Method and device for Quasi-Gibbs structure sampling by deep permutation for person identity inference

US 10,339,408 B2
Filed: 12/22/2016
Issued: 07/02/2019
Est. Priority Date: 12/22/2016
Status: Active Grant

First Claim

1. A method for visual appearance based person identity inference, comprising:

obtaining a plurality of input images, wherein the input images include a gallery set of images containing persons-of-interest and a probe set of images containing person detections, and one input image corresponds to one person;

extracting N feature maps from the input images using a Deep Neural Network (DNN), N being a natural number;

constructing N structure samples of the N feature maps using conditional random field (CRF) graphical models, comprising:

for a feature map, constructing an initial graph structure by K Nearest Neighbor (KNN) based on feature similarity in a feature space corresponding to the feature map, the graph model including nodes and edges, a node representing one person;

performing structure permutations by a plurality of iterations of KNN computation in N feature spaces with a Quasi-Gibbs Structure Sampling (QGSS) process;

assigning labels to the nodes that minimize a conditional random field (CRF) energy function over all possible labels, wherein the all possible labels represent all different persons-of-interest in the gallery set; and

deriving the N structure samples from the plurality of iterations and the assigned labels;

learning the N structure samples from an implicit common latent feature space embedded in the N structure samples; and

according to the learned structures, identifying one or more images from the probe set containing a same person-of-interest as an image in the gallery set.
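
The initial graph construction step of claim 1 (a KNN graph built from feature similarity, one node per person) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the function name `knn_graph`, the use of squared Euclidean distance as the similarity measure, and the choice to store edges as undirected pairs.

```python
import numpy as np


def knn_graph(features, k):
    """Return undirected edges linking each node to its k nearest neighbors.

    features: array of shape (num_nodes, dim), one row per person image
              (hypothetical layout, assumed for this sketch).
    k:        number of neighbors per node.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    # Pairwise squared Euclidean distances (assumed similarity measure).
    diffs = features[:, None, :] - features[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diffs, diffs)
    np.fill_diagonal(dist, np.inf)  # a node is never its own neighbor
    edges = set()
    for i in range(n):
        # Indices of the k closest nodes to node i.
        for j in np.argsort(dist[i], kind="stable")[:k]:
            j = int(j)
            edges.add((min(i, j), max(i, j)))  # store each edge once
    return edges


# Two well-separated pairs of points: each point links to its partner.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
print(sorted(knn_graph(points, k=1)))
```

In the claimed method one such graph would be built per feature map, so a sketch like this would be called once for each of the N feature spaces.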

Abstract

The present disclosure provides a method and device for visual appearance based person identity inference. The method may include obtaining a plurality of input images. The input images include a gallery set of images containing persons-of-interest and a probe set of images containing person detections, and one input image corresponds to one person. The method may further include extracting N feature maps from the input images using a Deep Neural Network, N being a natural number; constructing N structure samples of the N feature maps using conditional random field (CRF) graphical models; learning the N structure samples from an implicit common latent feature space embedded in the N structure samples; and according to the learned structures, identifying one or more images from the probe set containing a same person-of-interest as an image in the gallery set.
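
Claims 2 and 9 state that the results of the first a iterations are discarded and that the structure samples are derived from the later b iterations, which resembles the burn-in period commonly used in Gibbs-style sampling. The sketch below shows only that bookkeeping. It is not the patent's QGSS process: the function name `sample_with_burn_in` and the `step` callback (standing in for whatever one iteration computes) are hypothetical.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def sample_with_burn_in(step: Callable[[int], T], a: int, b: int) -> List[T]:
    """Run a + b iterations of `step`; discard the first a results, keep the later b.

    step: hypothetical callback that performs iteration t and returns its result.
    a:    number of initial (burn-in) iterations whose results are discarded.
    b:    number of later iterations whose results are kept.
    """
    if a < 0 or b < 0:
        raise ValueError("a and b must be non-negative")
    kept = []
    for t in range(a + b):
        result = step(t)  # every iteration runs, including the burn-in ones
        if t >= a:
            kept.append(result)  # only the later b results are retained
    return kept


# Toy step that returns t squared: iterations 0 and 1 are discarded.
print(sample_with_burn_in(lambda t: t * t, a=2, b=3))
```

The burn-in iterations are still executed because, in a sampler, later iterations typically depend on the state left behind by earlier ones; only their outputs are dropped.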

14 Claims

1. A method for visual appearance based person identity inference, comprising:
- obtaining a plurality of input images, wherein the input images include a gallery set of images containing persons-of-interest and a probe set of images containing person detections, and one input image corresponds to one person;
  
  extracting N feature maps from the input images using a Deep Neural Network (DNN), N being a natural number;
  
  constructing N structure samples of the N feature maps using conditional random field (CRF) graphical models, comprising:
  
  for a feature map, constructing an initial graph structure by K Nearest Neighbor (KNN) based on feature similarity in a feature space corresponding to the feature map, the graph model including nodes and edges, a node representing one person;
  
  performing structure permutations by a plurality of iterations of KNN computation in N feature spaces with a Quasi-Gibbs Structure Sampling (QGSS) process;
  
  assigning labels to the nodes that minimize a conditional random field (CRF) energy function over all possible labels, wherein the all possible labels represent all different persons-of-interest in the gallery set; and
  
  deriving the N structure samples from the plurality of iterations and the assigned labels;
  
  learning the N structure samples from an implicit common latent feature space embedded in the N structure samples; and
  
  according to the learned structures, identifying one or more images from the probe set containing a same person-of-interest as an image in the gallery set.
  - 2. The method according to claim 1, wherein:
    - the plurality of iterations include first a iterations and later b iterations, wherein a and b are natural numbers;
      
      results of the first a iterations are discarded, and the N structure samples are derived from the later b iterations.
  - 3. The method according to claim 1, wherein:
    - a node in the graph model has m possible states, m representing a quantity of different persons-of-interest in the gallery set.
  - 4. The method according to claim 1, wherein:
    - the labels are assigned to the nodes according to the graph structure after the plurality of iterations are finished.
  - 5. The method according to claim 1, wherein:
    - a graph of a CRF model representing a re-identification structure is learned through the N structure samples; and
      
      an energy minimization with sparse approach is performed to cut the graph into a plurality of clusters, each cluster containing images corresponding to one of the persons-of-interest.
  - 6. The method according to claim 1, wherein:
    - N different kernels are used in the DNN for convolutions with the images in the gallery set and the probe set; and
      
      the N feature maps are produced by a last couple of convolution layers in the DNN.
  - 7. The method according to claim 5, wherein the CRF model with pairwise potentials is:

8. A device for visual appearance based person identity inference, comprising one or more processors configured to:
- obtain a plurality of input images, wherein the input images include a gallery set of images containing persons-of-interest and a probe set of images containing person detections, and one input image corresponds to one person;
  
  extract N feature maps from the input images using a Deep Neural Network (DNN), N being a natural number;
  
  construct N structure samples of the N feature maps using conditional random field (CRF) graphical models, comprising:
  
  for a feature map, constructing an initial graph structure by K Nearest Neighbor (KNN) based on feature similarity in a feature space corresponding to the feature map, the graph model including nodes and edges, a node representing one person;
  
  performing structure permutations by a plurality of iterations of KNN computation in N feature spaces with a Quasi-Gibbs Structure Sampling (QGSS) process;
  
  assigning labels to the nodes that minimize a conditional random field (CRF) energy function over all possible labels, wherein the all possible labels represent all different persons-of-interest in the gallery set; and
  
  deriving the N structure samples from the plurality of iterations and the assigned labels;
  
  learn the N structure samples from an implicit common latent feature space embedded in the N structure samples; and
  
  according to the learned structures, identify one or more images from the probe set containing a same person-of-interest as an image in the gallery set.
  - 9. The device according to claim 8, wherein:
    - the plurality of iterations include first a iterations and later b iterations, wherein a and b are natural numbers;
      
      results of the first a iterations are discarded, and the N structure samples are derived from the later b iterations.
  - 10. The device according to claim 8, wherein:
    - a node in the graph model has m possible states, m representing a quantity of different persons-of-interest in the gallery set.
  - 11. The device according to claim 8, wherein:
    - the labels are assigned to the nodes according to the graph structure after the plurality of iterations are finished.
  - 12. The device according to claim 8, wherein:
    - a graph of a CRF model representing a re-identification structure is learned through the N structure samples; and
      
      an energy minimization with sparse approach is performed to cut the graph into a plurality of clusters, each cluster containing images corresponding to one of the persons-of-interest.
  - 13. The device according to claim 8, wherein:
    - N different kernels are used in the DNN for convolutions with the images in the gallery set and the probe set; and
      
      the N feature maps are produced by a last couple of convolution layers in the DNN.
  - 14. The device according to claim 12, wherein the CRF model with pairwise potentials is:

Current Assignee: TCL Research America, Inc. (TCL Technology Group Corp.)
Original Assignee: TCL Research America, Inc. (TCL Technology Group Corp.)
Inventors: Liao, Xinpeng; Sun, Xinyao; Ren, Xiaobo; Wang, Haohong
Primary Examiner(s): Lee, John W
Application Number: US15/388,039
Publication Number: US 20180181842A1
Time in Patent Office: 922 Days
CPC Class Codes

G06F 18/24143   Distances to neighbourhood ...

G06F 18/24147   Distances to closest patter...

G06F 18/29   Graphical models, e.g. Baye...

G06V 10/454   Integrating the filters int...

G06V 10/764   using classification, e.g. ...

G06V 10/82   using neural networks

G06V 10/84   using probabilistic graphic...

G06V 20/13   Satellite images

G06V 20/52   Surveillance or monitoring ...
