Recognition and reconstruction of objects with partial appearance

US 10,460,470 B2
Filed: 07/06/2017
Issued: 10/29/2019
Est. Priority Date: 07/06/2017
Status: Active Grant

Abstract

Various embodiments include systems and methods structured to provide recognition of an object in an image using a learning module trained using decomposition of the object into components in a number of training images. The training can be based on an overall objectness score of the object, an objectness score of each component of the object, a pose of the object, and a pose of each component of the object for each training image input. Additional systems and methods can be implemented in a variety of applications.

18 Claims

1. A system operable to recognize and reconstruct objects in images, the system comprising:
- a non-transitory memory storage comprising instructions; and
  
  one or more processors in communication with the non-transitory memory storage, wherein the one or more processors execute the instructions to perform operations comprising:
  
  acquiring a plurality of images, each image including an object within the image, the object having a different appearance in a number of images of the plurality of images;
  
  decomposing, for each image, the object into components and generating label data for each image;
  
  inputting the images and associated label data into a learning module to train the learning module to recognize the components of the object, the training being based on an overall objectness score of the object, an objectness score of each component of the object, a pose of the object, and a pose of each component of the object for each image input;
  
  inputting an additional image into the trained learning module;
  
  detecting that the additional image has one or more of the components of the object within the additional image in response to input into the trained learning module; and
  
  identifying the object and estimating its pose information in the additional image and/or constructing a complete view of the object in the additional image or a complete sketch of the object in response to detection of the one or more of the components, wherein training the learning module includes minimizing a cost function generated as a sum of discrepancies between the label data and computed data for a set of items, the set of items including the overall objectness score, the objectness scores of the components, the pose of the object, and the poses of the components.
  - 2. The system of claim 1, wherein decomposing the object in the images into components includes using a user interface to provide user input to decompose the object into the components.
  - 3. The system of claim 2, wherein providing user input to decompose the object into components includes using one or more actions from a set of actions, the set including using heuristics to decompose the object, using a guess of a best fit from generated or matched existing decomposed candidates, and manually decomposing a number of typical examples.
  - 4. The system of claim 1, wherein the learning module is a deep neural network.
  - 5. The system of claim 1, wherein the plurality of images includes one or more of a full appearance of the object, a partial appearance of the object, an orientation variation of the object, or an appearance variation of the object.
  - 6. The system of claim 1, wherein acquiring the plurality of images includes creating training images of the object with partial appearance of the object in the training images and labeling the training images with pose information from creating the training images or labeling the training images with pose information based on deformation descriptors from conducting an object template matching.
  - 7. The system of claim 1, wherein constructing the complete view includes applying pose information from detecting that the additional image has one or more of the components of the object within the additional image and the additional image as input to a generative adversarial network to reconstruct a full appearance of the object.
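
The cost function recited in claim 1 is described only as a sum of discrepancies between label data and computed data over four items: the overall objectness score, the component objectness scores, the object pose, and the component poses. A minimal sketch of that sum follows. The squared-error discrepancy, the four-number pose encoding, and the names `Annotation`, `pose_gap`, and `cost` are assumptions made for illustration; the claim does not specify any of them.

```python
from dataclasses import dataclass
from typing import List, Sequence

Pose = Sequence[float]  # assumed pose encoding, e.g. (x, y, width, height)


@dataclass
class Annotation:
    """Label data or computed data for one image (hypothetical structure)."""
    object_score: float            # overall objectness score of the object
    component_scores: List[float]  # objectness score of each component
    object_pose: Pose              # pose of the object
    component_poses: List[Pose]    # pose of each component


def squared_gap(a: float, b: float) -> float:
    # Assumed discrepancy measure; the claim only says "discrepancies".
    return (a - b) ** 2


def pose_gap(p: Pose, q: Pose) -> float:
    # Discrepancy between two poses, summed over their coordinates.
    return sum(squared_gap(a, b) for a, b in zip(p, q))


def cost(label: Annotation, computed: Annotation) -> float:
    """Sum of discrepancies over the four items named in claim 1."""
    total = squared_gap(label.object_score, computed.object_score)
    total += sum(
        squared_gap(a, b)
        for a, b in zip(label.component_scores, computed.component_scores)
    )
    total += pose_gap(label.object_pose, computed.object_pose)
    total += sum(
        pose_gap(p, q)
        for p, q in zip(label.component_poses, computed.component_poses)
    )
    return total
```

In this sketch, identical label and computed data give a cost of zero, and training would adjust the learning module to drive the sum down across all training images.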

8. A system operable to recognize and reconstruct objects in images, the system comprising:
- a non-transitory memory storage comprising instructions; and
  
  one or more processors in communication with the non-transitory memory storage, wherein the one or more processors execute the instructions to perform operations comprising:
  
  acquiring a plurality of images, each image including an object within the image, the object having a different appearance in a number of images of the plurality of images;
  
  decomposing, for each image, the object into components and generating label data for each image;
  
  inputting the images and associated label data into a learning module to train the learning module to recognize the components of the object, the training being based on an overall objectness score of the object, an objectness score of each component of the object, a pose of the object, and a pose of each component of the object for each image input;
  
  inputting an additional image into the trained learning module;
  
  detecting that the additional image has one or more of the components of the object within the additional image in response to input into the trained learning module; and
  
  identifying the object and estimating its pose information in the additional image and/or constructing a complete view of the object in the additional image or a complete sketch of the object in response to detection of the one or more of the components, wherein the learning module is a deep neural network and training the deep neural network includes:
  
  inputting the images and associated label data to a first convolution layer of a plurality of convolution layers i, i=1, . . . n−1, n arranged in series such that input to each convolution layer includes an output of a previous convolution layer in the series beginning with convolution layer 2, convolution layer 1 being the first convolution layer;
  
  generating, from the output of convolution layer n−1, region proposals for each component;
  
  generating, from the output of convolution layer n, region proposals for the object in whole; and
  
  performing region of interest pooling using the region proposals for each component, output of convolution layers n−1 and n, and the region proposals for the object in whole.
  - 9. The system of claim 8, wherein the region of interest pooling includes a consistency checking among the region proposals for the object in whole and the region proposals for the components to discard inconsistent region proposals.
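
Claim 9 does not say what makes region proposals "inconsistent". One plausible reading, sketched below purely as an assumption, is geometric: a component proposal is kept only if most of its area lies inside some whole-object proposal, and a whole-object proposal is kept only if it contains at least one surviving component proposal. The box format, the `min_inside` threshold, and the function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed box format


def area(box: Box) -> float:
    # Area of a box; degenerate or inverted boxes count as zero.
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def inside_fraction(inner: Box, outer: Box) -> float:
    """Fraction of `inner`'s area that lies within `outer`."""
    overlap = (
        max(inner[0], outer[0]),
        max(inner[1], outer[1]),
        min(inner[2], outer[2]),
        min(inner[3], outer[3]),
    )
    inner_area = area(inner)
    return area(overlap) / inner_area if inner_area > 0 else 0.0


def discard_inconsistent(
    object_boxes: List[Box],
    component_boxes: List[Box],
    min_inside: float = 0.8,  # assumed threshold, not taken from the patent
) -> Tuple[List[Box], List[Box]]:
    """Keep only proposals that agree under the assumed containment rule."""
    kept_components = [
        c for c in component_boxes
        if any(inside_fraction(c, o) >= min_inside for o in object_boxes)
    ]
    kept_objects = [
        o for o in object_boxes
        if any(inside_fraction(c, o) >= min_inside for c in kept_components)
    ]
    return kept_objects, kept_components
```

Other consistency rules, such as checking that component poses agree with the object pose, would fit the claim language equally well; containment is used here only because it is simple to state.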

10. A computer-implemented method comprising:
- acquiring a plurality of images, each image including an object within the image, the object having a different appearance in a number of images of the plurality of images;
  
  decomposing, for each image, the object into components and generating label data for each image;
  
  inputting the images and associated label data into a learning module to train the learning module to recognize the components of the object, the training being based on an overall objectness score of the object, an objectness score of each component of the object, a pose of the object, and a pose of each component of the object for each image input;
  
  inputting an additional image into the trained learning module;
  
  detecting that the additional image has one or more of the components of the object within the additional image in response to input into the trained learning module; and
  
  identifying the object and estimating its pose information in the additional image and/or constructing a complete view of the object in the additional image or a complete sketch of the object in response to detection of the one or more of the components, wherein training the learning module includes minimizing a cost function generated as a sum of discrepancies between the label data and computed data for a set of items, the set of items including the overall objectness score, the objectness scores of the components, the pose of the object, and the poses of the components.
  - 11. The computer-implemented method of claim 10, wherein decomposing the object in the images into components includes using a user interface to provide user input to decompose the object into the components.
  - 12. The computer-implemented method of claim 11, wherein providing user input to decompose the object into components includes using one or more actions from a set of actions, the set including using heuristics to decompose the object, using a guess of a best fit from generated or matched existing decomposed candidates, and manually decomposing a number of typical examples.
  - 13. The computer-implemented method of claim 10, wherein the learning module is a deep neural network.
  - 14. The computer-implemented method of claim 13, wherein training the deep neural network includes:
    
    inputting the images and associated label data to a first convolution layer of a plurality of convolution layers i, i=1, . . . n−1, n arranged in series such that input to each convolution layer includes an output of a previous convolution layer in the series beginning with convolution layer 2, convolution layer 1 being the first convolution layer;
    
    generating, from the output of convolution layer n−1, region proposals for each component;
    
    generating, from the output of convolution layer n, region proposals for the object in whole; and
    
    performing region of interest pooling using the region proposals for each component, output of convolution layers n−1 and n, and the region proposals for the object in whole.
  - 15. The computer-implemented method of claim 10, wherein acquiring the plurality of images includes acquiring one or more of a full appearance of the object, a partial appearance of the object, an orientation variation of the object, or an appearance variation of the object.
  - 16. The computer-implemented method of claim 10, wherein acquiring the plurality of images includes creating training images of the object with partial appearance of the object in the training images and labeling the training images with pose information from creating the training images or labeling the training images with pose information based on deformation descriptors from conducting an object template matching.
  - 17. The computer-implemented method of claim 10, wherein constructing the complete view includes making a correspondence between an object template and a detected object having a partial appearance.
  - 18. The computer-implemented method of claim 10, wherein constructing the complete view includes applying pose information from detecting that the additional image has one or more of the components of the object within the additional image and the additional image as input to a generative adversarial network to construct a full appearance of the object.
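
Claims 6 and 16 recite creating training images in which the object appears only partially and labeling them with pose information known from the creation step. The sketch below illustrates one way such labels might be derived when a partial view is produced by cropping a fully annotated scene. Everything here is an assumption for illustration: the crop-based creation step, the box-style pose encoding, the `min_visible` threshold, and the names `Label`, `visible_fraction`, and `make_partial_label`.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels, assumed pose encoding


@dataclass
class Label:
    """Label data for one created training image (hypothetical structure)."""
    object_score: float
    object_pose: Box
    component_scores: Dict[str, float]
    component_poses: Dict[str, Box]


def visible_fraction(box: Box, window: Box) -> float:
    """Fraction of `box` that remains inside the crop `window`."""
    width = min(box[2], window[2]) - max(box[0], window[0])
    height = min(box[3], window[3]) - max(box[1], window[1])
    full = (box[2] - box[0]) * (box[3] - box[1])
    if width <= 0 or height <= 0 or full <= 0:
        return 0.0
    return (width * height) / full


def make_partial_label(
    object_box: Box,
    components: Dict[str, Box],
    window: Box,
    min_visible: float = 0.5,  # assumed visibility threshold
) -> Label:
    """Label a training image created by cropping the scene to `window`."""
    dx, dy = window[0], window[1]

    def shift(b: Box) -> Box:
        # Express a pose in the coordinate frame of the cropped image.
        return (b[0] - dx, b[1] - dy, b[2] - dx, b[3] - dy)

    scores = {
        name: 1.0 if visible_fraction(b, window) >= min_visible else 0.0
        for name, b in components.items()
    }
    poses = {name: shift(b) for name, b in components.items()}
    # The full object pose is kept even where it extends past the crop,
    # since it is known exactly from how the image was created.
    return Label(1.0, shift(object_box), scores, poses)
```

Because the crop window is chosen by the creation step, the poses of the object and of every component are known without manual annotation, including for parts that fall outside the visible region.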

Current Assignee: Futurewei Technologies Incorporated (Huawei Investment & Holding Co., Ltd.)
Original Assignee: Futurewei Technologies Incorporated (Huawei Investment & Holding Co., Ltd.)
Inventors: Liu, Lifeng; Yin, Xiaotian; Zhu, Yingxuan; Zhang, Jun; Li, Jian
Primary Examiner(s): Goradia, Shefali D
Application Number: US15/643,453
Publication Number: US 20190012802A1
Time in Patent Office: 845 Days
CPC Class Codes

G06F 18/214   Generating training pattern...

G06F 18/24143   Distances to neighbourhood ...

G06T 2207/20081   Training; Learning

G06T 2207/20084   Artificial neural networks ...

G06T 7/73   using feature-based methods

G06V 10/82   using neural networks

G06V 20/10   Terrestrial scenes scenes u...

G06V 20/70   Labelling scene content, e....

G06V 30/19147   Obtaining sets of training ...

G06V 30/19173   Classification techniques

G06V 30/2504   Coarse or fine approaches, ...
