Three-dimensional scene reconstruction based on contextual analysis
First Claim
1. A processor-implemented method for 3-Dimensional (3D) scene reconstruction, the method comprising:
receiving, by a processor, a plurality of 3D image frames of a scene, each frame comprising a red-green-blue (RGB) image frame comprising color pixels and a depth map frame, the depth map frame comprising depth pixels, wherein each of the 3D image frames is associated with a pose of a depth camera that generated the 3D image frames;
projecting, by the processor, the depth pixels into points in a global coordinate system based on the camera pose;
accumulating, by the processor, the projected points into a 3D reconstruction of the scene;
detecting, by the processor, objects for each 3D image frame, based on the camera pose, the 3D reconstruction, the RGB image frame and the depth map frame;
classifying, by the processor, the detected objects into one or more object classes;
grouping, by the processor, a plurality of instances of objects in one of the object classes based on a measure of similarity of features between the object instances, wherein the grouping comprises
detecting features associated with surfaces of each of first and second object instances,
applying feature descriptors to the detected features,
matching descriptors between the first and second object instances, and
pairing the first object instance with the second object instance, the first and second object instances associated with a greatest number of descriptor matches;
validating the paired object instances by calculating a distance between the matched descriptors of the paired object instances and comparing the distance to a threshold value; and
combining, by the processor, point clouds associated with each of the first and second object instances to generate a fused object.
Abstract
Techniques are provided for context-based 3D scene reconstruction employing fusion of multiple instances of an object within the scene. A methodology implementing the techniques according to an embodiment includes receiving 3D image frames of the scene, each frame associated with a pose of a depth camera, and creating a 3D reconstruction of the scene based on depth pixels that are projected and accumulated into a global coordinate system. The method may also include detecting objects, based on the 3D reconstruction, the camera pose and the image frames. The method may further include classifying the detected objects into one or more object classes; grouping two or more instances of objects in one of the object classes based on a measure of similarity of features between the object instances; and combining point clouds associated with each of the grouped object instances to generate a fused object.
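The projection and accumulation step described in the abstract — lifting depth pixels into points in a global coordinate system using the camera pose — can be sketched as below. This is an illustrative implementation, not the patent's: it assumes a pinhole camera model with intrinsics `fx, fy, cx, cy` and a 4x4 camera-to-world pose matrix, none of which the abstract specifies.

```python
import numpy as np

def project_depth_to_global(depth, fx, fy, cx, cy, pose):
    """Back-project a depth map into 3D points in the global frame.

    depth : (H, W) array of depth values; zeros are treated as invalid.
    fx, fy, cx, cy : pinhole intrinsics (assumed known from calibration).
    pose : (4, 4) camera-to-world rigid transform for this frame.
    Returns an (N, 3) array of points in the global coordinate system.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    z = depth[valid]
    # Pinhole back-projection into the camera frame.
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)  # homogeneous
    # Rigid transform into the global coordinate system.
    pts_world = (pose @ pts_cam.T).T[:, :3]
    return pts_world
```

Accumulating the reconstruction then amounts to concatenating the per-frame outputs of this function over all received frames.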
20 Claims
1. A processor-implemented method for 3-Dimensional (3D) scene reconstruction, the method comprising:
receiving, by a processor, a plurality of 3D image frames of a scene, each frame comprising a red-green-blue (RGB) image frame comprising color pixels and a depth map frame, the depth map frame comprising depth pixels, wherein each of the 3D image frames is associated with a pose of a depth camera that generated the 3D image frames;
projecting, by the processor, the depth pixels into points in a global coordinate system based on the camera pose;
accumulating, by the processor, the projected points into a 3D reconstruction of the scene;
detecting, by the processor, objects for each 3D image frame, based on the camera pose, the 3D reconstruction, the RGB image frame and the depth map frame;
classifying, by the processor, the detected objects into one or more object classes;
grouping, by the processor, a plurality of instances of objects in one of the object classes based on a measure of similarity of features between the object instances, wherein the grouping comprises detecting features associated with surfaces of each of first and second object instances, applying feature descriptors to the detected features, matching descriptors between the first and second object instances, and pairing the first object instance with the second object instance, the first and second object instances associated with a greatest number of descriptor matches;
validating the paired object instances by calculating a distance between the matched descriptors of the paired object instances and comparing the distance to a threshold value; and
combining, by the processor, point clouds associated with each of the first and second object instances to generate a fused object.
View Dependent Claims (2, 3, 4, 5, 6, 7)
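The grouping step of claim 1 — matching descriptors between instances and pairing the two instances with the greatest number of matches — can be sketched as below. The claim does not name a matching method; this sketch assumes nearest-neighbor matching over Euclidean descriptor distance with a Lowe-style ratio test and a hypothetical absolute distance cutoff (`max_dist`).

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8, max_dist=0.5):
    """Match descriptors of two object instances by nearest neighbor.

    desc_a : (Na, D) descriptors from the first object instance.
    desc_b : (Nb, D) descriptors from the second object instance.
    Returns a list of (i, j) index pairs of matched descriptors.
    """
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        best = order[0]
        if dists[best] > max_dist:
            continue  # hypothetical absolute cutoff on match distance
        # Lowe-style ratio test when a second candidate exists.
        if len(order) > 1 and dists[best] >= ratio * dists[order[1]]:
            continue
        matches.append((i, int(best)))
    return matches

def pair_best_instances(instances):
    """Pair the two instances with the greatest number of descriptor matches.

    instances : list of (Ni, D) descriptor arrays, one per object instance.
    Returns (idx_a, idx_b, matches) for the winning pair.
    """
    best = (None, None, [])
    for a in range(len(instances)):
        for b in range(a + 1, len(instances)):
            m = match_descriptors(instances[a], instances[b])
            if len(m) > len(best[2]):
                best = (a, b, m)
    return best
```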
8. An electronic system for 3-Dimensional (3D) scene reconstruction, the system comprising:
a 3D reconstruction circuit to receive a plurality of 3D image frames of a scene, each frame comprising a red-green-blue (RGB) image frame comprising color pixels and a depth map frame, the depth map frame comprising depth pixels, wherein each of the 3D image frames is associated with a pose of a depth camera that generated the 3D image frames, the 3D reconstruction circuit further to project the depth pixels into points in a global coordinate system based on the camera pose and accumulate the projected points into a 3D reconstruction of the scene;
an object detection circuit to detect objects for each 3D image frame, based on the camera pose, the 3D reconstruction, the RGB image frame and the depth map frame;
an object recognition circuit to classify the detected objects into one or more object classes;
a feature detection circuit to detect features associated with surfaces of each of first and second object instances;
a feature descriptor application circuit to apply feature descriptors to the detected features;
a descriptor matching circuit to match descriptors between the first and second object instances;
an instance pairing circuit to pair the first object instance with the second object instance, the first and second object instances associated with the greatest number of descriptor matches;
an instance registration circuit to register the paired object instances by computing a rigid transformation to map shared regions between the paired object instances;
a registration confirmation circuit to validate the registration of the paired object instances by calculating a distance between the matched descriptors of the paired object instances and comparing the distance to a threshold value; and
a context based fusion circuit to group a plurality of instances of objects in one of the object classes based on a measure of similarity of features between the object instances and to combine point clouds associated with each of the first and second object instances to generate a fused object.
View Dependent Claims (9, 10, 11, 12, 13)
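The instance registration circuit of claim 8 computes a rigid transformation mapping shared regions between the paired instances. The claim does not prescribe an algorithm; one common way to realize this step over matched point correspondences is the Kabsch/Procrustes least-squares solution, sketched below as an illustration.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) mapping src points onto dst.

    src, dst : (N, 3) arrays of corresponding points (e.g. the 3D
    locations of matched features on the two paired instances).
    Returns rotation R (3, 3) and translation t (3,) with dst ~ R @ src + t.
    """
    c_src = src.mean(axis=0)
    c_dst = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    # Guard against a reflection in the SVD solution.
    if np.linalg.det(R) < 0:
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t
```

Applying the returned transform to one instance's point cloud brings its shared region into alignment with the other instance before fusion.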
14. At least one non-transitory computer readable storage medium having instructions encoded thereon that, when executed by one or more processors, result in the following operations for 3-Dimensional (3D) scene reconstruction, the operations comprising:
receiving a plurality of 3D image frames of a scene, each frame comprising a red-green-blue (RGB) image frame comprising color pixels and a depth map frame, the depth map frame comprising depth pixels, wherein each of the 3D image frames is associated with a pose of a depth camera that generated the 3D image frames;
projecting the depth pixels into points in a global coordinate system based on the camera pose;
accumulating the projected points into a 3D reconstruction of the scene;
detecting objects for each 3D image frame, based on the camera pose, the 3D reconstruction, the RGB image frame and the depth map frame;
classifying the detected objects into one or more object classes;
grouping a plurality of instances of objects in one of the object classes based on a measure of similarity of features between the object instances, wherein the grouping comprises detecting features associated with surfaces of each of first and second object instances, applying feature descriptors to the detected features, matching descriptors between the first and second object instances, and pairing the first object instance with the second object instance, the first and second object instances associated with a greatest number of descriptor matches;
validating the paired object instances by calculating a distance between the matched descriptors of the paired object instances and comparing the distance to a threshold value; and
combining point clouds associated with each of the first and second object instances to generate a fused object.
View Dependent Claims (15, 16, 17, 18, 19, 20)
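The final validation and fusion steps recited in each independent claim — comparing a distance between matched descriptors to a threshold value, then combining the two instances' point clouds — can be sketched as below. The threshold value and the use of the mean match distance are hypothetical choices; the claims leave both unspecified.

```python
import numpy as np

def validate_and_fuse(cloud_a, cloud_b, desc_dists, threshold=0.25):
    """Validate a candidate instance pairing and fuse the two point clouds.

    cloud_a, cloud_b : (Na, 3), (Nb, 3) points already in a shared frame
    (e.g. after registration of the paired instances).
    desc_dists : distances between the matched descriptors of the pair.
    threshold : hypothetical acceptance bound on the mean match distance.
    Returns the fused (Na+Nb, 3) cloud, or None if validation fails.
    """
    if np.mean(desc_dists) >= threshold:
        return None  # pairing rejected; keep the instances separate
    return np.vstack([cloud_a, cloud_b])
```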
Specification