Simultaneous localization and mapping using multiple view feature descriptors

US 7,831,094 B2
Filed: 12/22/2004
Issued: 11/09/2010
Est. Priority Date: 04/27/2004
Status: Expired due to Fees

First Claim

Patent Images

1. A method for simultaneous localization and mapping of a position of a camera, comprising the steps of:

receiving a sequence of images from the camera having an incremental baseline change between two images, the sequence of images describing a three-dimensional environment surrounding the camera;

generating, using a feature tracking module included in a computing device, one or more training feature descriptors using an approximate Kernel Principal Analysis (KPCA), a training feature descriptor based on the incremental baseline change between the two images in the sequence of images received from the camera, and the training feature vector associated with position information describing a location of an object within the three dimensional environment relative to the camera;

creating, using a structure-from-motion module included in the computing device, a three-dimensional map of the three dimensional environment captured by the sequence of images using an extended Kalman Filter, and the one or more training feature descriptors, the map including position information associated with one or more training feature descriptors describing locations of one or more objects within the three dimensional environment relative to the camera;

receiving a recognition image containing a wide baseline appearance variation relative to at least a last image from the sequence of images;

extracting a recognition feature descriptor from the recognition image using the feature tracking module; and

determining a position of the camera within the three-dimensional map, using the feature tracking module, by matching the recognition feature descriptor to a training feature descriptor from the one or more training feature descriptors and identifying a position information associated with the training feature descriptor from the one or more training feature descriptors which describes a location of an identified object within the three dimensional environment relative to the camera.

View all claims

2 Assignments

Timeline View

Assignment View

0 Petitions

Accused Products

Abstract

Simultaneous localization and mapping (SLAM) utilizes multiple view feature descriptors to robustly determine location despite appearance changes that would stifle conventional systems. A SLAM algorithm generates a feature descriptor for a scene from different perspectives using kernel principal component analysis (KPCA). When the SLAM module subsequently receives a recognition image after a wide baseline change, it can refer to correspondences from the feature descriptor to continue map building and/or determine location. Appearance variations can result from, for example, a change in illumination, partial occlusion, a change in scale, a change in orientation, change in distance, warping, and the like. After an appearance variation, a structure-from-motion module uses feature descriptors to reorient itself and continue map building using an extended Kalman Filter. Through the use of a database of comprehensive feature descriptors, the SLAM module is also able to refine a position estimation despite appearance variations.

43 Citations

View as Search Results

45 Claims

1. A method for simultaneous localization and mapping of a position of a camera, comprising the steps of:
- receiving a sequence of images from the camera having an incremental baseline change between two images, the sequence of images describing a three-dimensional environment surrounding the camera;
  
  generating, using a feature tracking module included in a computing device, one or more training feature descriptors using an approximate Kernel Principal Analysis (KPCA), a training feature descriptor based on the incremental baseline change between the two images in the sequence of images received from the camera, and the training feature vector associated with position information describing a location of an object within the three dimensional environment relative to the camera;
  
  creating, using a structure-from-motion module included in the computing device, a three-dimensional map of the three dimensional environment captured by the sequence of images using an extended Kalman Filter, and the one or more training feature descriptors, the map including position information associated with one or more training feature descriptors describing locations of one or more objects within the three dimensional environment relative to the camera;
  
  receiving a recognition image containing a wide baseline appearance variation relative to at least a last image from the sequence of images;
  
  extracting a recognition feature descriptor from the recognition image using the feature tracking module; and
  
  determining a position of the camera within the three-dimensional map, using the feature tracking module, by matching the recognition feature descriptor to a training feature descriptor from the one or more training feature descriptors and identifying a position information associated with the training feature descriptor from the one or more training feature descriptors which describes a location of an identified object within the three dimensional environment relative to the camera.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18)
- - 2. The method of claim 1, wherein said generating a training feature descriptor further comprises:
    - selecting a tracking feature from a plurality of tracking points derived from the sequence of images; and
      
      matching the tracking points over the sequence of images.
  - 3. The method of claim 2, wherein the step of selecting the tracking feature comprises:
    - selecting the tracking feature using an affine invariant tracker.
  - 4. The method of claim 2, wherein the step of generating the training feature descriptor comprises:
    - building the training feature descriptor from the tracking feature using approximate Kernel Principal Component Analysis.
  - 5. The method of claim 1, wherein the step of creating a map comprises:
    - generating a three-dimensional map using an extended Kalman Filter.
  - 6. The method of claim 1 , further comprising:
    - storing the training feature descriptor.
  - 7. The method of claim 1 , wherein the step of matching the recognition feature descriptor to the training feature descriptor comprises:
    - matching the recognition feature descriptor to the training feature descriptor through wide baseline correspondences.
  - 8. The method of claim 1, wherein the two images of the sequence of images exhibit small baseline viewpoint changes.
  - 9. The method of claim 1, wherein the step of matching the recognition feature descriptor to the training feature descriptor comprises:
    - projecting the recognition feature descriptor onto the training feature descriptor;
      
      determining a projection distance; and
      
      determining whether the projection distance falls below a threshold.
  - 10. The method of claim 1, further comprising:
    - estimating a current position of the camera based on an initial position and a distance traveled; and
      
      adjusting said estimation for drift based on a difference between the training feature descriptor and the recognition feature descriptor.
  - 11. The method of claim 1, wherein the appearance variation comprises at least one from the group containing a viewpoint variation, an illumination variation, a scale variation, a geometry variation and an occlusion.
  - 12. The method of claim 1, further comprising the step of receiving an initial position.
  - 13. The method of claim 1, further comprising the step of updating the map of the environment.
  - 14. The method of claim 1, wherein the camera is included in a camera-based computing system.
  - 15. The method of claim 14, wherein the camera-based computing system comprises a robot.
  - 16. The method of claim 1, wherein the camera is included in a camera based computing system.
  - 17. The method of claim 16, wherein the camera-based computing system comprises a robot.
  - 18. The method of claim 1, wherein the two images comprise consecutive images.

19. A system for determining a position of a camera using wide baseline matching, comprising:
- means for tracking within a sequence of images received from the camera having an incremental baseline change between two images, the sequence of images describing a three-dimensional environment surrounding the camera;
  
  means for receiving a recognition image from the camera and extracting a recognition feature descriptor from the recognition image;
  
  means for describing coupled to the means for tracking, the means for describing to generate one or more training feature descriptors using an approximate Kernel Principal Analysis (KPCA), a training feature descriptor based on the incremental baseline change between the two images in the sequence of images received from the camera and the training feature vector associated with position information describing a location of an object within the three dimensional environment relative to the camera;
  
  means for mapping a scene from the recognition image, coupled to the means for tracking and the means for matching, the means for mapping to create a three dimensional map of the three dimensional environment captured by the sequence of images using an extended Kalman Filter and the one or more training feature descriptors, said three dimensional map including position information associated with one or more feature vectors describing locations of one or more objects within the three dimensional environment relative to the camera;
  
  means for matching coupled to the means for tracking, the means for matching to match the recognition feature descriptor to a training feature descriptor from the one or more training feature descriptors and identifying a position information associated with the training feature descriptor from the one or more training feature descriptors which describes a location of an identified object within the three dimensional environment relative to the camera; and
  
  means for positioning, coupled to the means for mapping and the means for matching, the means for positioning to determine a position of the camera within the three dimensional map, and responsive to the appearance variation within the recognition image.
- View Dependent Claims (20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31)
- - 20. The system of claim 19, wherein the means for tracking selects a tracking feature based on a plurality of tracking points from the sequence of images, and matches the tracking points among one or more of the sequence of images.
  - 21. The system of claim 20, wherein the means for tracking selects the tracking feature using an affine invariant tracker.
  - 22. The system of claim 19, wherein the means for describing stores the training feature descriptor.
  - 23. The system of claim 19, wherein the means for matching matches the recognition feature descriptor to the training feature descriptor through wide baseline correspondences.
  - 24. The system of claim 19, wherein an image contains a small baseline viewpoint change with respect to a preceding image.
  - 25. The system of claim 19, wherein the means for matching projects the recognition feature descriptor onto the training feature descriptor, determines a projection distance, and determines whether the projection distance falls below a threshold.
  - 26. The system of claim 19, wherein the means for positioning estimates a current position of the camera based on an initial position and a distance traveled, and adjusts said estimate for drift based on a difference between the training feature descriptor and the recognition feature descriptor.
  - 27. The system of claim 19, wherein the appearance variation comprises at least one from the group containing:
    - a viewpoint variation, an illumination variation, a scale variation, a geometry variation and an occlusion.
  - 28. The system of claim 19, wherein the means for positioning receives an initial position of the camera.
  - 29. The system of claim 19, wherein the camera is included in a camera-based computing system.
  - 30. The system of claim 29, wherein the camera-based computing system comprises a robot.
  - 31. The system of claim 19, wherein the two images comprise consecutive images.

32. A computer program product, comprising a non-transitory computer-readable storage medium having computer program instructions and data embodied thereon for implementing a method for determining a position of a camera using wide baseline matching, the method comprising the steps of:
- receiving a sequence of images from the camera wherein there is an incremental change between two images, the sequence of images describing a three dimensional environment surrounding the camera;
  
  generating one or more training feature descriptors using an approximate Kernel Principal Analysis (KPCA), a training feature descriptor based on the incremental change between the two images in the sequence of images received from the camera and a training feature vector associated with position information describing a location of an object within the three dimensional environment relative to the camera;
  
  creating a three dimensional map of the three dimensional environment captured by the sequence of images using an extended Kalman Filter and the one or more training feature descriptors, the three dimensional map including position information associated with one or more training feature descriptors describing locations of one or more objects within the three dimensional environment relative to the camera;
  
  receiving a recognition image containing an appearance variation relative to at least a last image from the sequence of images;
  
  extracting a recognition feature descriptor from the recognition image; and
  
  determining a position of the camera within the map by matching the recognition feature descriptor to the training feature descriptor from the one or more training feature descriptors and identifying a position information associated with the training feature descriptor from the one or more training feature descriptors which describes a location of an identified object within the three dimensional environment relative to the camera.
- View Dependent Claims (33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45)
- - 33. The computer program product of claim 32, wherein said step of generating a training feature descriptor further comprises:
    - selecting a tracking feature from a plurality of tracking points derived from the sequence of images; and
      
      matching the tracking points over the sequence of images.
  - 34. The computer program product of claim 33, wherein the step of selecting the tracking feature comprises selecting the tracking feature using an affine invariant tracker.
  - 35. The computer program product of claim 32, further comprising the step of storing the training feature descriptor.
  - 36. The computer program product of claim 32, wherein the step of matching the recognition feature descriptor to the training feature descriptor comprises matching the recognition feature descriptor to the training feature descriptor through wide baseline correspondences.
  - 37. The computer program product of claim 32, wherein the two images of the sequence of images exhibit small baseline viewpoint changes.
  - 38. The computer program product of claim 32, wherein the step of matching the recognition feature descriptor to the training feature descriptor comprises:
    - projecting the recognition feature descriptor onto the training feature descriptor;
      
      determining a projection distance; and
      
      determining whether the projection distance falls below a threshold.
  - 39. The computer program product of claim 32, further comprising:
    - estimating a current position of the camera based on an initial position and a distance traveled; and
      
      adjusting said estimate for motion drift based on a difference between the training feature descriptor and the recognition feature descriptor.
  - 40. The computer program product of claim 32, wherein the appearance variation comprises at least one from the group containing:
    - a viewpoint variation, an illumination variation, a scale variation, a geometry variation, and an occlusion.
  - 41. The computer program product of claim 32, further comprising the step of receiving an initial position.
  - 42. The computer program product of claim 32, further comprising the step of updating the map of the environment.
  - 43. The computer program product of claim 32, wherein the camera is included in a camera-based computing system.
  - 44. The computer program product of claim 43, wherein the camera-based computing system comprises a robot.
  - 45. The computer program product of claim 32, wherein the two images comprise consecutive images.

Specification

Resources

Litigation Campaign Assessment

Current Assignee
Honda Motor Company
Original Assignee
Honda Motor Co., Ltd. (Honda Motor Company)
Inventors
Yang, Ming-Hsuan, Meltzer, Jason, Gupta, Rakesh
Primary Examiner(s)
Le; Brian Q
Assistant Examiner(s)
Park; Edward

Application Number

US11/021,672
Publication Number

US 20050238200A1
Time in Patent Office

2,148 Days
Field of Search

382/190, 382/208
US Class Current

382/190
CPC Class Codes

G06F 18/21355   nonlinear criteria, e.g. em...

G06F 18/28   Determining representative ...

G06V 10/7715   Feature extraction, e.g. by...

G06V 10/772   Determining representative ...

G06V 20/10   Terrestrial scenes scenes u...

G06V 20/64   Three-dimensional objects

G06V 2201/11   Technique with transformati...

Simultaneous localization and mapping using multiple view feature descriptors

First Claim

2 Assignments

0 Petitions

Accused Products

Abstract

43 Citations

45 Claims

Specification

Solutions

Use Cases

Quick Links

Simultaneous localization and mapping using multiple view feature descriptors

First Claim

2 Assignments

Subscription Required

Subscription Required

0 Petitions

Subscription Required

Accused Products

Subscription Required

Abstract

43 Citations

45 Claims

Specification

Subscription Required

Solutions

Use Cases

Quick Links