Simultaneous localization and mapping using multiple view feature descriptors
Abstract
Simultaneous localization and mapping (SLAM) utilizes multiple view feature descriptors to robustly determine location despite appearance changes that would defeat conventional systems. A SLAM algorithm generates a feature descriptor for a scene from different perspectives using kernel principal component analysis (KPCA). When the SLAM module subsequently receives a recognition image after a wide baseline change, it can refer to correspondences from the feature descriptor to continue map building and/or determine location. Appearance variations can result from, for example, a change in illumination, partial occlusion, a change in scale, a change in orientation, a change in distance, warping, and the like. After an appearance variation, a structure-from-motion module uses feature descriptors to reorient itself and continue map building using an extended Kalman Filter. Through the use of a database of comprehensive feature descriptors, the SLAM module is also able to refine a position estimate despite appearance variations.
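The patent does not publish source code, but the KPCA descriptor step the abstract describes can be sketched in outline. The following is an illustrative sketch only, not the patented implementation: it fits kernel PCA over appearance samples of one feature gathered from multiple views and returns a projection function that serves as the multiple-view descriptor. All function names, the RBF kernel choice, and the parameters are assumptions (test-point centering is omitted for brevity).

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    # Pairwise RBF kernel between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kpca_descriptor(training_patches, n_components=8, gamma=0.1):
    """Fit KPCA on appearance samples of one feature seen from multiple
    views; returns a projection function usable as the descriptor.
    Illustrative sketch only -- not the patent's implementation."""
    X = np.asarray(training_patches, dtype=float)
    K = rbf_kernel(X, X, gamma)
    n = K.shape[0]
    # Center the kernel matrix in feature space.
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)          # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]
    # Normalize eigenvectors so projections have unit feature-space scale.
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    def project(patch):
        k = rbf_kernel(np.atleast_2d(patch), X, gamma)
        return (k @ alphas).ravel()
    return project
```

A recognition patch projected through the same function can then be compared against stored training descriptors by distance, which is the matching step the claims recite.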
45 Claims
1. A method for simultaneous localization and mapping of a position of a camera, comprising the steps of:
receiving a sequence of images from the camera having an incremental baseline change between two images, the sequence of images describing a three-dimensional environment surrounding the camera;
generating, using a feature tracking module included in a computing device, one or more training feature descriptors using an approximate Kernel Principal Component Analysis (KPCA), each training feature descriptor based on the incremental baseline change between the two images in the sequence of images received from the camera and on a training feature vector associated with position information describing a location of an object within the three-dimensional environment relative to the camera;
creating, using a structure-from-motion module included in the computing device, a three-dimensional map of the three-dimensional environment captured by the sequence of images using an extended Kalman Filter and the one or more training feature descriptors, the map including position information associated with the one or more training feature descriptors describing locations of one or more objects within the three-dimensional environment relative to the camera;
receiving a recognition image containing a wide baseline appearance variation relative to at least a last image from the sequence of images;
extracting a recognition feature descriptor from the recognition image using the feature tracking module; and
determining a position of the camera within the three-dimensional map, using the feature tracking module, by matching the recognition feature descriptor to a training feature descriptor from the one or more training feature descriptors and identifying position information associated with the matched training feature descriptor which describes a location of an identified object within the three-dimensional environment relative to the camera. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18)
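The map-creation step recites an extended Kalman Filter without detailing it. As a minimal sketch of the standard EKF correction such a structure-from-motion module could apply (not the patent's implementation; all names are illustrative), one landmark observation is folded into the joint camera-pose/map estimate as follows:

```python
import numpy as np

def ekf_correct(mu, Sigma, z, h, H, R):
    """One extended Kalman filter correction step: fold an observation z
    of a mapped feature into the joint camera-pose/map state (mu, Sigma).
    h is the measurement model, H its Jacobian at mu, R the noise covariance.
    Illustrative sketch only."""
    y = z - h(mu)                        # innovation
    S = H @ Sigma @ H.T + R              # innovation covariance
    K = Sigma @ H.T @ np.linalg.inv(S)   # Kalman gain
    mu = mu + K @ y                      # corrected state estimate
    Sigma = (np.eye(len(mu)) - K @ H) @ Sigma  # reduced uncertainty
    return mu, Sigma
```

Each matched training feature descriptor supplies such an observation, pulling the state estimate toward the measurement and shrinking its covariance.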
19. A system for determining a position of a camera using wide baseline matching, comprising:
means for tracking within a sequence of images received from the camera having an incremental baseline change between two images, the sequence of images describing a three-dimensional environment surrounding the camera;
means for receiving a recognition image from the camera and extracting a recognition feature descriptor from the recognition image;
means for describing, coupled to the means for tracking, the means for describing to generate one or more training feature descriptors using an approximate Kernel Principal Component Analysis (KPCA), each training feature descriptor based on the incremental baseline change between the two images in the sequence of images received from the camera and on a training feature vector associated with position information describing a location of an object within the three-dimensional environment relative to the camera;
means for mapping a scene from the recognition image, coupled to the means for tracking and the means for matching, the means for mapping to create a three-dimensional map of the three-dimensional environment captured by the sequence of images using an extended Kalman Filter and the one or more training feature descriptors, said three-dimensional map including position information associated with one or more feature vectors describing locations of one or more objects within the three-dimensional environment relative to the camera;
means for matching, coupled to the means for tracking, the means for matching to match the recognition feature descriptor to a training feature descriptor from the one or more training feature descriptors and to identify position information associated with the matched training feature descriptor which describes a location of an identified object within the three-dimensional environment relative to the camera; and
means for positioning, coupled to the means for mapping and the means for matching, the means for positioning to determine a position of the camera within the three-dimensional map, responsive to an appearance variation within the recognition image. - View Dependent Claims (20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31)
32. A computer program product, comprising a non-transitory computer-readable storage medium having computer program instructions and data embodied thereon for implementing a method for determining a position of a camera using wide baseline matching, the method comprising the steps of:
receiving a sequence of images from the camera, wherein there is an incremental change between two images, the sequence of images describing a three-dimensional environment surrounding the camera;
generating one or more training feature descriptors using an approximate Kernel Principal Component Analysis (KPCA), each training feature descriptor based on the incremental change between the two images in the sequence of images received from the camera and on a training feature vector associated with position information describing a location of an object within the three-dimensional environment relative to the camera;
creating a three-dimensional map of the three-dimensional environment captured by the sequence of images using an extended Kalman Filter and the one or more training feature descriptors, the three-dimensional map including position information associated with the one or more training feature descriptors describing locations of one or more objects within the three-dimensional environment relative to the camera;
receiving a recognition image containing an appearance variation relative to at least a last image from the sequence of images;
extracting a recognition feature descriptor from the recognition image; and
determining a position of the camera within the map by matching the recognition feature descriptor to a training feature descriptor from the one or more training feature descriptors and identifying position information associated with the matched training feature descriptor which describes a location of an identified object within the three-dimensional environment relative to the camera. - View Dependent Claims (33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45)
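The final determining step reduces to nearest-neighbour matching of descriptors with stored position information. A minimal illustrative sketch of that step (not the claimed implementation; the ratio-test threshold and all names are assumptions, the ratio test itself being a standard guard against ambiguous matches under wide-baseline variation):

```python
import numpy as np

def match_and_localize(rec_desc, train_descs, positions, ratio=0.8):
    """Match a recognition descriptor against the training descriptors and
    return (index, stored 3-D position) of the matched feature, or None if
    the best match is ambiguous (Lowe-style ratio test).
    Illustrative sketch only."""
    dists = np.linalg.norm(train_descs - rec_desc, axis=1)
    order = np.argsort(dists)
    best, second = order[0], order[1]
    if dists[best] > ratio * dists[second]:
        return None  # best and second-best too close to trust
    return int(best), positions[best]
```

The returned position information, associated with the matched training feature descriptor, is what anchors the camera within the three-dimensional map.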
Specification