Machine learning based model localization system
First Claim
1. A computer-operated method of determining an angular field of view and estimating a pose of a source imaging sensor based on at least one two-dimensional (2D) image input comprising a plurality of image pixels of an observed scene, the method comprising:
(a) accessing an input data set comprising a 2D image data set to be analyzed and source imaging sensor parameter information,
(b) executing a Machine Learning algorithm that uses said 2D image data set of the input data set to generate estimated depth values for at least a portion of image pixels output by the source imaging sensor to provide real three-dimensional (3D) image points having associated depth values,
(c) in parallel with executing the Machine Learning algorithm, determining an angular field of view of the source imaging sensor based on the input data set and generating the source imaging sensor angular field of view as output,
(d) in response to the real 3D image points including the generated estimated depth values and the generated angular field of view, generating a source imaging sensor 3D pose estimate relative to the real 3D image points in the observed scene, and
(e) outputting the generated 3D pose estimate in conjunction with the estimated depth values.
2 Assignments
0 Petitions
Abstract
A method for deriving an image sensor's 3D pose estimate from a 2D scene image input includes at least one Machine Learning algorithm trained a priori to generate a 3D depth map estimate from the 2D image input. The depth map is used in conjunction with physical attributes of the source imaging device to estimate the imaging device's 3D location and orientation relative to the 3D content of the imaged scene. The system may optionally employ additional Machine Learning algorithms to recognize objects within the scene and infer further contextual information, such as the image sensor's pose relative to the floor plane or the gravity vector. The resultant refined imaging-device localization data can be applied to static (picture) or dynamic (video), 2D or 3D images, and is useful in many applications, most notably for improving the realism and accuracy of static and dynamic Augmented Reality (AR) applications.
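The abstract's optional refinement step (inferring the sensor pose relative to the floor plane) can be illustrated with a minimal sketch. This is not the patented implementation; `pose_relative_to_floor` is a hypothetical helper that assumes the input 3D points lie mostly on the floor and fits a plane to them by SVD, yielding the camera height and an estimate of the gravity-aligned floor normal in camera space.

```python
import numpy as np

def pose_relative_to_floor(points_3d):
    """Given Nx3 camera-space points assumed to lie mostly on the
    floor, fit a plane and return (camera_height, floor_normal).
    The camera sits at the origin of camera space."""
    centroid = points_3d.mean(axis=0)
    # Plane normal = right singular vector with the smallest
    # singular value of the centered point cloud.
    _, _, vt = np.linalg.svd(points_3d - centroid)
    normal = vt[-1]
    # Orient the normal to point from the floor toward the camera.
    if np.dot(normal, -centroid) < 0:
        normal = -normal
    # Unsigned distance from the camera (origin) to the plane.
    height = abs(np.dot(normal, centroid))
    return height, normal

# Synthetic floor 1.5 m below a level camera (y axis points down).
rng = np.random.default_rng(0)
floor = np.column_stack([
    rng.uniform(-2, 2, 500),   # x: left/right
    np.full(500, 1.5),         # y: 1.5 m below the camera
    rng.uniform(1, 5, 500),    # z: in front of the camera
])
h, n = pose_relative_to_floor(floor)
print(round(h, 2))  # 1.5
```

In practice the floor points would first be segmented out (e.g. by the object-recognition step the abstract mentions) rather than assumed.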
17 Citations
20 Claims
1. A computer-operated method of determining an angular field of view and estimating a pose of a source imaging sensor based on at least one two-dimensional (2D) image input comprising a plurality of image pixels of an observed scene, the method comprising:
(a) accessing an input data set comprising a 2D image data set to be analyzed and source imaging sensor parameter information,
(b) executing a Machine Learning algorithm that uses said 2D image data set of the input data set to generate estimated depth values for at least a portion of image pixels output by the source imaging sensor to provide real three-dimensional (3D) image points having associated depth values,
(c) in parallel with executing the Machine Learning algorithm, determining an angular field of view of the source imaging sensor based on the input data set and generating the source imaging sensor angular field of view as output,
(d) in response to the real 3D image points including the generated estimated depth values and the generated angular field of view, generating a source imaging sensor 3D pose estimate relative to the real 3D image points in the observed scene, and
(e) outputting the generated 3D pose estimate in conjunction with the estimated depth values. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
16. A computer-operated system for determining an angular field of view and estimating a three-dimensional (3D) pose of a source imaging sensor, comprising at least one processor configured to perform operations comprising:
(a) receiving, based on image capture by a source imaging sensor, a two-dimensional (2D) input image data set comprising image pixels,
(b) determining an angular field of view of the source imaging sensor based on the 2D input image data set,
(c) using a Machine Learning algorithm that analyzes the 2D input image data set, estimating depth values for at least some of the image pixels to provide real 3D image points, and
(d) in response to the determined angular field of view and the real 3D image points including the estimated depth values, estimating a 3D pose of the source imaging sensor relative to the real 3D image points. - View Dependent Claims (17, 18, 19, 20)
Specification