Learning image processing tasks from scene reconstructions

US 8,971,612 B2
Filed: 12/15/2011
Issued: 03/03/2015
Est. Priority Date: 12/15/2011
Status: Active Grant

Abstract

Learning image processing tasks from scene reconstructions is described where the tasks may include but are not limited to: image de-noising, image in-painting, optical flow detection, interest point detection. In various embodiments training data is generated from a 2 or higher dimensional reconstruction of a scene and from empirical images of the same scene. In an example a machine learning system learns at least one parameter of a function for performing the image processing task by using the training data. In an example, the machine learning system comprises a random decision forest. In an example, the scene reconstruction is obtained by moving an image capture apparatus in an environment where the image capture apparatus has an associated dense reconstruction and camera tracking system.
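The idea in the abstract can be illustrated with a minimal sketch, offered only as an illustration under stated assumptions and not as the patented implementation. It assumes that each empirical (noisy) image row has a corresponding clean row rendered from the scene reconstruction, and it learns a single parameter `w` of a de-noising function by least squares. The one-parameter smoothing filter stands in for the random decision forest mentioned in the abstract, and all function names are hypothetical.

```python
def neighbour_mean(row, i):
    """Mean of the left and right neighbours, clamping at the borders."""
    left = row[max(i - 1, 0)]
    right = row[min(i + 1, len(row) - 1)]
    return 0.5 * (left + right)


def denoise(row, w):
    """Blend each element with its neighbour mean; w is the learnt parameter."""
    return [(1.0 - w) * row[i] + w * neighbour_mean(row, i)
            for i in range(len(row))]


def learn_weight(pairs):
    """Closed-form least-squares estimate of w.

    pairs: list of (empirical_row, rendered_clean_row) training examples
    (assumed to be pixel-aligned because the clean row is rendered from
    the same camera pose as the empirical row).
    """
    num = den = 0.0
    for noisy, clean in pairs:
        for i in range(len(noisy)):
            d = neighbour_mean(noisy, i) - noisy[i]
            num += d * (clean[i] - noisy[i])
            den += d * d
    return num / den if den > 0 else 0.0


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical training pair: a flat wall at depth 1.0 and a noisy capture.
clean_row = [1.0] * 6
noisy_row = [1.2, 0.8, 1.1, 0.9, 1.2, 0.8]
w = learn_weight([(noisy_row, clean_row)])
restored = denoise(noisy_row, w)
```

Because `w = 0` reproduces the input unchanged, the least-squares choice of `w` can never increase the squared error on the training pairs.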

19 Claims

1. A method of image processing comprising:
- receiving a plurality of first input empirical images of a scene from an image capture device in real-time;
- at a processor, calculating a 2D or higher dimensional reconstruction of the scene from the first input images, reconstruction being based at least in part on a real-time frame alignment engine;
- forming training data from the reconstruction of the scene and the first input images;
- using the training data to learn at least one parameter of a function for transforming an image;
- receiving a second input image; and
- transforming the second input image using the function and the at least one parameter, wherein forming the training data comprises rendering images from the reconstruction of the scene according to specified poses of an image capture apparatus used to capture the empirical first images.
  - 2. A method as claimed in claim 1 wherein forming the training data comprises using information from the reconstruction of the scene such that the function for transforming an image is able to take into account knowledge of the scene reconstruction.
  - 3. A method as claimed in claim 1 wherein forming the training data comprises accessing specified poses of an image capture apparatus used to capture the empirical first input images and calculating optical flow vectors for time sequence pairs of the first input images, the specified poses being obtained from a real-time tracking system, the real-time tracking system receiving input from camera-mounted orientation and motion tracking sensors and at least one GPS antenna.
  - 4. A method as claimed in claim 1 wherein forming the training data comprises using the scene reconstruction to determine occlusion boundaries and/or visible points in the empirical first input images where the empirical first input images are captured from different views of the scene.
  - 5. A method as claimed in claim 1 comprising forming the training data by rendering images from the reconstruction of the scene according to specified poses of an image capture apparatus used to capture the empirical first input images such that an empirical first input image has a corresponding clean image rendered from the reconstruction of the scene.
  - 6. A method as claimed in claim 1 comprising forming the training data by using information from multiple views of the reconstruction of the scene.
  - 7. A method as claimed in claim 1 wherein using the training data to learn comprises training a random decision forest and transforming the second input image comprises passing image elements of the second input image through the trained random decision forest.
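Claim 3 refers to calculating optical flow vectors for time sequence pairs of images using specified camera poses. A minimal sketch of how such a ground-truth flow vector might be derived is shown here. It assumes a pinhole camera model, poses given as a world-to-camera rotation matrix `R` and translation `t`, and a 3D point taken from the scene reconstruction; none of these conventions, names, or intrinsic values come from the patent text.

```python
def project(point, pose, focal, cx, cy):
    """Project a 3D world point into pixel coordinates (pinhole model)."""
    R, t = pose
    # Transform the world point into the camera frame: p_cam = R * p + t.
    x, y, z = (sum(R[r][c] * point[c] for c in range(3)) + t[r]
               for r in range(3))
    return (focal * x / z + cx, focal * y / z + cy)


def flow_vector(point, pose_a, pose_b, focal=500.0, cx=320.0, cy=240.0):
    """Pixel displacement of one reconstructed point between two frames."""
    ua, va = project(point, pose_a, focal, cx, cy)
    ub, vb = project(point, pose_b, focal, cx, cy)
    return (ub - ua, vb - va)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical poses: frame A at the origin, frame B after the camera has
# moved 0.1 units to the right (so points shift left in the camera frame).
pose_a = (IDENTITY, [0.0, 0.0, 0.0])
pose_b = (IDENTITY, [-0.1, 0.0, 0.0])

flow = flow_vector((0.0, 0.0, 2.0), pose_a, pose_b)
```

Repeating this for every visible reconstructed point would give a dense set of flow vectors that could serve as training targets for an empirical image pair.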

8. A method of image processing comprising:
- receiving at least one input image from an image capture device in real-time; and
- at a processor, transforming the input image using a function having at least one parameter which has been learnt from training data which has been obtained from a 2D, or higher dimensional, reconstruction of a scene reconstructed from empirical data, the reconstruction being based at least in part on real-time frame alignment engine, wherein the training data is obtained by rendering images from the reconstruction of the scene according to specified poses of an image capture apparatus used to capture the empirical data.
  - 9. A method as claimed in claim 8 comprising receiving an input image which is any of:
    - a depth image, a color image, a medical image.
  - 10. A method as claimed in claim 8 wherein the processor is arranged to carry out the transformation in order to perform one or more of the following tasks:
    - de-noise the input image;
    - in-paint the input image;
    - detect interest points in the input image;
    - calculate optical flow vectors for pairs of input images.
  - 11. A method as claimed in claim 8 comprising transforming the input image using a random decision forest which has been trained using the training data.
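Claims 7 and 11 refer to a random decision forest through which image elements are passed. The sketch below is a deliberately tiny stand-in, not the forest described in the patent specification: each tree is a depth-1 regression stump trained on a bootstrap sample, and the forest output is the average over trees. The feature layout (a list of floats per image element), the function names, and the tree count are all assumptions made for illustration.

```python
import random


def train_stump(samples, rng):
    """Fit one depth-1 regression tree on a bootstrap sample.

    samples: list of (features, target), where features is a list of floats
    describing one image element and target is the value taken from the
    rendered (clean) image at the same location.
    """
    boot = [rng.choice(samples) for _ in samples]
    feature = rng.randrange(len(boot[0][0]))
    best = None
    for threshold in sorted({f[feature] for f, _ in boot}):
        left = [y for f, y in boot if f[feature] <= threshold]
        right = [y for f, y in boot if f[feature] > threshold]
        if not right:
            continue  # every sample falls left: not a real split
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        err = (sum((y - mean_l) ** 2 for y in left)
               + sum((y - mean_r) ** 2 for y in right))
        if best is None or err < best[0]:
            best = (err, threshold, mean_l, mean_r)
    if best is None:  # all feature values identical: use a single leaf
        mean = sum(y for _, y in boot) / len(boot)
        return (feature, float("inf"), mean, mean)
    return (feature,) + best[1:]


def train_forest(samples, n_trees=25, seed=0):
    rng = random.Random(seed)
    return [train_stump(samples, rng) for _ in range(n_trees)]


def predict(forest, features):
    """Pass one image element through every tree and average the leaves."""
    votes = [(mean_l if features[f] <= th else mean_r)
             for f, th, mean_l, mean_r in forest]
    return sum(votes) / len(votes)


# Hypothetical training set: one feature per image element, target 0 or 1.
samples = [([x / 20.0], 0.0 if x < 10 else 1.0) for x in range(20)]
forest = train_forest(samples)
```

A practical forest would use deeper trees and richer per-element features; the stump version only shows the train-then-pass-through structure named in the claims.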

12. An image processing system comprising:
- an input arranged to receive a sequence of first input empirical images of a scene obtained from a camera moving in the scene;
- a processor arranged to calculate a 2D or higher dimensional reconstruction of the scene from the first input images and also to track a location and orientation of the camera, the location and orientation of the camera being based at least in part on camera-mounted orientation and motion sensors;
- the processor being arranged to form training data from the reconstruction of the scene, the tracked camera location and orientation, and at least some of the first input images;
- a machine learning system arranged to use the training data to learn at least one parameter of a function for transforming an image;
- the input being arranged to receive a second input image; and
- the machine learning system being arranged to transform the second input image using the function and the at least one parameter, wherein forming the training data comprises rendering images from the reconstruction of the scene according to specified poses of an image capture apparatus used to capture the empirical first images.
  - 13. An image processing system as claimed in claim 12 wherein the processor is arranged to form the training data by rendering images from the reconstruction of the scene according to the tracked camera location and orientation.
  - 14. An image processing system as claimed in claim 12 wherein the processor is arranged to form the training data by calculating optical flow vectors for time sequence pairs of the first input images using the tracked camera location and orientation.
  - 15. An image processing system as claimed in claim 12 wherein the processor is arranged to form the training data by using the scene reconstruction to determine occlusion boundaries and/or visible points in the empirical first input images.
  - 16. An image processing system as claimed in claim 12 wherein the processor is arranged to form the training data by rendering images from the reconstruction of the scene according to the tracked camera location and orientation such that an empirical first input image has a corresponding clean image rendered from the reconstruction of the scene.
  - 17. An image processing system as claimed in claim 12 wherein the processor is arranged to form the training data by using information from multiple views of the reconstruction of the scene.
  - 18. An image processing system as claimed in claim 12 wherein the processor is arranged to carry out the transformation in order to perform one or more of the following tasks:
    - de-noise the second input image;
    - in-paint the second input image;
    - detect interest points in the second input image;
    - calculate optical flow vectors for pairs of second input images.
  - 19. An image processing system as claimed in claim 12 wherein the processor calculates the reconstruction of the scene in real time using a real-time frame alignment engine.
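Claim 15 refers to using the scene reconstruction to determine occlusion boundaries and/or visible points. One common way to decide which reconstructed points are visible from a tracked camera pose is a depth-buffer test, sketched below. This is an assumed technique chosen for illustration, with a pinhole camera, a world-to-camera pose `(R, t)`, a point-cloud reconstruction, and hypothetical names throughout; the claims do not specify how visibility is computed.

```python
def visible_points(points, pose, focal, cx, cy, width, height):
    """Return indices of points that are nearest to the camera at their pixel.

    points: list of (x, y, z) world coordinates from the reconstruction.
    pose:   (R, t) world-to-camera rotation matrix and translation.
    """
    R, t = pose
    zbuffer = {}  # pixel (u, v) -> (depth, point index)
    for idx, p in enumerate(points):
        x, y, z = (sum(R[r][c] * p[c] for c in range(3)) + t[r]
                   for r in range(3))
        if z <= 0:
            continue  # behind the camera
        u = int(round(focal * x / z + cx))
        v = int(round(focal * y / z + cy))
        if not (0 <= u < width and 0 <= v < height):
            continue  # outside the image
        if (u, v) not in zbuffer or z < zbuffer[(u, v)][0]:
            zbuffer[(u, v)] = (z, idx)
    return sorted(idx for _, idx in zbuffer.values())


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pose = (IDENTITY, [0.0, 0.0, 0.0])

# Hypothetical reconstruction: point 1 lies directly behind point 0,
# point 2 is off to the side, and point 3 is behind the camera.
cloud = [(0.0, 0.0, 1.0), (0.0, 0.0, 2.0), (0.2, 0.0, 1.0), (0.0, 0.0, -1.0)]
visible = visible_points(cloud, pose, focal=100.0, cx=50.0, cy=50.0,
                         width=100, height=100)
```

Points that project inside the image but are missing from the result are occluded from that view, which is the kind of label that could be attached to the corresponding empirical image.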

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Shotton, Jamie Daniel Joseph; Kohli, Pushmeet; Izadi, Shahram; Rother, Carsten Curt Eckard; Nowozin, Sebastian; Kim, David; Molyneaux, David; Hilliges, Otmar; Holzer, Stefan Johannes Josef
Primary Examiner(s): DRENNAN, BARRY T
Application Number: US13/327,273
Publication Number: US 20130156297A1
Time in Patent Office: 1,174 Days
Field of Search: None
US Class Current: 382/159
CPC Class Codes:
- G06F 18/24323   Tree-organised classifiers
- G06F 18/28   Determining representative ...
- G06T 2207/20081   Training; Learning
- G06T 5/00   Image enhancement or restor...
- G06T 7/55   from multiple images
- G06V 10/764   using classification, e.g. ...
- G06V 10/772   Determining representative ...
