Annotating images based on multi-modal sensor data
Abstract
Imaging data or other data captured using a camera may be classified based on data captured using another sensor that is calibrated with the camera and operates in a different modality. Where a digital camera configured to capture visual images is calibrated with another sensor such as a thermal camera, a radiographic camera or an ultraviolet camera, and such sensors capture data simultaneously from a scene, the respectively captured data may be processed to detect one or more objects therein. A probability that data depicts one or more objects of interest may be enhanced based on data captured from calibrated sensors operating in different modalities. Where an object of interest is detected to a sufficient degree of confidence, annotated data from which the object was detected may be used to train one or more classifiers to recognize the object, or similar objects, or for any other purpose.
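The abstract's central idea is that agreement between calibrated sensors operating in different modalities can raise the confidence that a region depicts an object of interest, and that sufficiently confident regions can be turned into annotations. The following sketch illustrates one way such a confidence boost could work; it is not the patented method. It assumes both detectors already report boxes in a shared, calibrated pixel frame, and the names (Detection, fuse), thresholds and combination rule are purely illustrative.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    box: tuple          # (x_min, y_min, x_max, y_max) in a shared, calibrated frame
    confidence: float   # detector confidence in [0, 1]

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    ix0, iy0 = max(ax0, bx0), max(ay0, by0)
    ix1, iy1 = min(ax1, bx1), min(ay1, by1)
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0

def fuse(visual_dets, thermal_dets, iou_min=0.3, annotate_at=0.9):
    """Pair visual and thermal detections that overlap in the shared frame,
    combine their confidences, and keep pairs above the annotation threshold."""
    annotations = []
    for v in visual_dets:
        for t in thermal_dets:
            if iou(v.box, t.box) >= iou_min:
                # Treat the two modalities as independent evidence for the object.
                combined = 1.0 - (1.0 - v.confidence) * (1.0 - t.confidence)
                if combined >= annotate_at:
                    annotations.append({"box": v.box, "confidence": combined})
    return annotations

# Example: a moderately confident visual detection confirmed by a thermal detection.
print(fuse([Detection((10, 10, 50, 60), 0.7)], [Detection((12, 11, 52, 58), 0.8)]))
```

Under this illustrative rule, a visual detection at 0.7 confidence confirmed by an overlapping thermal detection at 0.8 yields a combined confidence of 0.94, enough to emit an annotation given the assumed 0.9 threshold.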
Claims
1. An aerial vehicle comprising:
a plurality of propulsion motors, wherein each of the propulsion motors comprises a propeller and a drive shaft, and wherein each of the propulsion motors is configured to rotate the propeller about an axis defined by the drive shaft;
a digital camera configured to capture one or more visual images;
a thermal camera configured to capture one or more thermal images, wherein the digital camera and the thermal camera are calibrated and aligned with fields of view that overlap at least in part; and
a control system having at least one computer processor, wherein the control system is in communication with each of the digital camera, the thermal camera and the plurality of propulsion motors, and wherein the at least one computer processor is configured to execute one or more instructions for performing a method comprising:
initiating a first operation of at least one of the plurality of propulsion motors;
during the first operation, capturing a first plurality of visual images by the digital camera; and
capturing a second plurality of thermal images by the thermal camera;
receiving information regarding at least one visual attribute and at least one thermal attribute of an object;
detecting the at least one visual attribute of the object within a first portion of a first one of the first plurality of visual images;
detecting the at least one thermal attribute of the object within a second portion of a second one of the second plurality of thermal images;
determining that the first portion of the first one of the first plurality of visual images corresponds to the second portion of the second one of the second plurality of thermal images;
generating an annotation of the first one of the first plurality of visual images based at least in part on at least one of the first portion of the first one of the first plurality of visual images or the second portion of the second one of the second plurality of thermal images;
storing the annotation in association with at least the first one of the first plurality of visual images;
providing at least the first one of the first plurality of visual images to a classifier as a training input;
providing at least the annotation to the classifier as a training output;
training the classifier using at least the training input and the training output;
capturing at least a second plurality of visual images by the digital camera;
providing at least one of the second plurality of visual images to the classifier as an input;
receiving an output from the classifier; and
identifying a portion of the at least one of the second plurality of visual images depicting the object based at least in part on the output.
Dependent claims: 2, 3, 4, 5.
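Claim 1 recites annotating visual images from corresponding thermal detections and then using each image and its annotation as a training input and training output for a classifier. The sketch below illustrates only that training step, with a toy logistic-regression classifier over annotated image crops; the synthetic frames, the crops_from_annotations helper and the training routine are illustrative stand-ins, not the claimed implementation, which would more plausibly use a modern detection network.

```python
import numpy as np

rng = np.random.default_rng(0)

def crops_from_annotations(frames, annotations, size=16):
    """Cut annotated regions out of each frame as positive examples and random
    regions of the same size as negatives (negatives may occasionally overlap a
    positive; this is only a sketch)."""
    X, y = [], []
    for frame, boxes in zip(frames, annotations):
        for (x0, y0) in boxes:
            X.append(frame[y0:y0 + size, x0:x0 + size].ravel()); y.append(1.0)
        rx = rng.integers(0, frame.shape[1] - size)
        ry = rng.integers(0, frame.shape[0] - size)
        X.append(frame[ry:ry + size, rx:rx + size].ravel()); y.append(0.0)
    return np.array(X), np.array(y)

def train_logistic(X, y, steps=500, lr=0.1):
    """Minimal logistic-regression classifier trained by gradient descent."""
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad = p - y
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

# Synthetic stand-in data: 64x64 "visual frames", each with one annotated 16x16 region.
frames = [rng.random((64, 64)) for _ in range(8)]
annotations = [[(20, 20)] for _ in frames]
for f, boxes in zip(frames, annotations):          # make annotated regions brighter
    for (x0, y0) in boxes:
        f[y0:y0 + 16, x0:x0 + 16] += 1.0

X, y = crops_from_annotations(frames, annotations)
w, b = train_logistic(X, y)
print("training accuracy:", (((X @ w + b) > 0).astype(float) == y).mean())
```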
6. A method comprising:
capturing first data from a scene by a first sensor operating in a first modality;
capturing second data from the scene by at least a second sensor operating in a second modality, wherein the second sensor is calibrated with the first sensor, and wherein a first field of view of the first sensor overlaps with a second field of view of the second sensor at least in part, wherein one of the first data or the second data comprises visual imaging data, and wherein one of the first data or the second data does not include visual imaging data;
detecting at least a first attribute of an object of a type in a first portion of a first representation of at least some of the first data, wherein the first representation is generated based at least in part on at least some of the first data captured at a first time;
identifying at least a second portion of a second representation of at least some of the second data, wherein the second portion of the second representation corresponds to at least the first portion of the first representation;
providing the at least some of the second data as a second input to a second object detection algorithm, wherein the second object detection algorithm is configured to detect an object of the type within data of the second modality;
receiving a second output from the second object detection algorithm; and
detecting at least a second attribute of an object of the type in the second portion of the second representation of the second data based at least in part on the second output, wherein the second attribute is one of an edge, a contour, an outline, a color, a texture, a silhouette or a shape of an object of the type;
generating at least one annotation of an object of the type based at least in part on at least one of the first portion of the first representation or the second portion of the second representation;
storing at least one annotation in association with at least some of the second data;
capturing third data by a third sensor operating in at least one of the first modality or the second modality;
providing at least some of the third data to a classifier as an input, wherein the classifier is trained to detect an object of the type within data of at least one of the first modality or the second modality based at least in part on at least one of the first portion of the first representation or the second portion of the second representation as a training input and the at least one annotation as a training output;
receiving an output from the classifier; and
detecting at least a portion of an object of the type within a third representation of the third data based at least in part on the output received from the classifier.
Dependent claims: 7, 8, 9, 10, 11.
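Claim 6, like claims 1 and 12, turns on identifying the portion of the second representation that corresponds to the portion of the first representation where an attribute was detected. The claims do not specify how the calibration between the two sensors is expressed; the sketch below assumes it can be approximated by a planar homography between the two sensors' pixel frames, and the matrix values and function name (map_box) are invented for illustration.

```python
import numpy as np

def map_box(box, H):
    """Map an axis-aligned box from the first sensor's pixel frame into the
    second sensor's pixel frame using a 3x3 homography H, then re-box it."""
    x0, y0, x1, y1 = box
    corners = np.array([[x0, y0, 1], [x1, y0, 1], [x1, y1, 1], [x0, y1, 1]], dtype=float)
    mapped = (H @ corners.T).T
    mapped = mapped[:, :2] / mapped[:, 2:3]       # perspective divide
    xs, ys = mapped[:, 0], mapped[:, 1]
    return (xs.min(), ys.min(), xs.max(), ys.max())

# Assumed calibration: the second sensor sees a half-resolution, slightly shifted
# view of the same scene (pure scale plus translation, for illustration only).
H_first_to_second = np.array([[0.5, 0.0, -8.0],
                              [0.0, 0.5, -6.0],
                              [0.0, 0.0,  1.0]])

first_box = (120, 80, 200, 160)                   # region where the first attribute was detected
print(map_box(first_box, H_first_to_second))     # corresponding region to inspect in the second modality
```

The mapped box is what would then be handed to the second object detection algorithm recited in the claim, restricted to data of the second modality.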
12. A method comprising:
affixing a sensing system to at least a portion of an unmanned aerial vehicle, wherein the unmanned aerial vehicle comprises a first sensor, wherein the sensing system comprises a second sensor, wherein the second sensor is calibrated with the first sensor, and wherein a first field of view of the first sensor overlaps with a second field of view of the second sensor;
causing the unmanned aerial vehicle to engage in at least one flight operation;
capturing first data from a scene by the first sensor operating in a first modality, wherein the first data is captured with the unmanned aerial vehicle engaged in the at least one flight operation;
capturing second data from the scene by at least the second sensor operating in a second modality, wherein the second data is captured with the unmanned aerial vehicle engaged in the at least one flight operation;
detecting at least a first attribute of an object of a type in a first portion of a first representation of at least some of the first data, wherein the first representation is generated based at least in part on at least some of the first data captured at a first time;
identifying at least a second portion of a second representation of at least some of the second data, wherein the second portion of the second representation corresponds to at least the first portion of the first representation;
in response to identifying at least the second portion of the second representation, generating at least one annotation of an object of the type based at least in part on the first portion of the first representation; and
storing at least one annotation in association with at least some of the second data, wherein one of the first data or the second data comprises visual imaging data, and wherein one of the first data or the second data does not include visual imaging data; and
after capturing the second data from the scene, terminating the at least one flight operation; and
removing the sensing system from at least the portion of the unmanned aerial vehicle.
Dependent claims: 13, 14, 15.
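Claim 12 requires storing the annotation in association with at least some of the captured data. One simple, illustrative way to keep that association is a JSON-lines sidecar file keyed by frame identifier, as sketched below; the record fields, file naming and store_annotation helper are assumptions for illustration, not part of the claim.

```python
import json
from datetime import datetime, timezone

def store_annotation(sidecar_path, frame_id, box, label, modality):
    """Append one annotation record to a JSON-lines sidecar file kept alongside
    the captured imagery, so each record stays associated with its frame."""
    record = {
        "frame_id": frame_id,                       # e.g. filename or frame index from the flight log
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "modality": modality,                       # e.g. "visual" or "thermal"
        "label": label,
        "box": box,                                 # (x_min, y_min, x_max, y_max) in frame pixels
    }
    with open(sidecar_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

store_annotation("flight_0001_annotations.jsonl", "visual_000123.png",
                 (52, 34, 92, 74), "object_of_interest", "visual")
```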
16. A method comprising:
affixing a secondary sensing system to at least one surface of an aerial vehicle, wherein the aerial vehicle comprises a first sensor operating in a first modality, wherein the secondary sensing system comprises a second sensor operating in a second modality, and wherein a first field of view of the first sensor and a second field of view of the second sensor overlap at least in part with the secondary sensing system affixed to the at least one surface of the aerial vehicle;
initiating at least a first flight operation of the aerial vehicle;
capturing, by the first sensor, first data during the first flight operation;
capturing, by the second sensor, second data during the first flight operation;
detecting at least a first attribute of a first object within a first portion of a first representation of the first data, wherein the first attribute relates to the first modality;
identifying a second portion of a second representation of the second data corresponding to the first portion of the first representation of the first data;
determining that the second portion of the second representation depicts at least a second attribute of the first object, wherein the second attribute relates to the second modality;
generating an annotation of at least one of the first data or the second data based at least in part on the first portion or the second portion;
training at least one classifier to recognize one or more objects based at least in part on the at least one of the first data or the second data and the annotation;
removing the secondary sensing system from the at least one surface of the aerial vehicle;
initiating at least a second flight operation of the aerial vehicle;
capturing, by the first sensor, third data during the second flight operation;
providing at least some of the third data to the at least one trained classifier as an input;
receiving at least one output from the at least one trained classifier; and
detecting at least a third attribute of a second object within at least a third portion of a third representation of the third data based at least in part on the at least one output.
Dependent claims: 17, 18, 19, 20.
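Claim 16 ends with an inference pass: after the secondary sensing system is removed, data captured by the remaining sensor on a later flight is provided to the trained classifier, and objects are detected from its output. The sketch below shows a minimal sliding-window version of such a pass, reusing the kind of (w, b) weights produced by the toy training sketch under claim 1; the stand-in weights and frame here are random, for illustration only.

```python
import numpy as np

def detect(frame, w, b, size=16, stride=8, threshold=0.5):
    """Slide a window over a single-modality frame captured after the secondary
    sensor was removed, score each crop with the trained classifier (w, b), and
    return the windows whose score clears the threshold."""
    hits = []
    for y0 in range(0, frame.shape[0] - size + 1, stride):
        for x0 in range(0, frame.shape[1] - size + 1, stride):
            crop = frame[y0:y0 + size, x0:x0 + size].ravel()
            score = 1.0 / (1.0 + np.exp(-(crop @ w + b)))
            if score >= threshold:
                hits.append(((x0, y0, x0 + size, y0 + size), float(score)))
    return hits

# Stand-in weights: in practice these would come from the training step sketched under claim 1.
rng = np.random.default_rng(1)
w, b = rng.normal(size=16 * 16) * 0.01, 0.0
frame = rng.random((64, 64))
print(detect(frame, w, b)[:3])   # a few (box, score) candidates from the new flight's imagery
```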
Specification