Augmenting Layer-Based Object Detection With Deep Convolutional Neural Networks

US 20160180195A1
Filed: 02/19/2016
Published: 06/23/2016
Est. Priority Date: 09/06/2013
Status: Active Grant

Abstract

By way of example, the technology disclosed by this document receives image data; extracts a depth image and a color image from the image data; creates a mask image by segmenting the depth image; determines a first likelihood score from the depth image and the mask image using a layered classifier; determines a second likelihood score from the color image and the mask image using a deep convolutional neural network; and determines a class of at least a portion of the image data based on the first likelihood score and the second likelihood score. Further, the technology can pre-filter the mask image using the layered classifier and then use the pre-filtered mask image and the color image to calculate a second likelihood score using the deep convolutional neural network to speed up processing.
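The pre-filtering idea in the abstract can be illustrated with a small sketch. It assumes each mask component already has a first likelihood score from the layered classifier, and that components scoring below some cut-off are dropped before the costlier CNN is run. The function names, the dictionary layout, and the `min_score` value are hypothetical and are not taken from the patent.

```python
def prefilter_components(layered_scores, min_score):
    """Return the labels of mask components worth sending to the CNN.

    layered_scores: dict mapping component label -> first likelihood score
    min_score: hypothetical cut-off; the source does not specify one
    """
    return [label for label, score in sorted(layered_scores.items())
            if score >= min_score]


def classify_with_prefilter(layered_scores, cnn_score_fn, min_score=0.2):
    """Call the (expensive) CNN scorer only on components that survive the pre-filter.

    cnn_score_fn: stand-in callable, label -> second likelihood score
    Returns a dict: label -> (first score, second score).
    """
    kept = prefilter_components(layered_scores, min_score)
    return {label: (layered_scores[label], cnn_score_fn(label)) for label in kept}
```

With scores `{1: 0.05, 2: 0.6, 3: 0.3}` and the default cut-off, only components 2 and 3 reach the CNN stand-in, which is where the claimed speed-up would come from.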

20 Claims

1. A computer-implemented method for performing object recognition comprising:
- receiving image data;
  
  extracting a depth image and a color image from the image data;
  
  creating a mask image by segmenting the image data into a plurality of components;
  
  identifying objects within the plurality of components of the mask image;
  
  determining a first likelihood score from the depth image and the mask image using a layered classifier;
  
  determining a second likelihood score from the color image and the mask image by generating an object image by copying pixels from the first image of the components in the mask image and classifying the object image using the deep convolutional neural network (CNN); and
  
  determining a class for at least a portion of the image data based on the first likelihood score and the second likelihood score.
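The step of generating an object image by copying pixels of a mask component might look like the following minimal sketch. It assumes the mask stores one integer label per pixel and that the object image is a tight crop around the component, with non-component pixels set to a fill value. The cropping, the fill value, and the name `make_object_image` are illustrative assumptions, not details stated in the claim.

```python
def make_object_image(image, mask, label, fill=0):
    """Copy the pixels of one mask component into a tightly cropped object image.

    image: 2-D list of pixel values (rows of equal length)
    mask:  2-D list of integer component labels, same shape as image
    label: the component to extract
    fill:  value for crop pixels that lie outside the component (assumption)
    """
    coords = [(r, c) for r, row in enumerate(mask)
              for c, value in enumerate(row) if value == label]
    if not coords:
        raise ValueError(f"component {label} not present in mask")
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    top, bottom = min(rows), max(rows)
    left, right = min(cols), max(cols)
    # Keep source pixels only where the mask carries the requested label.
    return [[image[r][c] if mask[r][c] == label else fill
             for c in range(left, right + 1)]
            for r in range(top, bottom + 1)]
```

The resulting crop is what would then be handed to the CNN for classification; a real system would also resize it to the network's input size, which this sketch leaves out.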

2. A computer-implemented method for performing object recognition comprising:
- receiving image data;
  
  creating a mask image by segmenting the image data into a plurality of components;
  
  determining a first likelihood score from the image data and the mask image using a layered classifier;
  
  determining a second likelihood score from the image data and the mask image using a deep convolutional neural network (CNN); and
  
  determining a class for at least a portion of the image data based on the first likelihood score and the second likelihood score.
  - 3. The computer-implemented method of claim 2, wherein the determining the second likelihood score from the image data and the mask image using the deep convolutional neural network includes:
    - extracting a first image from the image data;
      
      generating an object image by copying pixels from the first image of the components in the mask image;
      
      classifying the object image using the deep CNN;
      
      generating classification likelihood scores indicating probabilities of the object image belonging to different classes of the deep CNN; and
      
      generating the second likelihood score based on the classification likelihood scores.
  - 4. The computer-implemented method of claim 3, wherein the first image is one of a color image, a depth image, and a combination of a color image and a depth image.
  - 5. The computer-implemented method of claim 2, wherein determining the class of at least the portion of the image data includes:
    - fusing the first likelihood score and the second likelihood score into an overall likelihood score; and
      
      responsive to satisfying a predetermined threshold with the overall likelihood score, classifying the at least the portion of the image data as representing a person using the overall likelihood score.
  - 6. The computer-implemented method of claim 2, further comprising:
    - extracting a depth image and a color image from the image data, wherein determining the first likelihood score from the image data and the mask image using the layered classifier includes determining the first likelihood score from the depth image and the mask image using the layered classifier, and determining the second likelihood score from the image data and the mask image using the deep CNN includes determining the second likelihood score from the color image and the mask image using the deep CNN.
  - 7. The computer-implemented method of claim 2, wherein the deep CNN has a soft max layer as a final layer to generate the second likelihood that the at least the portion of the image data represents a person.
  - 8. The computer-implemented method of claim 2, further comprising:
    - converting the first likelihood score and the second likelihood score into a first log likelihood value and a second log likelihood value; and
      
      calculating a combined likelihood score by using a weighted summation of the first log likelihood value and the second log likelihood value.
  - 9. The computer-implemented method of claim 2, wherein the class is a person.
  - 10. The computer-implemented method of claim 2, wherein determining the second likelihood score further comprises:
    - determining the second likelihood score using the image data and the first likelihood score from the layered classifier.
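The score handling in claims 5, 7, and 8 can be sketched as follows: a softmax turns final-layer outputs into class probabilities, the two likelihood scores are converted to log likelihoods and combined by a weighted sum, and the combined value is compared with a threshold. The weights, the threshold, and the `eps` guard against `log(0)` are assumptions chosen for illustration; the claims do not give values for them.

```python
import math


def softmax(logits):
    """Convert raw final-layer outputs into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_log_likelihoods(first_score, second_score,
                         w_first=0.5, w_second=0.5, eps=1e-12):
    """Weighted sum of the log of each likelihood score.

    w_first, w_second and eps are hypothetical; eps avoids log(0).
    """
    log_first = math.log(max(first_score, eps))
    log_second = math.log(max(second_score, eps))
    return w_first * log_first + w_second * log_second


def is_person(first_score, second_score, threshold=math.log(0.5)):
    """Classify as a person when the fused score meets the (assumed) threshold."""
    return fuse_log_likelihoods(first_score, second_score) >= threshold
```

Working in log space turns a product of likelihoods into a sum, so the weights act as exponents on the original scores and let one classifier be trusted more than the other.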

11. A system for performing object recognition comprising:
- a processor; and
  
  a memory storing instructions that, when executed, cause the system to:
  
  create a mask image by segmenting the image data into a plurality of components;
  
  determine a first likelihood score from the image data and the mask image using a layered classifier;
  
  determine a second likelihood score from the image data and the mask image using a deep convolutional neural network (CNN); and
  
  determine a class for at least a portion of the image data based on the first likelihood score and the second likelihood score.
  - 12. The system of claim 11, wherein the instructions that cause the system to determine the second likelihood score from the image data and the mask image using the deep convolutional neural network further include:
    - extract a first image from the image data;
      
      generate an object image by copying pixels from the first image of the components in the mask image;
      
      classify the object image using the deep CNN;
      
      generate classification likelihood scores indicating probabilities of the object image belonging to different classes of the deep CNN; and
      
      generate the second likelihood score based on the classification likelihood scores.
  - 13. The system of claim 12, wherein the first image is one of a color image, a depth image, and a combination of a color image and a depth image.
  - 14. The system of claim 11, wherein the instructions that cause the system to determine the class of at least the portion of the image data include:
    - fuse the first likelihood score and the second likelihood score into an overall likelihood score; and
      
      responsive to satisfying a predetermined threshold with the overall likelihood score, classify the at least the portion of the image data as representing a person using the overall likelihood score.
  - 15. The system of claim 11, wherein the memory stores further instructions that cause the system to:
    - extract a depth image and a color image from the image data, wherein determining the first likelihood score from the image data and the mask image using the layered classifier includes determining the first likelihood score from the depth image and the mask image using the layered classifier, and determining the second likelihood score from the image data and the mask image using the deep CNN includes determining the second likelihood score from the color image and the mask image using the deep CNN.
  - 16. The system of claim 11, wherein the deep CNN has a soft max layer as a final layer to generate the second likelihood that the at least the portion of the image data represents a person.
  - 17. The system of claim 11, wherein the memory stores further instructions that cause the system to:
    - convert the first likelihood score and the second likelihood score into a first log likelihood value and a second log likelihood value; and
      
      calculate a combined likelihood score by using a weighted summation of the first log likelihood value and the second log likelihood value.
  - 18. The system of claim 11, wherein the class is a person.
  - 19. The system of claim 11, wherein the instructions that cause the system to determine the second likelihood score further comprise:
    - pre-filter the mask image using the layered classifier; and
      
      determine the second likelihood score using the image data and the pre-filtered mask image.
  - 20. The system of claim 11, wherein the layered classifier determines the first likelihood score using a Gaussian mixture.
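Claim 20 says the layered classifier uses a Gaussian mixture to produce the first likelihood score. A minimal one-dimensional sketch of evaluating such a mixture is shown below. Which feature is scored, how many mixture components are used, and how the parameters are learned are not stated in the claim, so the scalar input `x` and the `(weight, mean, std)` tuples here are purely illustrative assumptions.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_likelihood(x, components):
    """Likelihood of x under a Gaussian mixture.

    components: list of (weight, mean, std) tuples; weights should sum to 1.
    The feature x and the parameters are hypothetical placeholders.
    """
    return sum(weight * gaussian_pdf(x, mean, std)
               for weight, mean, std in components)
```

A measurement near one of the mixture means yields a higher likelihood than one far from all of them, which is the property a classifier built on this score would rely on.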

Current Assignee: Toyota Jidosha Kabushiki Kaisha (Toyota Motor Corporation)
Original Assignee: Toyota Jidosha Kabushiki Kaisha (Toyota Motor Corporation)
Inventors: Martinson, Eric; Yalla, Veeraganesh
Granted Patent: US 9,542,626 B2
US Class Current: 1/1
CPC Class Codes

G06F 18/214   Generating training pattern...

G06F 18/2415   based on parametric or prob...

G06F 18/2431   Multiple classes

G06F 18/29   Graphical models, e.g. Baye...

G06N 3/045   Combinations of networks

G06N 3/047   Probabilistic or stochastic...

G06N 7/01   Probabilistic graphical mod...

G06V 10/421   by analysing segments inter...

G06V 10/454   Integrating the filters int...

G06V 10/751   Comparing pixel values or l...

G06V 10/764   using classification, e.g. ...

G06V 10/82   using neural networks

G06V 10/84   using probabilistic graphic...

G06V 20/653   by matching three-dimension...

G06V 40/10   Human or animal bodies, e.g...

G06V 40/103   Static body considered as a...
