Neural networks for object detection

US 10,013,773 B1
Filed: 12/16/2016
Issued: 07/03/2018
Est. Priority Date: 12/16/2016
Status: Active Grant

Abstract

A neural network system for identifying positions of objects in an input image can include an object detector neural network, a memory interface subsystem, and an external memory. The object detector neural network is configured to, at each time step of multiple successive time steps, (i) receive a first neural network input that represents the input image and a second neural network input that identifies a first set of positions of the input image that have each been classified as showing a respective object of the set of objects, and (ii) process the first and second inputs to generate a set of output scores that each represents a respective likelihood that an object that is not one of the objects shown at any of the positions in the first set of positions is shown at a respective position of the input image that corresponds to the output score.

20 Claims

1. A neural network system for identifying positions of objects in a set of objects shown in an input image, the neural network system comprising:
- a detector neural network that is configured to, at each time step in a plurality of time steps:
  - receive (i) a first neural network input that represents the input image and (ii) a second neural network input that identifies a first set of positions of the input image that have each been classified as showing a respective object of the set of objects; and
  - process the first neural network input and the second neural network input to generate a set of output scores that each represents a respective likelihood that an object that is not one of the objects shown at any of the positions in the first set of positions is shown at a respective position of the input image that corresponds to the output score, wherein each output score of the set of output scores corresponds to a different position of a plurality of positions of the input image;
- an external memory that is configured to store the second neural network input; and
- a memory interface subsystem that is configured to, at each time step in the plurality of time steps:
  - select, based on the set of output scores generated by the detector neural network at the time step, a particular position of the plurality of positions of the input image that is not currently among the first set of positions that have been classified as showing respective objects of the set of objects; and
  - classify the selected particular position of the input image as showing an object of the set of objects shown in the input image, including updating the second neural network input stored in the external memory by adding the selected particular position of the input image to the first set of positions identified by the second neural network input.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
  - 2. The neural network system of claim 1, wherein the memory interface subsystem is further configured to, at each time step in the plurality of time steps:
    - provide the second neural network input stored in the external memory to the detector neural network; and
    - receive, from the detector neural network, the set of output scores generated by the detector neural network at the time step.
  - 3. The neural network system of claim 1, wherein the detector neural network is further configured to, at each of one or more time steps in the plurality of time steps, process the first neural network input and the second neural network input to generate a second output score that represents a likelihood that an object is shown at any of the positions that are not in the first set of positions of the input image.
  - 4. The neural network system of claim 3, wherein the neural network system is configured to determine, at each of the one or more time steps in the plurality of time steps and based on the second output score, whether to continue identifying positions of objects shown in the input image.
  - 5. The neural network system of claim 1, wherein the detector neural network is a feedforward detector neural network.
  - 6. The neural network system of claim 1, wherein the memory interface subsystem is further configured to, at each time step in the plurality of time steps, select the particular position of the plurality of positions of the input image based on a comparison of the respective output score for the particular position with the respective output scores for other positions of the plurality of positions of the input image.
  - 7. The neural network system of claim 1, wherein the input image represents signals that were generated by one or more sensors of a vehicle and that characterize an environment in a vicinity of the vehicle.
  - 8. The neural network system of claim 1, wherein:
    - at a first time step in the plurality of time steps, the first set of positions identified by the second neural network input is a null set that includes zero positions of the input image that have been classified as showing an object; and
      
      at each time step in the plurality of time steps that follows the first time step, the first set of positions identified by the second neural network input specifies at least one position of the input image that has been classified as showing an object.
  - 9. The neural network system of claim 1, wherein the detector neural network comprises a softmax layer, wherein the set of output scores generated by the detector neural network at a given time step are the current values of the softmax layer that result from processing the first neural network input and the second neural network input at the given time step.
  - 10. The neural network system of claim 1, wherein at a given time step after an initial time step in the plurality of time steps, the first set of positions of the input image identified by the second neural network input were each classified as showing a respective object of the set of objects at a respective preceding time step in the plurality of time steps.
  - 11. The neural network system of claim 1, wherein the set of output scores generated by the detector neural network at a given time step each represents a respective likelihood that an object within one or more pre-defined classes, which is not one of the objects shown at any of the positions in the first set of positions, is shown at the respective position of the input image that corresponds to the output score.
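
The loop recited in claims 1, 2, 6, and 8 can be illustrated with a minimal sketch. It is not the patented implementation: `ExternalMemory`, `MemoryInterface`, and `toy_detector` are hypothetical names, positions are modeled as integer indices into a flat list, and the "detector" is a stand-in function rather than a trained neural network.

```python
from typing import Callable, List, Sequence, Set

# Hypothetical stand-in for the detector neural network: maps the image and the
# already-classified positions to one score per candidate position.
Detector = Callable[[Sequence[float], Set[int]], List[float]]


class ExternalMemory:
    """Holds the second network input: positions already classified as objects."""

    def __init__(self) -> None:
        # Empty (null) set at the first time step, as in claim 8.
        self.positions: Set[int] = set()


class MemoryInterface:
    """Reads memory, queries the detector, and writes the selected position back."""

    def __init__(self, memory: ExternalMemory) -> None:
        self.memory = memory

    def step(self, detector: Detector, image: Sequence[float]) -> int:
        # Claim 2: provide the stored input to the detector and receive its scores.
        scores = detector(image, self.memory.positions)
        # Claim 1: only positions not yet in the first set are eligible.
        candidates = [p for p in range(len(scores)) if p not in self.memory.positions]
        if not candidates:
            raise ValueError("every position has already been classified")
        # Claim 6: select by comparing a position's score with the others.
        chosen = max(candidates, key=lambda p: scores[p])
        # Claim 1: classifying a position means adding it to the stored set.
        self.memory.positions.add(chosen)
        return chosen


def toy_detector(image: Sequence[float], known: Set[int]) -> List[float]:
    # Assumption for illustration only: intensity is the score, and positions
    # that are already known score zero.
    return [0.0 if p in known else v for p, v in enumerate(image)]


memory = ExternalMemory()
interface = MemoryInterface(memory)
image = [0.1, 0.9, 0.2, 0.8]
print([interface.step(toy_detector, image) for _ in range(2)])  # → [1, 3]
```

Keeping the set of classified positions outside the detector is what lets the detector itself be a feedforward network (claim 5): all state carried between time steps lives in the memory, not in the network.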

12. A computer-implemented method for identifying positions of objects in a set of objects shown in an input image, the method comprising:
- for each time step in a plurality of time steps:
  - receiving a first neural network input that represents the input image;
  - receiving a second neural network input that identifies a first set of positions of the input image that have each been classified as showing a respective object of the set of objects;
  - processing, by a detector neural network, the first neural network input and the second neural network input to generate a set of output scores that each represents a respective likelihood that an object that is not one of the objects shown at any of the positions in the first set of positions is shown at a respective position of the input image that corresponds to the output score, wherein each output score of the set of output scores corresponds to a different position of a plurality of positions of the input image;
  - selecting, based on the set of output scores, a particular position of the plurality of positions of the input image that is not currently among the first set of positions that have been classified as showing respective objects of the set of objects; and
  - classifying the selected particular position of the input image as showing an object of the set of objects shown in the input image, including adding the selected particular position of the input image to the first set of positions identified by the second neural network input.
- Dependent claims (13, 14, 15, 16, 17)
  - 13. The method of claim 12, further comprising, for each of one or more time steps in the plurality of time steps, processing, by the detector neural network, the first neural network input and the second neural network input to generate a second output score that represents a likelihood that an object is shown at any of the positions that are not in the first set of positions of the input image.
  - 14. The method of claim 13, further comprising determining, at each of the one or more time steps in the plurality of time steps and based on the second output score, whether to continue identifying positions of objects shown in the input image.
  - 15. The method of claim 12, wherein the detector neural network is a feedforward detector neural network.
  - 16. The method of claim 12, wherein selecting the particular position of the plurality of positions of the input image comprises comparing the respective output score for the particular position of the input image with the respective output scores for other positions of the plurality of positions of the input image.
  - 17. The method of claim 12, wherein the input image represents signals that were generated by one or more sensors of a vehicle and that characterize an environment in a vicinity of the vehicle.
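
Claims 13 and 14 add a second output score that is used to decide whether to keep searching. The sketch below shows one way such a stopping rule could look. The threshold comparison, the name `continue_threshold`, and the toy detector are assumptions made for illustration; the claims only say that the decision is based on the second output score.

```python
from typing import Callable, List, Sequence, Set, Tuple

# Hypothetical detector signature: returns per-position scores plus a second
# score estimating whether any unclassified position still shows an object.
Detector = Callable[[Sequence[float], Set[int]], Tuple[List[float], float]]


def detect_positions(
    image: Sequence[float],
    detector: Detector,
    continue_threshold: float = 0.5,  # assumed decision rule, not from the claims
) -> List[int]:
    known: Set[int] = set()   # first set of positions, initially empty
    order: List[int] = []     # positions in the order they were classified
    for _ in range(len(image)):  # at most one new position per time step
        scores, continue_score = detector(image, known)
        # Claim 14: decide from the second score whether to continue.
        if continue_score < continue_threshold:
            break
        candidates = [p for p in range(len(scores)) if p not in known]
        # Claim 16: compare the score of each position with the others.
        chosen = max(candidates, key=lambda p: scores[p])
        known.add(chosen)
        order.append(chosen)
    return order


def toy_detector(image: Sequence[float], known: Set[int]) -> Tuple[List[float], float]:
    # Illustration only: intensity is the score; the second score is the best
    # remaining intensity.
    scores = [0.0 if p in known else v for p, v in enumerate(image)]
    return scores, max(scores)


print(detect_positions([0.1, 0.9, 0.2, 0.8], toy_detector))  # → [1, 3]
```

With this rule the number of time steps adapts to the image: the loop ends once the second score falls below the threshold instead of running for a fixed count.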

18. A computer-implemented method for training a detector neural network, comprising:
- obtaining, by a system of one or more computers, a plurality of training data sets, wherein each training data set includes:
  - (i) a first training input that represents an input image that shows a set of objects,
  - (ii) a second training input that identifies a first set of positions, of a plurality of positions of the input image, that each shows a respective object of a first subset of the set of objects shown in the input image, and
  - (iii) a target output that identifies a second set of positions, of the plurality of positions of the input image, that each shows a respective object of the set of objects that is not among the first subset of objects;
- training, by the system, the detector neural network on the plurality of training data sets, including, for each training data set of the plurality of training data sets:
  - processing the first training input and the second training input to generate a set of output scores that includes a respective output score for each position of the plurality of positions of the input image;
  - determining an output error using the target output and the set of output scores; and
  - adjusting current values of parameters of the detector neural network using the error.
- Dependent claims (19, 20)
  - 19. The method of claim 18, wherein, for each training data set, the second set of positions of the input image identified by the target output specifies every position of the plurality of positions of the input image that shows a respective object of the set of objects that is not among the first subset of objects.
  - 20. The method of claim 18, wherein, for each training data set, the second set of positions of the input image identified by the target output specifies only one position of the plurality of positions of the input image that shows an object of the set of objects that is not among the first subset of objects.
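
The training procedure of claims 18 to 20 can be sketched on a toy model. Everything model-specific here is an assumption: the two-parameter linear scorer, the softmax output, the cross-entropy error, and the plain gradient step are illustrative choices, and `make_example` and `train_step` are hypothetical helpers. The claims require only that an error be computed from the target output and the scores and then used to adjust the parameters.

```python
import math
from typing import List, Sequence, Set, Tuple

Example = Tuple[List[float], List[float], List[float]]  # image, known mask, target


def softmax(logits: Sequence[float]) -> List[float]:
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_example(image: Sequence[float], objects: Set[int], first_subset: Set[int],
                 single_target: bool = False) -> Example:
    """Build one training data set; assumes at least one object is outside first_subset."""
    remaining = sorted(objects - first_subset)
    # Claim 19: target every remaining object position; claim 20: only one.
    targets = remaining[:1] if single_target else remaining
    mask = [1.0 if p in first_subset else 0.0 for p in range(len(image))]
    target = [1.0 / len(targets) if p in targets else 0.0 for p in range(len(image))]
    return list(image), mask, target


def train_step(params: Tuple[float, float], example: Example,
               lr: float = 0.5) -> Tuple[Tuple[float, float], float]:
    w, u = params  # toy parameters: weight on the image, weight on the known mask
    image, mask, target = example
    scores = softmax([w * x + u * k for x, k in zip(image, mask)])
    # Output error: cross-entropy between the target and the scores (assumed loss).
    loss = -sum(t * math.log(s) for t, s in zip(target, scores) if t > 0)
    # For softmax with cross-entropy, the gradient w.r.t. each logit is score - target.
    grad_w = sum((s - t) * x for s, t, x in zip(scores, target, image))
    grad_u = sum((s - t) * k for s, t, k in zip(scores, target, mask))
    return (w - lr * grad_w, u - lr * grad_u), loss


example = make_example([0.1, 0.9, 0.2, 0.8], objects={1, 3}, first_subset={1})
params = (0.0, 0.0)
for _ in range(50):
    params, loss = train_step(params, example)
```

In this example the object at position 1 is already in the first subset, so the target puts all of its weight on position 3, and the updates push the score for position 1 down while raising the score for position 3.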

Current Assignee
Waymo LLC (Alphabet Inc.)
Original Assignee
Waymo LLC (Alphabet Inc.)
Inventors
Ogale, Abhijit; Krizhevsky, Alexander; Lo, Wan-Yen
Primary Examiner(s)
Harandi, Siamak

Application Number

US15/381,288
Time in Patent Office

564 Days
CPC Class Codes

G06F 18/2413   based on distances to train...

G06N 3/04   Architecture, e.g. intercon...

G06N 3/045   Combinations of networks

G06N 3/08   Learning methods

G06N 3/084   Backpropagation, e.g. using...

G06T 2207/10004   Still image; Photographic i...

G06T 2207/10024   Color image

G06T 2207/20081   Training; Learning

G06T 2207/20084   Artificial neural networks ...

G06T 7/73   using feature-based methods

G06T 7/74   involving reference images ...

G06V 10/75   Organisation of the matchin...

G06V 10/764   using classification, e.g. ...

G06V 20/56   exterior to a vehicle by us...
