Neural networks for object detection
Abstract
A neural network system for identifying positions of objects in an input image can include an object detector neural network, a memory interface subsystem, and an external memory. The object detector neural network is configured to, at each time step of multiple successive time steps, (i) receive a first neural network input that represents the input image and a second neural network input that identifies a first set of positions of the input image that have each been classified as showing a respective object of the set of objects, and (ii) process the first and second inputs to generate a set of output scores that each represents a respective likelihood that an object that is not one of the objects shown at any of the positions in the first set of positions is shown at a respective position of the input image that corresponds to the output score.
20 Claims
1. A neural network system for identifying positions of objects in a set of objects shown in an input image, the neural network system comprising:
- a detector neural network that is configured to, at each time step in a plurality of time steps:
  - receive (i) a first neural network input that represents the input image and (ii) a second neural network input that identifies a first set of positions of the input image that have each been classified as showing a respective object of the set of objects; and
  - process the first neural network input and the second neural network input to generate a set of output scores that each represents a respective likelihood that an object that is not one of the objects shown at any of the positions in the first set of positions is shown at a respective position of the input image that corresponds to the output score, wherein each output score of the set of output scores corresponds to a different position of a plurality of positions of the input image;
- an external memory that is configured to store the second neural network input; and
- a memory interface subsystem that is configured to, at each time step in the plurality of time steps:
  - select, based on the set of output scores generated by the detector neural network at the time step, a particular position of the plurality of positions of the input image that is not currently among the first set of positions that have been classified as showing respective objects of the set of objects; and
  - classify the selected particular position of the input image as showing an object of the set of objects shown in the input image, including updating the second neural network input stored in the external memory by adding the selected particular position of the input image to the first set of positions identified by the second neural network input.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
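For illustration only, the interaction among the three components recited in claim 1 might be sketched in Python as follows. The names `ExternalMemory`, `MemoryInterface`, `toy_detector`, and `detect` are hypothetical and do not come from the patent. `toy_detector` is a hand-written stand-in rather than a trained neural network, and positions are assumed to be indices into a flat list of pixel intensities.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set

Scores = List[float]


@dataclass
class ExternalMemory:
    """Holds the second network input: positions already classified as objects."""
    positions: Set[int] = field(default_factory=set)


def toy_detector(image: List[float], known: Set[int]) -> Scores:
    """Hypothetical stand-in for the detector neural network (assumption).

    Uses pixel intensity as the score and gives a zero score to positions
    already in `known`, so they are not reported as new objects.
    """
    return [0.0 if i in known else v for i, v in enumerate(image)]


class MemoryInterface:
    """Selects a new position from the scores and writes it to the memory."""

    def __init__(self, memory: ExternalMemory) -> None:
        self.memory = memory

    def step(self, scores: Scores) -> int:
        # Only positions not yet in the first set of positions are eligible.
        candidates = [i for i in range(len(scores))
                      if i not in self.memory.positions]
        best = max(candidates, key=lambda i: scores[i])
        # Classifying the position means adding it to the stored second input.
        self.memory.positions.add(best)
        return best


def detect(image: List[float],
           detector: Callable[[List[float], Set[int]], Scores],
           num_steps: int) -> List[int]:
    """Run `num_steps` time steps and return positions in selection order."""
    if num_steps > len(image):
        raise ValueError("more time steps than positions")
    memory = ExternalMemory()
    interface = MemoryInterface(memory)
    found = []
    for _ in range(num_steps):
        scores = detector(image, memory.positions)  # first and second inputs
        found.append(interface.step(scores))
    return found
```

In this sketch the number of time steps is fixed by the caller. The claim itself does not say how the number of steps is chosen, so that is a simplification.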
12. A computer-implemented method for identifying positions of objects in a set of objects shown in an input image, the method comprising:
- for each time step in a plurality of time steps:
  - receiving a first neural network input that represents the input image;
  - receiving a second neural network input that identifies a first set of positions of the input image that have each been classified as showing a respective object of the set of objects;
  - processing, by a detector neural network, the first neural network input and the second neural network input to generate a set of output scores that each represents a respective likelihood that an object that is not one of the objects shown at any of the positions in the first set of positions is shown at a respective position of the input image that corresponds to the output score, wherein each output score of the set of output scores corresponds to a different position of a plurality of positions of the input image;
  - selecting, based on the set of output scores, a particular position of the plurality of positions of the input image that is not currently among the first set of positions that have been classified as showing respective objects of the set of objects; and
  - classifying the selected particular position of the input image as showing an object of the set of objects shown in the input image, including adding the selected particular position of the input image to the first set of positions identified by the second neural network input.
Dependent claims: 13, 14, 15, 16, 17
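The selecting and classifying steps of claim 12 can be pictured on a two-dimensional grid of positions. The sketch below assumes, purely for illustration, that the second neural network input is encoded as a binary mask with the same shape as the score grid. The claim does not prescribe that encoding, and the function names `encode_positions`, `select_position`, and `time_step` are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Position = Tuple[int, int]  # (row, column)
Grid = List[List[float]]


def encode_positions(positions: Set[Position], height: int, width: int) -> Grid:
    """Binary grid with 1.0 wherever a position is already classified."""
    return [[1.0 if (r, c) in positions else 0.0 for c in range(width)]
            for r in range(height)]


def select_position(scores: Grid, mask: Grid) -> Position:
    """Return the highest-scoring position whose mask entry is 0."""
    best: Optional[Position] = None
    best_score = float("-inf")
    for r, row in enumerate(scores):
        for c, score in enumerate(row):
            if mask[r][c] == 0.0 and score > best_score:
                best, best_score = (r, c), score
    if best is None:
        raise ValueError("every position is already classified")
    return best


def time_step(scores: Grid, positions: Set[Position]) -> Set[Position]:
    """One selecting-and-classifying step: returns the enlarged position set."""
    mask = encode_positions(positions, len(scores), len(scores[0]))
    chosen = select_position(scores, mask)
    return positions | {chosen}
```

For example, with scores `[[0.2, 0.9], [0.7, 0.1]]` and `(0, 1)` already classified, `time_step` adds `(1, 0)`, the best-scoring position that is not yet in the set.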
18. A computer-implemented method for training a detector neural network, comprising:
- obtaining, by a system of one or more computers, a plurality of training data sets, wherein each training data set includes:
  - (i) a first training input that represents an input image that shows a set of objects,
  - (ii) a second training input that identifies a first set of positions, of a plurality of positions of the input image, that each shows a respective object of a first subset of the set of objects shown in the input image, and
  - (iii) a target output that identifies a second set of positions, of the plurality of positions of the input image, that each shows a respective object of the set of objects that is not among the first subset of objects;
- training, by the system, the detector neural network on the plurality of training data sets, including, for each training data set of the plurality of training data sets:
  - processing the first training input and the second training input to generate a set of output scores that includes a respective output score for each position of the plurality of positions of the input image;
  - determining an output error using the target output and the set of output scores; and
  - adjusting current values of parameters of the detector neural network using the error.
Dependent claims: 19, 20
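A minimal sketch of the training procedure in claim 18 follows, under several stated assumptions. The detector is replaced by a hypothetical three-parameter logistic model (`TinyDetector`), the output error is taken to be cross-entropy, and each training data set is built by randomly splitting the labeled object positions into an "already found" subset and a target. The claim itself names no particular model, error function, or way of constructing the subsets.

```python
import math
import random
from typing import List, Set, Tuple

# (first training input, second training input as a mask, target output)
TrainingSet = Tuple[List[float], List[float], List[float]]


def make_training_set(image: List[float], object_positions: Set[int],
                      rng: random.Random) -> TrainingSet:
    """Split labeled positions into a known subset and the remaining target."""
    ordered = sorted(object_positions)
    k = rng.randint(0, len(ordered) - 1)  # keep at least one object as target
    known = set(rng.sample(ordered, k))
    mask = [1.0 if i in known else 0.0 for i in range(len(image))]
    target = [1.0 if i in object_positions and i not in known else 0.0
              for i in range(len(image))]
    return image, mask, target


def cross_entropy(scores: List[float], target: List[float]) -> float:
    """Mean binary cross-entropy, used here as the output error (assumption)."""
    eps = 1e-12
    total = 0.0
    for s, t in zip(scores, target):
        s = min(max(s, eps), 1.0 - eps)
        total -= t * math.log(s) + (1.0 - t) * math.log(1.0 - s)
    return total / len(scores)


class TinyDetector:
    """Hypothetical model: score_i = sigmoid(a * pixel_i + b * mask_i + c)."""

    def __init__(self) -> None:
        self.a = self.b = self.c = 0.0

    def scores(self, image: List[float], mask: List[float]) -> List[float]:
        return [1.0 / (1.0 + math.exp(-(self.a * x + self.b * m + self.c)))
                for x, m in zip(image, mask)]

    def train_step(self, image: List[float], mask: List[float],
                   target: List[float], lr: float = 0.5) -> float:
        out = self.scores(image, mask)
        error = cross_entropy(out, target)
        # For sigmoid outputs with cross-entropy, d(error)/d(logit_i) is
        # (score_i - target_i) / n.
        n = len(out)
        grads = [(s - t) / n for s, t in zip(out, target)]
        self.a -= lr * sum(g * x for g, x in zip(grads, image))
        self.b -= lr * sum(g * m for g, m in zip(grads, mask))
        self.c -= lr * sum(grads)
        return error


def train(detector: TinyDetector, training_sets: List[TrainingSet],
          epochs: int = 100) -> None:
    """Adjust the parameters once per training data set, for several epochs."""
    for _ in range(epochs):
        for image, mask, target in training_sets:
            detector.train_step(image, mask, target)
```

Because the target excludes positions in the known subset, the model is pushed to score an already-classified object low even though its pixels look like an object, which mirrors the behavior the claims require of the detector at inference time.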
Specification