Training a neural network to detect objects in images
First Claim
1. A method for training a neural network that receives an input image and outputs a predetermined number of candidate bounding boxes that each cover a respective portion of the input image at a respective position in the input image and a respective confidence score for each candidate bounding box that represents a likelihood that the candidate bounding box contains an image of an object, the method comprising:
receiving a training image and object location data for the training image, wherein the object location data identifies one or more object locations in the training image;
providing the training image to the neural network and obtaining bounding box data for the training image from the neural network, wherein the bounding box data comprises data defining a plurality of candidate bounding boxes in the training image and a respective confidence score for each candidate bounding box in the training image;
determining an optimal set of assignments using the object location data for the training image and the bounding box data for the training image, wherein the optimal set of assignments assigns a respective candidate bounding box to each of the object locations; and
training the neural network on the training image using the optimal set of assignments.
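The "determining an optimal set of assignments" step can be illustrated with a small sketch. Claim 1 does not say what makes a set of assignments optimal, so everything specific here is an assumption made for illustration: boxes are `(x_min, y_min, x_max, y_max)` tuples, the cost of pairing a candidate box with an object location is the squared L2 distance between their coordinates, each object receives a distinct candidate box, and the helper names `box_l2` and `optimal_assignment` are hypothetical. The search is exhaustive, which is only practical for a handful of boxes.

```python
from itertools import permutations


def box_l2(a, b):
    """Squared L2 distance between two boxes given as coordinate tuples (assumed cost)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def optimal_assignment(candidate_boxes, object_boxes, cost=box_l2):
    """Assign one distinct candidate box to each object location at minimum total cost.

    Returns (assignment, total_cost), where assignment[j] is the index of the
    candidate box assigned to object j. Brute force over all ordered selections
    of candidates, so intended for small inputs only.
    """
    k, m = len(candidate_boxes), len(object_boxes)
    if m > k:
        raise ValueError("more object locations than candidate boxes")
    best, best_cost = None, float("inf")
    for perm in permutations(range(k), m):
        total = sum(
            cost(candidate_boxes[i], object_boxes[j]) for j, i in enumerate(perm)
        )
        if total < best_cost:
            best, best_cost = perm, total
    return list(best), best_cost


# Three candidate boxes from the network, two annotated object locations.
candidates = [(0.0, 0.0, 0.5, 0.5), (0.4, 0.4, 1.0, 1.0), (0.1, 0.1, 0.3, 0.3)]
objects = [(0.5, 0.5, 1.0, 1.0), (0.0, 0.0, 0.4, 0.4)]
assignment, total_cost = optimal_assignment(candidates, objects)
print(assignment)  # [1, 0]: object 0 gets candidate 1, object 1 gets candidate 0
```

For a realistic number of candidate boxes, a polynomial-time bipartite matching solver (for example the Hungarian algorithm) would replace the exhaustive loop; the cost function would also likely account for the confidence scores, which this sketch ignores.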
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network to detect objects in images. One of the methods includes receiving a training image and object location data for the training image; providing the training image to a neural network and obtaining bounding box data for the training image from the neural network, wherein the bounding box data comprises data defining a plurality of candidate bounding boxes in the training image and a respective confidence score for each candidate bounding box in the training image; determining an optimal set of assignments using the object location data for the training image and the bounding box data for the training image, wherein the optimal set of assignments assigns a respective candidate bounding box to each of the object locations; and training the neural network on the training image using the optimal set of assignments.
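The final step, training on the image using the assignments, implies some loss that depends on which candidate boxes were assigned. The abstract does not define that loss, so the following is only one plausible form, stated as an assumption: a squared-error localization term over the assigned boxes plus a binary cross-entropy term that pushes the confidence of assigned boxes toward 1 and of all other boxes toward 0. The function name `detection_loss` and the weight `alpha` are hypothetical.

```python
import math


def detection_loss(boxes, scores, object_boxes, assignment, alpha=1.0, eps=1e-7):
    """Illustrative per-image loss given a fixed set of assignments (assumed form).

    boxes:        candidate boxes output by the network, as coordinate tuples
    scores:       confidence in [0, 1] for each candidate box
    object_boxes: annotated object locations, as coordinate tuples
    assignment:   assignment[j] is the candidate index assigned to object j
    """
    matched = set(assignment)

    # Localization: squared coordinate error between each object and its assigned box.
    loc = sum(
        sum((p - t) ** 2 for p, t in zip(boxes[i], object_boxes[j]))
        for j, i in enumerate(assignment)
    )

    # Confidence: cross-entropy with target 1 for assigned boxes, 0 otherwise.
    conf = 0.0
    for i, s in enumerate(scores):
        s = min(max(s, eps), 1.0 - eps)  # clamp to keep log() finite
        conf -= math.log(s) if i in matched else math.log(1.0 - s)

    return alpha * loc + conf


boxes = [(0.0, 0.0, 0.5, 0.5), (0.4, 0.4, 1.0, 1.0), (0.1, 0.1, 0.3, 0.3)]
targets = [(0.5, 0.5, 1.0, 1.0), (0.0, 0.0, 0.4, 0.4)]
assignment = [1, 0]  # object 0 -> box 1, object 1 -> box 0

confident = detection_loss(boxes, [0.9, 0.9, 0.1], targets, assignment)
confused = detection_loss(boxes, [0.1, 0.1, 0.9], targets, assignment)
print(confident < confused)  # True: well-calibrated confidences give a lower loss
```

In an actual training loop this value would be computed inside an automatic-differentiation framework so that its gradient with respect to the network parameters can drive the update; the plain-Python version only shows how the assignments enter the objective.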
20 Claims
1. A method for training a neural network that receives an input image and outputs a predetermined number of candidate bounding boxes that each cover a respective portion of the input image at a respective position in the input image and a respective confidence score for each candidate bounding box that represents a likelihood that the candidate bounding box contains an image of an object, the method comprising:
receiving a training image and object location data for the training image, wherein the object location data identifies one or more object locations in the training image;
providing the training image to the neural network and obtaining bounding box data for the training image from the neural network, wherein the bounding box data comprises data defining a plurality of candidate bounding boxes in the training image and a respective confidence score for each candidate bounding box in the training image;
determining an optimal set of assignments using the object location data for the training image and the bounding box data for the training image, wherein the optimal set of assignments assigns a respective candidate bounding box to each of the object locations; and
training the neural network on the training image using the optimal set of assignments.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A system for training a neural network that receives an input image and outputs a predetermined number of candidate bounding boxes that each cover a respective portion of the input image at a respective position in the input image and a respective confidence score for each candidate bounding box that represents a likelihood that the candidate bounding box contains an image of an object, the system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to perform operations comprising:
receiving a training image and object location data for the training image, wherein the object location data identifies one or more object locations in the training image;
providing the training image to the neural network and obtaining bounding box data for the training image from the neural network, wherein the bounding box data comprises data defining a plurality of candidate bounding boxes in the training image and a respective confidence score for each candidate bounding box in the training image;
determining an optimal set of assignments using the object location data for the training image and the bounding box data for the training image, wherein the optimal set of assignments assigns a respective candidate bounding box to each of the object locations; and
training the neural network on the training image using the optimal set of assignments.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
20. A computer storage medium encoded with a computer program, the computer program comprising instructions that when executed by one or more computers cause the one or more computers to perform operations for training a neural network that receives an input image and outputs a predetermined number of candidate bounding boxes that each cover a respective portion of the input image at a respective position in the input image and a respective confidence score for each candidate bounding box that represents a likelihood that the candidate bounding box contains an image of an object, the operations comprising:
receiving a training image and object location data for the training image, wherein the object location data identifies one or more object locations in the training image;
providing the training image to the neural network and obtaining bounding box data for the training image from the neural network, wherein the bounding box data comprises data defining a plurality of candidate bounding boxes in the training image and a respective confidence score for each candidate bounding box in the training image;
determining an optimal set of assignments using the object location data for the training image and the bounding box data for the training image, wherein the optimal set of assignments assigns a respective candidate bounding box to each of the object locations; and
training the neural network on the training image using the optimal set of assignments.
Specification