Object detection and classification in images
First Claim
1. A method comprising:
- receiving an input image;
- generating a convolutional feature map;
- identifying, by a first type of neural network, a candidate object in the input image;
- determining, by a second type of neural network, a category of the candidate object; and
- assigning a confidence score to the category of the candidate object,
wherein the first type of neural network comprises a translation invariant component configured to:
  - classify an anchor based on overlap with a ground-truth box; and
  - predict a shift and a scale of the anchor.
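
The claim does not say how the predicted shift and scale are encoded. As a minimal sketch, assume the center/size encoding commonly used for box regression: the shift moves the anchor center by a fraction of the anchor's width and height, and the scale multiplies width and height by an exponential factor. The function name `apply_shift_and_scale`, the `(x1, y1, x2, y2)` box layout, and the `(dx, dy, dw, dh)` tuple are illustrative assumptions, not terms from the patent.

```python
import math


def apply_shift_and_scale(anchor, deltas):
    """Refine an anchor box with a predicted shift and scale.

    anchor: (x1, y1, x2, y2) corner coordinates (assumed layout).
    deltas: (dx, dy, dw, dh), an assumed encoding where dx, dy shift the
            center relative to anchor size and dw, dh are log-scale factors.
    """
    x1, y1, x2, y2 = anchor
    dx, dy, dw, dh = deltas

    # Convert corners to center/size form.
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h

    # Shift: move the center by a fraction of the anchor's size.
    new_cx = cx + dx * w
    new_cy = cy + dy * h

    # Scale: resize width and height multiplicatively.
    new_w = w * math.exp(dw)
    new_h = h * math.exp(dh)

    # Convert back to corner form.
    return (
        new_cx - 0.5 * new_w,
        new_cy - 0.5 * new_h,
        new_cx + 0.5 * new_w,
        new_cy + 0.5 * new_h,
    )


# Usage: shift a 10x10 anchor right by 10% of its width and double its width.
refined = apply_shift_and_scale((0.0, 0.0, 10.0, 10.0), (0.1, 0.0, math.log(2.0), 0.0))
```

With all four deltas set to zero the anchor is returned unchanged, which makes the encoding easy to sanity-check.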
Abstract
Systems, methods, and computer-readable media for providing fast and accurate object detection and classification in images are described herein. In some examples, a computing device can receive an input image. The computing device can process the image, and generate a convolutional feature map. In some configurations, the convolutional feature map can be processed through a Region Proposal Network (RPN) to generate proposals for candidate objects in the image. In various examples, the computing device can process the convolutional feature map with the proposals through a Fast Region-Based Convolutional Neural Network (FRCN) proposal classifier to determine a class of each object in the image and a confidence score associated therewith. The computing device can then provide a requestor with an output including the object classification and/or confidence score.
17 Claims
1. A method comprising:
- receiving an input image;
- generating a convolutional feature map;
- identifying, by a first type of neural network, a candidate object in the input image;
- determining, by a second type of neural network, a category of the candidate object; and
- assigning a confidence score to the category of the candidate object,
wherein the first type of neural network comprises a translation invariant component configured to:
  - classify an anchor based on overlap with a ground-truth box; and
  - predict a shift and a scale of the anchor.
Dependent claims: 2, 3
4. A method comprising:
- receiving an input image;
- generating a convolutional feature map;
- identifying, by a first type of neural network, a candidate object in the input image, wherein the identifying the candidate object in the input image comprises:
  - generating one or more anchors at a point of the input image;
  - determining an overlap of individual ones of the one or more anchors to a ground-truth box;
  - assigning a label to each anchor of the one or more anchors based at least in part on the overlap;
  - assigning a score to the label based at least in part on the overlap; and
  - identifying the candidate object at the point based at least in part on the score;
- determining, by a second type of neural network, a category of the candidate object, wherein the first type of neural network and the second type of neural network share at least one algorithm; and
- assigning a confidence score to the category of the candidate object.
Dependent claims: 5, 6, 7
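
The anchor steps recited in claim 4 (generate anchors at a point, measure overlap with a ground-truth box, label and score each anchor) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: overlap is taken to be intersection-over-union (IoU), the score is taken to be the IoU itself, and the thresholds `pos_thresh=0.7` and `neg_thresh=0.3`, the anchor sizes and aspect ratios, and the helper names `iou`, `anchors_at`, and `label_anchors` are all hypothetical choices that the claim does not specify.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def anchors_at(cx, cy, sizes=(8.0, 16.0), ratios=(0.5, 1.0, 2.0)):
    """Generate anchors centered on one point.

    Each anchor has area size*size and width/height ratio `ratio`
    (sizes and ratios are assumed values for illustration).
    """
    boxes = []
    for size in sizes:
        for ratio in ratios:
            w = size * ratio ** 0.5
            h = size / ratio ** 0.5
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


def label_anchors(anchors, gt_box, pos_thresh=0.7, neg_thresh=0.3):
    """Label each anchor by its overlap with the ground-truth box.

    Returns (anchor, label, score) tuples where label is 1 (object),
    0 (background), or -1 (ignored), and score is the IoU.
    """
    labeled = []
    for anchor in anchors:
        overlap = iou(anchor, gt_box)
        if overlap >= pos_thresh:
            label = 1
        elif overlap < neg_thresh:
            label = 0
        else:
            label = -1
        labeled.append((anchor, label, overlap))
    return labeled


# Usage: label the anchors at point (8, 8) against one ground-truth box,
# then pick the anchor with the highest score as the candidate at that point.
results = label_anchors(anchors_at(8.0, 8.0), gt_box=(0.0, 0.0, 16.0, 16.0))
best_anchor, best_label, best_score = max(results, key=lambda item: item[2])
```

Using the overlap directly as the score keeps the sketch short; a trained network would instead predict a score for each anchor, with these labels serving as training targets.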
8. A system comprising:
- a processor; and
- a computer-readable medium including instructions for an object detection and classification network, for execution by the processor, the object detection and classification network comprising:
  - an initial processing module configured to input an image and generate a convolutional feature map;
  - an object proposal module configured to generate a proposal corresponding to a candidate object in the image, and further comprising a translation invariant component configured to:
    - classify an anchor based on overlap with a ground-truth box; and
    - predict a shift and a scale of the anchor; and
  - a proposal classifier module configured to assign a category associated with the candidate object, wherein the object proposal module and the proposal classifier module share at least one convolutional layer.
Dependent claims: 9, 10
11. A system comprising:
- a processor; and
- a computer-readable medium including instructions for an object detection and classification network, for execution by the processor, the object detection and classification network comprising:
  - an initial processing module configured to input an image and generate a convolutional feature map;
  - an object proposal module configured to generate a proposal corresponding to a candidate object in the image, wherein the object proposal module is further configured to:
    - identify an anchor corresponding to a highest score, the highest score corresponding to a percentage of the overlap;
    - shift the anchor corresponding to the highest score to better define the candidate object; or
    - scale the anchor corresponding to the highest score to better define the candidate object; and
  - a proposal classifier module configured to assign a category associated with the candidate object, wherein the object proposal module and the proposal classifier module share at least one convolutional layer.
12. A system comprising:
- a processor;
- a computer-readable medium including instructions for an object detection and classification network, for execution by the processor, the object detection and classification network comprising:
  - an initial processing module configured to input an image and generate a convolutional feature map;
  - an object proposal module configured to generate a proposal corresponding to a candidate object in the image; and
  - a proposal classifier module configured to assign a category associated with the candidate object, wherein the object proposal module and the proposal classifier module share at least one convolutional layer; and
- a machine learning module configured to:
  - train one or more parameters of the initial processing module and the object proposal module to generate one or more proposals on a training image; and
  - train one or more parameters of the proposal classifier module to assign a category to each of the one or more proposals on the training image.
Dependent claims: 13
14. A non-transitory computer readable storage medium having instructions stored thereon, the instructions when executed by a computing device cause the computing device to:
- receive an input image;
- generate a convolutional feature map;
- generate one or more anchors at a point of the input image;
- determine an overlap of individual ones of the one or more anchors to a ground-truth box;
- assign a label to each anchor of the one or more anchors based at least in part on the overlap;
- assign a score to the label based at least in part on the overlap;
- identify, by a neural network, a candidate object in the input image, the candidate object at the point based at least in part on the score;
- determine, by a proposal classifier sharing an algorithm with the neural network, a category of the candidate object; and
- assign, by the proposal classifier, a confidence score to the category of the candidate object.
Dependent claims: 15, 16
17. A non-transitory computer readable storage medium having instructions stored thereon, the instructions when executed on a computing device cause the computing device to:
- receive an input image;
- generate a convolutional feature map;
- identify, by a neural network, a candidate object in the input image;
- determine, by a proposal classifier sharing an algorithm with the neural network, a category of the candidate object; and
- assign, by the proposal classifier, a confidence score to the category of the candidate object,
wherein the neural network comprises a translation invariant component configured to:
  - classify an anchor based on overlap with a ground-truth box; and
  - predict a shift and a scale of the anchor.
Specification