Object detection using cascaded convolutional neural networks
First Claim
1. A method comprising:
- identifying multiple candidate windows in an image, each candidate window including a group of pixels of the image, the multiple candidate windows including overlapping candidate windows;
- identifying one or more of the multiple candidate windows that include an object, the identifying including analyzing the multiple candidate windows using cascaded convolutional neural networks, the cascaded convolutional neural networks including multiple cascade layers, each cascade layer comprising a convolutional neural network, the multiple cascade layers including a first cascade layer that analyzes the identified multiple candidate windows, a second cascade layer that analyzes ones of the multiple candidate windows identified by the first cascade layer as including an object, and a third cascade layer that analyzes ones of the multiple candidate windows identified by the second cascade layer as including an object; and
- outputting, as an indication of one or more objects in the image, an indication of one or more of the multiple candidate windows identified by the third cascade layer as including an object.
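The three-layer cascade recited in the claim can be illustrated with a minimal sketch. This is not the patented implementation: the `Window` tuple layout, the `run_cascade` helper, the thresholds, and the three toy scoring functions are all hypothetical stand-ins for trained convolutional neural networks. It only shows how each cascade layer analyzes the windows that the previous layer identified as including an object.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed representation: a candidate window is (x, y, width, height) in pixels.
Window = Tuple[int, int, int, int]
# A stage scores a window; here it stands in for a convolutional neural network.
Stage = Callable[[Window], float]


def run_cascade(windows: Sequence[Window],
                stages: Sequence[Tuple[Stage, float]]) -> List[Window]:
    """Pass windows through each stage in order.

    A window survives a stage only if its score is at or above that
    stage's threshold, so later stages see fewer windows.
    """
    surviving = list(windows)
    for score, threshold in stages:
        surviving = [w for w in surviving if score(w) >= threshold]
    return surviving


# Toy scorers (hypothetical; a real system would evaluate trained CNNs).
def stage1(w: Window) -> float:
    return 1.0 if w[2] >= 12 else 0.0   # reject windows narrower than 12 px


def stage2(w: Window) -> float:
    return 1.0 if w[0] % 2 == 0 else 0.0  # arbitrary rule for illustration


def stage3(w: Window) -> float:
    return 1.0 if w[1] < 10 else 0.0      # arbitrary rule for illustration


candidates = [(0, 0, 12, 12), (1, 0, 12, 12), (2, 20, 24, 24),
              (4, 4, 8, 8), (6, 2, 48, 48)]
detections = run_cascade(candidates,
                         [(stage1, 0.5), (stage2, 0.5), (stage3, 0.5)])
print(detections)
```

The windows returned by the last stage play the role of the output indication in the claim. Ordering the stages from cheapest to most expensive is a common design choice for cascades, though the claim itself does not require it.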
Abstract
Different candidate windows in an image are identified, such as by sliding a rectangular or other geometric shape of different sizes over the image to identify portions of the image (groups of pixels in the image). The candidate windows are analyzed by a set of convolutional neural networks, which are cascaded so that the input of one convolutional neural network layer is based on the output of another convolutional neural network layer. Each convolutional neural network layer drops or rejects one or more candidate windows that the layer determines do not include an object (e.g., a face). The candidate windows that a layer identifies as including an object are analyzed by the next convolutional neural network layer. The candidate windows identified by the last of the convolutional neural network layers are the indications of the objects (e.g., faces) in the image.
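The candidate-window step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the specification: the `sliding_windows` and `overlaps` helpers are hypothetical, the windows are restricted to squares, and the window sizes (12 and 24 pixels) and stride (4 pixels) are arbitrary example values. A stride smaller than the window size is what produces overlapping candidate windows.

```python
from typing import List, Sequence, Tuple

# Assumed representation: a candidate window is (x, y, width, height) in pixels.
Window = Tuple[int, int, int, int]


def sliding_windows(image_w: int, image_h: int,
                    sizes: Sequence[int], stride: int) -> List[Window]:
    """Slide square windows of each size over the image, row by row."""
    windows: List[Window] = []
    for size in sizes:
        # range() is empty when the window is larger than the image.
        for y in range(0, image_h - size + 1, stride):
            for x in range(0, image_w - size + 1, stride):
                windows.append((x, y, size, size))
    return windows


def overlaps(a: Window, b: Window) -> bool:
    """Return True if the two windows share at least one pixel."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


# Example values only: a 32x32 image, two window sizes, stride of 4 pixels.
windows = sliding_windows(32, 32, sizes=(12, 24), stride=4)
print(len(windows))
print(overlaps(windows[0], windows[1]))
```

Each resulting tuple identifies the group of pixels that a cascade layer would then analyze.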
20 Claims
1. A method comprising:
- identifying multiple candidate windows in an image, each candidate window including a group of pixels of the image, the multiple candidate windows including overlapping candidate windows;
- identifying one or more of the multiple candidate windows that include an object, the identifying including analyzing the multiple candidate windows using cascaded convolutional neural networks, the cascaded convolutional neural networks including multiple cascade layers, each cascade layer comprising a convolutional neural network, the multiple cascade layers including a first cascade layer that analyzes the identified multiple candidate windows, a second cascade layer that analyzes ones of the multiple candidate windows identified by the first cascade layer as including an object, and a third cascade layer that analyzes ones of the multiple candidate windows identified by the second cascade layer as including an object; and
- outputting, as an indication of one or more objects in the image, an indication of one or more of the multiple candidate windows identified by the third cascade layer as including an object.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A cascaded convolutional neural networks object detection system comprising:
- an image access module configured to obtain an image;
- a first cascade layer comprising a first convolutional neural network, the first cascade layer configured to analyze multiple candidate windows identified in the image to identify a first set of the multiple candidate windows that include an object, each candidate window including a group of pixels of the image, the multiple candidate windows including overlapping candidate windows;
- a second cascade layer comprising a second convolutional neural network, the second cascade layer configured to analyze the first set of the multiple candidate windows to identify a second set of the multiple candidate windows that include an object;
- a third cascade layer comprising a third convolutional neural network, the third cascade layer configured to analyze the second set of the multiple candidate windows to identify a third set of the multiple candidate windows that include an object; and
- an output module configured to output, for each candidate window of the third set of the multiple candidate windows, an indication of the object included in the candidate window.
Dependent claims: 12, 13, 14, 15, 16
17. A computing device comprising:
- one or more processors; and
- one or more computer-readable storage media having stored thereon multiple instructions that, when executed by the one or more processors, cause the one or more processors to perform acts comprising:
- identifying multiple candidate windows in an image, each candidate window including a group of pixels of the image, the multiple candidate windows including overlapping candidate windows;
- identifying one or more of the multiple candidate windows that include an object, the identifying including analyzing the multiple candidate windows using cascaded convolutional neural networks, the cascaded convolutional neural networks including multiple cascade layers, each cascade layer comprising a convolutional neural network, the multiple cascade layers including a first cascade layer that analyzes the identified multiple candidate windows, a second cascade layer that analyzes ones of the multiple candidate windows identified by the first cascade layer as including an object, and a third cascade layer that analyzes ones of the multiple candidate windows identified by the second cascade layer as including an object; and
- outputting, as an indication of one or more objects in the image, an indication of one or more of the multiple candidate windows identified by the third cascade layer as including an object.
Dependent claims: 18, 19, 20
Specification