Item put and take detection using image recognition
First Claim
1. A system for tracking puts and takes of inventory items by subjects in an area of real space, comprising:
- a plurality of cameras, cameras in the plurality of cameras producing respective sequences of images of corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras;
- a processing system coupled to the plurality of cameras, the processing system including a plurality of image recognition engines, receiving corresponding sequences of images from the plurality of cameras, image recognition engines in the plurality of image recognition engines processing the images in the corresponding sequences to identify subjects represented in the images; and
- logic to process sets of images in the sequences of images that include the identified subjects to detect takes of inventory items by identified subjects and puts of inventory items on shelves by identified subjects, wherein the logic to process sets of images includes:
  - for identified subjects, logic to process images to generate classifications of the images of the identified subjects, the classifications including whether the identified subject is holding an inventory item, a first nearness classification indicating a location of a hand of the identified subject relative to a shelf, a second nearness classification indicating a location of a hand of the identified subject relative to the identified subject.
1 Assignment
0 Petitions
Abstract
Systems and techniques are provided for tracking puts and takes of inventory items by subjects in an area of real space. A plurality of cameras with overlapping fields of view produce respective sequences of images of corresponding fields of view in the real space. A processing system is coupled to the plurality of cameras. In one embodiment, the processing system comprises image recognition engines receiving corresponding sequences of images from the plurality of cameras. The image recognition engines process the images in the corresponding sequences to identify subjects represented in the images and generate classifications of the identified subjects. The system processes the classifications of identified subjects for sets of images in the sequences of images to detect takes of inventory items by identified subjects and puts of inventory items on shelves by identified subjects.
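As an illustration of the last step of the abstract, the following minimal sketch shows one way per-image classifications of a subject could be turned into take and put events. It is not the method of the patent: the `FrameClassification` type, the `detect_events` function, and the rule "a change in holding state while the hand is near a shelf" are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FrameClassification:
    """Hypothetical per-image classification of one identified subject."""
    holding: bool      # is the subject holding an inventory item?
    near_shelf: bool   # is the subject's hand near a shelf?


def detect_events(frames: List[FrameClassification]) -> List[Tuple[int, str]]:
    """Return (frame index, "take" | "put") pairs for one subject.

    Assumed rule: a take is a change from not holding to holding, and a put
    is the reverse, counted only if the hand is near a shelf in either of
    the two consecutive frames.
    """
    events: List[Tuple[int, str]] = []
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        if not (prev.near_shelf or cur.near_shelf):
            continue  # holding changes away from a shelf are ignored here
        if not prev.holding and cur.holding:
            events.append((i, "take"))
        elif prev.holding and not cur.holding:
            events.append((i, "put"))
    return events
```

A real system would likely smooth noisy classifications over several images before applying a rule of this kind; the sketch compares only adjacent frames to keep the idea visible.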
209 Citations
27 Claims
1. A system for tracking puts and takes of inventory items by subjects in an area of real space, comprising:
- a plurality of cameras, cameras in the plurality of cameras producing respective sequences of images of corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras;
- a processing system coupled to the plurality of cameras, the processing system including a plurality of image recognition engines, receiving corresponding sequences of images from the plurality of cameras, image recognition engines in the plurality of image recognition engines processing the images in the corresponding sequences to identify subjects represented in the images; and
- logic to process sets of images in the sequences of images that include the identified subjects to detect takes of inventory items by identified subjects and puts of inventory items on shelves by identified subjects, wherein the logic to process sets of images includes:
  - for identified subjects, logic to process images to generate classifications of the images of the identified subjects, the classifications including whether the identified subject is holding an inventory item, a first nearness classification indicating a location of a hand of the identified subject relative to a shelf, a second nearness classification indicating a location of a hand of the identified subject relative to the identified subject.
Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
11. A method for tracking puts and takes of inventory items by subjects in an area of real space, the method including:
- using a plurality of cameras to produce respective sequences of images of corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras;
- receiving corresponding sequences of images from the plurality of cameras, processing the images in the corresponding sequences using image recognition engines in a plurality of image recognition engines and identifying subjects represented in the images, wherein the plurality of image recognition engines are part of a processing system coupled to the plurality of cameras; and
- processing sets of images in the sequences of images that include the identified subjects to detect takes of inventory items by identified subjects and puts of inventory items on shelves by identified subjects, wherein the processing sets of images includes:
  - for identified subjects, generating classifications of the images of the identified subjects, the classifications including whether the identified subject is holding an inventory item, a first nearness classification indicating a location of a hand of the identified subject relative to a shelf, a second nearness classification indicating a location of a hand of the identified subject relative to a body of the identified subject, a third nearness classification indicating a location of a hand of the identified subject relative to a basket associated with an identified subject, and an identifier of a likely inventory item.
Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19)
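The five classification outputs recited in claim 11 can be pictured as one record per image of an identified subject. The sketch below is illustrative only: the `HandClassification` type, its field names, the boolean encoding of the nearness classifications, and the majority-vote `summarize` helper are assumptions for this example, and the claim does not say how classifications from a set of images are combined.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence


@dataclass(frozen=True)
class HandClassification:
    """Hypothetical record of the classifications listed in claim 11."""
    holding: bool               # holding an inventory item
    near_shelf: bool            # first nearness classification (hand vs. shelf)
    near_body: bool             # second nearness classification (hand vs. body)
    near_basket: bool           # third nearness classification (hand vs. basket)
    likely_item: Optional[str]  # identifier of a likely inventory item, if any


def _majority(flags: Iterable[bool]) -> bool:
    """True when strictly more than half of the flags are True."""
    values = list(flags)
    return sum(values) * 2 > len(values)


def summarize(window: Sequence[HandClassification]) -> HandClassification:
    """Combine the classifications from a set of images by majority vote."""
    if not window:
        raise ValueError("window must contain at least one classification")
    items = Counter(c.likely_item for c in window if c.likely_item is not None)
    likely = items.most_common(1)[0][0] if items else None
    return HandClassification(
        holding=_majority(c.holding for c in window),
        near_shelf=_majority(c.near_shelf for c in window),
        near_body=_majority(c.near_body for c in window),
        near_basket=_majority(c.near_basket for c in window),
        likely_item=likely,
    )
```

Majority voting is only one possible way to reduce per-image noise; a weighted or learned combination would fit the same record shape.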
20. A system for tracking puts and takes of inventory items by subjects in an area of real space, comprising:
- a plurality of cameras, cameras in the plurality of cameras producing respective sequences of images of corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras;
- a processing system coupled to the plurality of cameras, the processing system including:
  - first image recognition engines, receiving the sequences of images from the plurality of cameras, which process images to generate first data sets that identify subjects and locations of the identified subjects in the real space;
  - logic to process the first data sets to specify bounding boxes which include images of hands of identified subjects in images in the sequences of images;
  - second image recognition engines, receiving the sequences of images from the plurality of cameras, which process the specified bounding boxes in the images to generate a classification of hands of the identified subjects, the classification including whether the identified subject is holding an inventory item, a first nearness classification indicating a location of a hand of the identified subject relative to a shelf, a second nearness classification indicating a location of a hand of the identified subject relative to the identified subject, and an identifier of a likely inventory item; and
  - logic to process the classifications of hands for sets of images in the sequences of images of identified subjects to detect takes of inventory items by identified subjects and puts of inventory items on shelves by identified subjects; and
- logic to generate a log data structure including a list of inventory items for each identified subject.
Dependent Claims (21, 22, 23, 24, 25, 26, 27)
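Two elements of claim 20 lend themselves to a short sketch: specifying a bounding box that includes a subject's hand, and keeping a log data structure with a list of inventory items per identified subject. Everything below is hypothetical: the claim does not give box dimensions, a pixel coordinate convention, or a log layout, so `hand_bounding_box`, `ItemLog`, and their parameters are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def hand_bounding_box(cx: int, cy: int, half_size: int,
                      width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) for a square box centred on a hand.

    (cx, cy) is an assumed pixel location of the hand; the box is clipped
    to an image of the given width and height.
    """
    left = max(0, cx - half_size)
    top = max(0, cy - half_size)
    right = min(width, cx + half_size)
    bottom = min(height, cy + half_size)
    return left, top, right, bottom


class ItemLog:
    """Hypothetical log: one list of inventory items per identified subject."""

    def __init__(self) -> None:
        self._items: Dict[str, List[str]] = defaultdict(list)

    def record(self, subject_id: str, event: str, item_id: str) -> None:
        """Apply a detected take or put to the subject's item list."""
        items = self._items[subject_id]
        if event == "take":
            items.append(item_id)
        elif event == "put":
            if item_id in items:  # ignore puts of items never logged as taken
                items.remove(item_id)
        else:
            raise ValueError(f"unknown event: {event!r}")

    def items_for(self, subject_id: str) -> List[str]:
        """Return a copy of the subject's current item list."""
        return list(self._items.get(subject_id, []))
```

For example, recording two takes and one put for the same subject leaves a single item in that subject's list, which is the state a checkout step would read.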
Specification