Predicting inventory events using semantic diffing
Abstract
Systems and techniques are provided for tracking puts and takes of inventory items by subjects in an area of real space. A plurality of cameras with overlapping fields of view produce respective sequences of images of corresponding fields of view in the real space. In one embodiment, the system includes first image processors, including subject image recognition engines, receiving corresponding sequences of images from the plurality of cameras. The first image processors process images to identify subjects represented in the images in the corresponding sequences of images. The system includes second image processors, including background image recognition engines, receiving corresponding sequences of images from the plurality of cameras. The second image processors mask the identified subjects to generate masked images. Following this, the second image processors process the masked images to identify and classify background changes represented in the images in the corresponding sequences of images.
230 Citations
24 Claims
1. A system for tracking changes in an area of real space, comprising:
a plurality of cameras, cameras in the plurality of cameras producing respective sequences of images of corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras;
a processing system coupled to the plurality of cameras, the processing system including;
first image processors, including subject image recognition engines, receiving corresponding sequences of images from the plurality of cameras, which process images to identify subjects represented in the images in the corresponding sequences of images;
second image processors, including background image recognition engines, receiving corresponding sequences of images from the plurality of cameras, which mask the identified subjects to generate masked images, process the masked images to identify and classify background changes represented in the images in the corresponding sequences of images, wherein the second image processors include a background image store to store background images for corresponding sequences of images; and
mask logic to process images in the sequences of images to replace foreground image data representing the identified subjects with background image data from the background images for the corresponding sequences of images to provide the masked images, wherein the mask logic combines sets of N masked images in the sequences of images to generate sequences of factored images for each camera, and the second image processors identify and classify background changes by processing the sequence of factored images.
Dependent claims: 2, 3, 9
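The mask logic recited in claim 1 can be illustrated with a minimal sketch. It assumes grayscale images held as nested lists, a set of pixel coordinates already attributed to identified subjects, and a per-pixel mean as the way sets of N masked images are combined. The claim does not specify the combining operation, so the mean is an assumption, and the names `mask_subjects` and `factor_images` are hypothetical.

```python
from typing import List, Set, Tuple

Image = List[List[float]]   # grayscale image as rows of pixel values
Pixel = Tuple[int, int]     # (row, column)


def mask_subjects(frame: Image, background: Image,
                  subject_pixels: Set[Pixel]) -> Image:
    """Replace foreground pixels covered by identified subjects with the
    corresponding pixels from the stored background image."""
    return [
        [background[r][c] if (r, c) in subject_pixels else frame[r][c]
         for c in range(len(frame[0]))]
        for r in range(len(frame))
    ]


def factor_images(masked: List[Image], n: int) -> List[Image]:
    """Combine each consecutive set of n masked images into one factored image.

    Assumption: "combine" is taken to be a per-pixel mean; the claim leaves
    the operation open. Trailing images that do not fill a set are dropped.
    """
    factored = []
    for start in range(0, len(masked) - n + 1, n):
        group = masked[start:start + n]
        rows, cols = len(group[0]), len(group[0][0])
        factored.append([
            [sum(img[r][c] for img in group) / n for c in range(cols)]
            for r in range(rows)
        ])
    return factored
```

In this sketch a background image recognition engine would consume the output of `factor_images` rather than individual frames, which matches the claim language that background changes are identified by processing the sequence of factored images.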
4. A system for tracking changes in an area of real space, comprising:
a plurality of cameras, cameras in the plurality of cameras producing respective sequences of images of corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras;
a processing system coupled to the plurality of cameras, the processing system including;
first image processors, including subject image recognition engines, receiving corresponding sequences of images from the plurality of cameras, which process images to identify subjects represented in the images in the corresponding sequences of images;
second image processors, including background image recognition engines, receiving corresponding sequences of images from the plurality of cameras, which mask the identified subjects to generate masked images, process the masked images to identify and classify background changes represented in the images in the corresponding sequences of images, wherein the second image processors include logic to produce change data structures for the corresponding sequences of images, the change data structures including coordinates in the masked images of identified background changes, identifiers of an inventory item subject of the identified background changes and classifications of the identified background changes; and
coordination logic to process change data structures from sets of cameras having overlapping fields of view to locate the identified background changes in real space.
Dependent claims: 5, 6
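The change data structures and coordination logic of claim 4 can be sketched as follows. The record fields mirror the claim (image coordinates, inventory item identifier, classification). Everything else is an assumption for illustration: each camera is given a hypothetical 3x3 homography onto a common shelf plane, and reports from overlapping cameras are merged by averaging their projected positions. The claim does not say how changes are located in real space.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Matrix3 = Tuple[Tuple[float, float, float], ...]   # 3x3 homography (assumed)


@dataclass(frozen=True)
class ChangeRecord:
    camera_id: str
    x: float              # column coordinate in the masked image
    y: float              # row coordinate in the masked image
    item_id: str          # inventory item identifier (hypothetical SKU string)
    classification: str   # classification of the change, e.g. "put" or "take"


def to_plane(h: Matrix3, x: float, y: float) -> Tuple[float, float]:
    """Project image point (x, y) onto the shared plane with homography h."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


def locate_changes(
    records: List[ChangeRecord],
    homographies: Dict[str, Matrix3],
) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Group records by (item, classification) and average the projected
    positions reported by the cameras that saw the change."""
    groups: Dict[Tuple[str, str], List[Tuple[float, float]]] = {}
    for rec in records:
        point = to_plane(homographies[rec.camera_id], rec.x, rec.y)
        groups.setdefault((rec.item_id, rec.classification), []).append(point)
    return {
        key: (sum(p[0] for p in pts) / len(pts),
              sum(p[1] for p in pts) / len(pts))
        for key, pts in groups.items()
    }
```

A full system would more likely triangulate in three dimensions from calibrated cameras; the planar projection here only shows how records from overlapping fields of view can be reconciled into a single location.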
7. A system for tracking changes in an area of real space, comprising:
a plurality of cameras, cameras in the plurality of cameras producing respective sequences of images of corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras;
a processing system coupled to the plurality of cameras, the processing system including;
first image processors, including subject image recognition engines, receiving corresponding sequences of images from the plurality of cameras, which process images to identify subjects represented in the images in the corresponding sequences of images;
second image processors, including background image recognition engines, receiving corresponding sequences of images from the plurality of cameras, which mask the identified subjects to generate masked images, process the masked images to identify and classify background changes represented in the images in the corresponding sequences of images; and
logic to associate background changes with identified subjects, and to make detections of takes of inventory items by the identified subjects and of puts of inventory items on inventory display structures by the identified subjects.
8. A system for tracking changes in an area of real space, comprising:
a plurality of cameras, cameras in the plurality of cameras producing respective sequences of images of corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras;
a processing system coupled to the plurality of cameras, the processing system including;
first image processors, including subject image recognition engines, receiving corresponding sequences of images from the plurality of cameras, which process images to identify subjects represented in the images in the corresponding sequences of images, wherein the first image processors identify locations of hands of identified subjects;
second image processors, including background image recognition engines, receiving corresponding sequences of images from the plurality of cameras, which mask the identified subjects to generate masked images, process the masked images to identify and classify background changes represented in the images in the corresponding sequences of images; and
logic to associate background changes with identified subjects by comparing the locations of the changes with the locations of hands of identified subjects, and to make detections of takes of inventory items by the identified subjects and of puts of inventory items on inventory display structures by the identified subjects.
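Claim 8 associates background changes with subjects by comparing change locations with hand locations. One possible reading, sketched below, is a nearest-hand search with a distance cutoff. The cutoff, the use of Euclidean distance in three dimensions, and the name `associate_change` are assumptions; the claim states only that the locations are compared.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float, float]   # location in real space (assumed 3D)


def associate_change(
    change_location: Point,
    hands: Dict[str, List[Point]],
    max_distance: float,
) -> Optional[str]:
    """Return the id of the subject whose hand is closest to the change,
    or None when no hand lies within max_distance of it."""
    best_subject: Optional[str] = None
    best_distance = max_distance
    for subject_id, hand_points in hands.items():
        for hand in hand_points:
            distance = math.dist(change_location, hand)
            if distance <= best_distance:
                best_subject, best_distance = subject_id, distance
    return best_subject
```

Once a change is attributed to a subject, its classification would determine whether it is recorded as a take by that subject or as a put on the inventory display structure.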
10. A method for tracking put and takes of inventory items by subjects in an area of real space including inventory display structures, comprising:
using a plurality of cameras disposed above the inventory display structures to produce respective sequences of images of inventory display structures in corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras; and
detecting puts and takes of inventory items by identifying semantically significant changes in the sequences of images relating to inventory items on inventory display structures and associating the semantically significant changes with subjects represented in the sequences of images, wherein said detecting puts and takes includes
using first image processors, including subject image recognition engines, to process images to identify subjects represented in the images in the corresponding sequences of images;
using second image processors, including background image recognition engines, to mask identified subjects in images in the sequences of images, to generate masked images, to process the masked images to identify and to classify background changes represented in the images in the corresponding sequences of images; and
associating identified background changes with identified subjects.
Dependent claims: 11, 12, 13, 19
14. A method for tracking put and takes of inventory items by subjects in an area of real space including inventory display structures, comprising:
using a plurality of cameras disposed above the inventory display structures to produce respective sequences of images of inventory display structures in corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras; and
detecting puts and takes of inventory items by identifying semantically significant changes in the sequences of images relating to inventory items on inventory display structures and associating the semantically significant changes with subjects represented in the sequences of images, wherein said detecting puts and takes includes
using first image processors, including subject image recognition engines, to process images to identify subjects represented in the images in the corresponding sequences of images;
using second image processors, including background image recognition engines, to mask identified subjects in images in the sequences of images, to generate masked images, to process the masked images to identify and to classify background changes represented in the images in the corresponding sequences of images, wherein using the second image processors includes producing change data structures for the corresponding sequences of images, the change data structures including coordinates in the masked images of identified background changes, identifiers of an inventory item subject of the identified background changes and classifications of the identified background changes; and
processing change data structures from sets of cameras having overlapping fields of view to locate the identified background changes in real space.
Dependent claims: 15, 16
17. A method for tracking put and takes of inventory items by subjects in an area of real space including inventory display structures, comprising:
using a plurality of cameras disposed above the inventory display structures to produce respective sequences of images of inventory display structures in corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras; and
detecting puts and takes of inventory items by identifying semantically significant changes in the sequences of images relating to inventory items on inventory display structures and associating the semantically significant changes with subjects represented in the sequences of images, wherein said detecting puts and takes includes
using first image processors, including subject image recognition engines, to process images to identify subjects represented in the images in the corresponding sequences of images;
using second image processors, including background image recognition engines, to mask identified subjects in images in the sequences of images, to generate masked images, to process the masked images to identify and to classify background changes represented in the images in the corresponding sequences of images, including associating background changes with identified subjects, and making detections of takes of inventory items by the identified subjects and of puts of inventory items on inventory display structures by the identified subjects.
18. A method for tracking put and takes of inventory items by subjects in an area of real space including inventory display structures, comprising:
using a plurality of cameras disposed above the inventory display structures to produce respective sequences of images of inventory display structures in corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras; and
detecting puts and takes of inventory items by identifying semantically significant changes in the sequences of images relating to inventory items on inventory display structures and associating the semantically significant changes with subjects represented in the sequences of images, wherein said detecting puts and takes includes
using first image processors, including subject image recognition engines, to process images to identify subjects represented in the images in the corresponding sequences of images;
using second image processors, including background image recognition engines, to mask identified subjects in images in the sequences of images, to generate masked images, to process the masked images to identify and to classify background changes represented in the images in the corresponding sequences of images, wherein using the first image processors includes identifying locations of hands of identified subjects; and
including associating background changes with identified subjects by comparing the locations of the changes with the locations of hands of identified subjects, and making detections of takes of inventory items by the identified subjects and of puts of inventory items on inventory display structures by the identified subjects.
20. A computer program product, comprising:
a computer readable memory comprising a non-transitory data storage medium;
computer instructions stored in the non-transitory data storage medium executable by a computer to track multi-joint subjects in an area of real space by a process including;
using sequences of images from a plurality of cameras having corresponding fields of view in the real space, the field of view of each camera overlapping with the field of view of at least one other camera in the plurality of cameras;
using first image processors, including subject image recognition engines, to process images to identify subjects represented in the images in the corresponding sequences of images;
using second image processors, including background image recognition engines, to mask identified subjects in images in the sequences of images, to generate masked images, to process the masked images to identify and to classify background changes represented in the images in the corresponding sequences of images; and
associating background changes with identified subjects, and making detections of takes of inventory items by the identified subjects and of puts of inventory items on inventory display structures by the identified subjects.
Dependent claims: 21, 22, 23, 24
Specification