Method for augmenting transaction data with visually extracted demographics of people using computer vision
First Claim
1. A method for combining automatically detected demographic, group, and behavior features of people in a retail transaction area with transaction data, comprising the following steps of:
a) acquiring facial images of the people from first input images captured by at least a first means for capturing images, including a visual sensing device,
b) determining demographic categories of the facial images to generate a demographic data, using at least a demographic feature extractor,
c) acquiring person images from second input images captured by at least a second means for capturing images, including a visual sensing device, near the transaction area,
d) determining shopping group membership of the people to generate group data from the second input images,
e) analyzing the movement of the people for behavior analysis by determining checkout behaviors of the people based on analysis of the second input images by recognizing interactions of the people with merchandise in the transaction area,
f) associating transaction data with the demographics data, the behavior analysis, and the group data, and
wherein the analysis of the second input images comprises body orientation estimation of the people, proximity calculation between the people and checkout shelves, and foreground object analysis, in the transaction area,
wherein the first means for capturing images comprises a face view camera that captures the facial images of the people waiting in the checkout queue for face detection,
wherein the second means for capturing images comprises a top-down view camera that captures the top-down view and person images of the checkout queue for person detection,
wherein the face view camera and the top-down view camera are placed and oriented so that the image positions of the facial images and the person images are translated into a common world-coordinate system, and
wherein the steps are performed in a control and processing system that is connected to the means for capturing images.
Abstract
The present invention is a system and framework for augmenting any retail transaction system with information about the customers involved. The invention provides a method to combine the transaction data records of a customer or a group of customers with automatically extracted demographic features (e.g., gender, age, and ethnicity), shopping group information, and behavioral information, using computer vision algorithms. First, the system detects faces in the face view, tracks them individually, and estimates the pose of each tracked face in order to normalize it. The facial images are then processed by the demographics classification module to determine and record the demographics feature vector. The system also detects and tracks customers and analyzes their dynamic behavior so that their shopping group membership and checkout behavior can be recognized. The instances of faces and the instances of bodies are then matched and combined. Finally, the transaction data and the demographics, group, and checkout behavior data that belong to the same person or the same group of people are combined.
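The final combining step can be illustrated with a minimal sketch. The abstract does not say how a transaction record is linked to a tracked person, so the rule used here (same checkout lane, transaction time inside the interval the person spent at the register, plus a few seconds of slack) is an assumption, as are the `Observation` and `Transaction` schemas and the `augment` function name.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Observation:
    """Vision-derived record for one tracked person (hypothetical schema)."""
    track_id: int
    lane: int                      # checkout lane where the person was tracked
    start: float                   # time (s) the person reached the register
    end: float                     # time (s) the person left the register
    demographics: Dict[str, str]   # e.g. {"gender": "F", "age_group": "25-34"}
    group_size: int                # size of the detected shopping group


@dataclass
class Transaction:
    """Point-of-sale record (hypothetical schema)."""
    txn_id: str
    lane: int
    timestamp: float
    total: float


def augment(txn: Transaction, observations: List[Observation],
            slack: float = 5.0) -> Optional[dict]:
    """Merge a transaction with the best-matching observation, or return None.

    Assumed rule: same lane, and the transaction time falls inside the
    observation interval widened by `slack` seconds on each side.
    """
    candidates = [
        o for o in observations
        if o.lane == txn.lane and o.start - slack <= txn.timestamp <= o.end + slack
    ]
    if not candidates:
        return None
    # Prefer the observation whose interval midpoint is closest in time.
    best = min(candidates, key=lambda o: abs((o.start + o.end) / 2 - txn.timestamp))
    record = {
        "txn_id": txn.txn_id,
        "total": txn.total,
        "track_id": best.track_id,
        "group_size": best.group_size,
    }
    record.update(best.demographics)
    return record
```

A transaction with no observation in its lane and time window is returned as `None` rather than being attached to the nearest unrelated person, which keeps unmatched sales out of the demographic statistics.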
158 Citations
18 Claims
1. A method for combining automatically detected demographic, group, and behavior features of people in a retail transaction area with transaction data, comprising the following steps of:
a) acquiring facial images of the people from first input images captured by at least a first means for capturing images, including a visual sensing device,
b) determining demographic categories of the facial images to generate a demographic data, using at least a demographic feature extractor,
c) acquiring person images from second input images captured by at least a second means for capturing images, including a visual sensing device, near the transaction area,
d) determining shopping group membership of the people to generate group data from the second input images,
e) analyzing the movement of the people for behavior analysis by determining checkout behaviors of the people based on analysis of the second input images by recognizing interactions of the people with merchandise in the transaction area,
f) associating transaction data with the demographics data, the behavior analysis, and the group data, and
wherein the analysis of the second input images comprises body orientation estimation of the people, proximity calculation between the people and checkout shelves, and foreground object analysis, in the transaction area,
wherein the first means for capturing images comprises a face view camera that captures the facial images of the people waiting in the checkout queue for face detection,
wherein the second means for capturing images comprises a top-down view camera that captures the top-down view and person images of the checkout queue for person detection,
wherein the face view camera and the top-down view camera are placed and oriented so that the image positions of the facial images and the person images are translated into a common world-coordinate system, and
wherein the steps are performed in a control and processing system that is connected to the means for capturing images.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
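The common world-coordinate system recited in claim 1 is what lets a face seen by the face view camera be paired with a person seen by the top-down view camera. The sketch below illustrates one way this could work; it is not taken from the specification. It assumes each camera has a pre-calibrated 3x3 planar homography that maps an image position to floor coordinates in metres (for the face view camera, to the floor point under the head), and it pairs detections by greedy nearest-neighbour distance. The names `to_world`, `match_faces_to_bodies`, and `max_dist` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Homography = List[List[float]]  # 3x3 matrix, row-major (assumed pre-calibrated)


def to_world(H: Homography, x: float, y: float) -> Point:
    """Map image point (x, y) to world coordinates with homography H."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def match_faces_to_bodies(faces: Dict[str, Point], bodies: Dict[str, Point],
                          H_face: Homography, H_top: Homography,
                          max_dist: float = 0.75) -> Dict[str, str]:
    """Greedily pair face IDs with body IDs by distance in world coordinates.

    `max_dist` (metres) is an assumed gating threshold; pairs farther apart
    than this are left unmatched.
    """
    face_w = {fid: to_world(H_face, *p) for fid, p in faces.items()}
    body_w = {bid: to_world(H_top, *p) for bid, p in bodies.items()}
    # All candidate pairs, closest first.
    pairs = sorted(
        (math.dist(fp, bp), fid, bid)
        for fid, fp in face_w.items()
        for bid, bp in body_w.items()
    )
    matches: Dict[str, str] = {}
    used_bodies = set()
    for dist, fid, bid in pairs:
        if dist > max_dist:
            break  # remaining pairs are even farther apart
        if fid in matches or bid in used_bodies:
            continue  # each face and each body is used at most once
        matches[fid] = bid
        used_bodies.add(bid)
    return matches
```

Greedy matching is the simplest choice and is adequate when people in the queue are well separated; an optimal assignment method could be substituted where detections are crowded.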
10. An apparatus for combining automatically detected demographic, group, and behavior features of people in a retail transaction area with transaction data, comprising:
a) means for acquiring facial images of the people from first input images captured by at least a first means for capturing images, including a visual sensing device,
b) means for determining demographic categories of the facial images to generate a demographic data, wherein the means for determining demographic categories includes at least a demographic feature extractor,
c) means for acquiring person images from second input images captured by at least a second means for capturing images, including a visual sensing device, near the transaction area,
d) means for determining shopping group membership of the people to generate group data from the second input images,
e) means for analyzing the movement of the people for behavior analysis by determining checkout behaviors of the people based on analysis of the second input images by recognizing interactions of the people with merchandise in the transaction area, and
f) means for associating transaction data with the demographics data, the behavior analysis, and the group data,
wherein the analysis of the second input images comprises body orientation estimation of the people, proximity calculation between the people and checkout shelves, and foreground object analysis, in the transaction area,
wherein the first means for capturing images comprises a face view camera that captures the facial images of the people waiting in the checkout queue for face detection,
wherein the second means for capturing images comprises a top-down view camera that captures the top-down view and person images of the checkout queue for person detection,
wherein the face view camera and the top-down view camera are placed and oriented so that the image positions of the facial images and the person images are translated into a common world-coordinate system, and
wherein the apparatus further comprises a control and processing system that is connected to the means for capturing images.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
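Both independent claims name body orientation estimation and proximity calculation between people and checkout shelves as inputs to checkout behavior analysis. A minimal sketch of how those two cues might be combined into a single test follows. It assumes the person's position and heading are already available in world coordinates; the reach distance, the 60-degree facing cone, and the function names are illustrative assumptions, and the foreground object analysis named in the claims is not covered here.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def is_facing(person_xy: Point, heading_rad: float, shelf_xy: Point,
              cone_rad: float = math.radians(60)) -> bool:
    """True if the shelf lies within a cone centred on the person's heading."""
    bearing = math.atan2(shelf_xy[1] - person_xy[1], shelf_xy[0] - person_xy[0])
    # Wrap the angular difference into [-pi, pi) before comparing.
    diff = (bearing - heading_rad + math.pi) % (2 * math.pi) - math.pi
    return abs(diff) <= cone_rad / 2


def shelf_interaction_candidate(person_xy: Point, heading_rad: float,
                                shelf_xy: Point, reach: float = 0.8) -> bool:
    """Flag a possible merchandise interaction: close enough AND facing the shelf."""
    close_enough = math.dist(person_xy, shelf_xy) <= reach
    return close_enough and is_facing(person_xy, heading_rad, shelf_xy)
```

Requiring both cues reduces false positives from customers who merely walk past a checkout shelf or stand next to it while facing the register.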
Specification