Method and system for optimizing the observation and annotation of complex human behavior from video sources
Abstract
The present invention is a method and system for optimizing the observation and annotation of complex human behavior from video sources by automatically detecting predefined events based on the behavior of people in a first video stream from a first means for capturing images in a physical space, accessing a synchronized second video stream from a second means for capturing images that are positioned to observe the people more closely using the timestamps associated with the detected events from the first video stream, and enabling an annotator to annotate each of the events with more labels using a tool. The present invention captures a plurality of input images of the persons by a plurality of means for capturing images and processes the plurality of input images in order to detect the predefined events based on the behavior in an exemplary embodiment. The processes are based on a novel usage of a plurality of computer vision technologies to analyze the human behavior from the plurality of input images. The physical space may be a retail space, and the people may be customers in the retail space.
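
The timestamp-based lookup described in the abstract can be illustrated with a minimal sketch. It assumes that both video streams are stamped against one shared clock and that the second stream has a known start time and a constant frame rate. The names `Event`, `substream_frames`, `stream_start_s`, `fps`, and `pad_s` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Event:
    """One automatically detected event from the first video stream."""
    label: str      # hypothetical event name, e.g. "shelf_interaction"
    start_s: float  # event start, in seconds on the shared clock
    end_s: float    # event end, in seconds on the shared clock


def substream_frames(event: Event, stream_start_s: float, fps: float,
                     pad_s: float = 2.0) -> Optional[Tuple[int, int]]:
    """Return (first_frame, last_frame) of the second stream covering the event.

    Assumption: both streams share one clock, so a timestamp taken from the
    first stream converts directly to a frame index in the second stream.
    Returns None if the padded event ends before the second stream starts.
    """
    if event.end_s + pad_s < stream_start_s:
        return None
    first = max(0, int((event.start_s - pad_s - stream_start_s) * fps))
    last = max(first, int((event.end_s + pad_s - stream_start_s) * fps))
    return first, last


event = Event("shelf_interaction", start_s=105.0, end_s=112.5)
frame_range = substream_frames(event, stream_start_s=100.0, fps=30.0)
early = substream_frames(Event("entry", 1.0, 3.0), stream_start_s=100.0, fps=30.0)
```

The padding window is a design choice of this sketch: it gives the annotator a little context before and after each detected event.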
20 Claims
1. A method for efficiently annotating behavior and characteristics of a person or a plurality of persons in a first video stream and a second video stream in a physical space, comprising the following steps of:
a) capturing the first video stream of the person or the plurality of persons by a first means for capturing images,
b) processing the first video stream in order to track and detect predefined behavior and demographics of the person or the plurality of persons in a field of view of the first means for capturing images automatically using at least a means for control and processing that executes computer vision algorithms on the first video stream,
c) providing demographic segmentation of the person or the plurality of persons to create a plurality of demographic groups,
d) analyzing the behavior based on spatio-temporal primitives and a model for interaction levels of the person or the plurality of persons, wherein the behavior of each demographic group is analyzed to obtain segment-specific insights,
e) generating time-stamped lists of events based on the automatically detected predefined behavior and demographics, using a time server,
f) using the time-stamped lists of events and timestamps of events in the time-stamped lists of events to access at least a corresponding sub-stream for the events in the second video stream from a second means for capturing images,
g) manually annotating each of the events with a plurality of labels for a synchronized annotation using a user interface, and
h) utilizing the annotation for quantitative behavior analysis about interaction of the person or the plurality of persons with a plurality of commercial products in the physical space,
wherein the first video stream is synchronized with the second video stream, and
whereby the user interface allows users to mark time-based annotations describing complex behavioral issues, including expressions of the person or the plurality of persons.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10)
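
Steps c), g), and h) of claim 1 can be sketched as a small data model plus one aggregation. This is only an illustration under stated assumptions: the record type `AnnotatedEvent`, the function `label_rates_by_group`, the group keys, and the label strings are all hypothetical, and the patent does not prescribe this particular statistic.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AnnotatedEvent:
    timestamp_s: float        # event time on the shared clock
    demographic_group: str    # hypothetical group key from automatic segmentation
    detected_behavior: str    # behavior detected automatically in the first stream
    labels: List[str] = field(default_factory=list)  # labels added by a human annotator


def label_rates_by_group(events: List[AnnotatedEvent]) -> Dict[str, Dict[str, float]]:
    """For each demographic group, return the fraction of events carrying each label."""
    totals: Counter = Counter()
    label_counts = defaultdict(Counter)
    for ev in events:
        totals[ev.demographic_group] += 1
        for label in set(ev.labels):  # count each label at most once per event
            label_counts[ev.demographic_group][label] += 1
    return {
        group: {label: n / total for label, n in label_counts[group].items()}
        for group, total in totals.items()
    }


events = [
    AnnotatedEvent(12.0, "group_a", "pickup", ["smile", "reads_label"]),
    AnnotatedEvent(40.5, "group_a", "pickup", ["frown"]),
    AnnotatedEvent(71.0, "group_b", "browse", ["smile"]),
]
rates = label_rates_by_group(events)
```

Keeping the automatically detected behavior and the manual labels in the same record is one way to preserve the link between the two streams that the claim calls a synchronized annotation.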
11. A system for efficiently annotating behavior and characteristics of a person or a plurality of persons in a first video stream and a second video stream in a physical space, comprising:
a) at least a first means for capturing images that captures the first video stream of the person or the plurality of persons,
b) at least a first means for control and processing that executes computer vision algorithms on the first video stream, performing the following steps of:
  processing the first video stream in order to track and detect predefined behavior and demographics of the person or the plurality of persons in a field of view of the first means for capturing images automatically,
  providing demographic segmentation of the person or the plurality of persons to create a plurality of demographic groups,
  analyzing the behavior based on spatio-temporal primitives and a model for interaction levels of the person or the plurality of persons, wherein the behavior of each demographic group is analyzed to obtain segment-specific insights, and
  generating time-stamped lists of events based on the automatically detected predefined behavior and demographics, using a time server, and
c) an annotation tool for using the time-stamped lists of events and timestamps of events in the time-stamped lists of events to access at least a corresponding sub-stream for the events in the second video stream from a second means for capturing images, and for annotating each of the events with a plurality of labels for a synchronized annotation including a user interface for the annotation,
wherein the first video stream is synchronized with the second video stream,
wherein the annotation is utilized for quantitative behavior analysis about interaction of the person or the plurality of persons with a plurality of commercial products in the physical space, and
whereby the user interface allows users to mark time-based annotations describing complex behavioral issues, including expressions of the person or the plurality of persons.
(Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20)
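
The generation of a time-stamped list of events in element b) of claim 11 might look like the following sketch. It assumes the computer vision stage emits one `(timestamp_s, behavior)` pair per processed frame, with timestamps taken from the shared time server, and it merges consecutive frames showing the same behavior into a single event. The function name `frames_to_events`, the tuple layout, and the behavior names are hypothetical; the patent does not specify this grouping rule.

```python
from itertools import groupby
from typing import Dict, List, Optional, Tuple

FrameDetection = Tuple[float, Optional[str]]  # (timestamp_s, behavior or None)


def frames_to_events(frame_detections: List[FrameDetection]) -> List[Dict]:
    """Collapse per-frame detections into a time-stamped list of events.

    Consecutive frames with the same behavior become one event. Frames whose
    behavior is None (nothing detected) are skipped and split events apart.
    """
    events = []
    for behavior, run in groupby(frame_detections, key=lambda d: d[1]):
        if behavior is None:
            continue
        frames = list(run)
        events.append({
            "behavior": behavior,
            "start_s": frames[0][0],   # timestamp of the first frame in the run
            "end_s": frames[-1][0],    # timestamp of the last frame in the run
        })
    return events


detections = [
    (10.0, None), (10.5, "approach"), (11.0, "approach"),
    (11.5, "pickup"), (12.0, "pickup"), (12.5, None), (13.0, "approach"),
]
event_list = frames_to_events(detections)
```

Each resulting entry carries the start and end timestamps that an annotation tool would need to locate the matching sub-stream in the second video stream.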
Specification