Method and apparatus for real-time gesture recognition
Abstract
A system and method are disclosed for providing a gesture recognition system for recognizing gestures made by a moving subject within an image and performing an operation based on the semantic meaning of the gesture. A subject, such as a human being, enters the viewing field of a camera connected to a computer and performs a gesture, such as flapping of the arms. The gesture is then examined by the system one image frame at a time. Positional data is derived from the input frames and compared to data representing gestures already known to the system. The comparisons are done in real-time and the system can be trained to better recognize known gestures or to recognize new gestures. A frame of the input image containing the subject is obtained after a background image model has been created. An input frame is used to derive a frame data set that contains particular coordinates of the subject at a given moment in time. This series of frame data sets is examined to determine whether it conveys a gesture that is known to the system. If the subject gesture is recognizable to the system, an operation based on the semantic meaning of the gesture can be performed by a computer.
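The background-model and coordinate-extraction steps summarized in the abstract can be illustrated with a minimal sketch. It assumes small grayscale frames stored as nested lists, a per-pixel average as the background model, and a bounding box as the "particular coordinates"; the function names, the threshold value, and the bounding-box choice are illustrative assumptions, not details taken from the disclosure.

```python
from statistics import mean


def build_background(frames):
    """Average several subject-free grayscale frames pixel by pixel."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[mean(f[r][c] for f in frames) for c in range(cols)]
            for r in range(rows)]


def subject_coordinates(frame, background, threshold=30):
    """Bounding box (top, left, bottom, right) of pixels that differ from
    the background by more than `threshold`, or None if nothing differs."""
    hits = [(r, c)
            for r, row in enumerate(frame)
            for c, value in enumerate(row)
            if abs(value - background[r][c]) > threshold]
    if not hits:
        return None
    hit_rows = [r for r, _ in hits]
    hit_cols = [c for _, c in hits]
    return (min(hit_rows), min(hit_cols), max(hit_rows), max(hit_cols))


# Two frames without a subject form the (hypothetical) background model.
empty_a = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
empty_b = [[12, 12, 12], [12, 12, 12], [12, 12, 12]]
background = build_background([empty_a, empty_b])

# One input frame with a bright "subject" in the right-hand column.
with_subject = [[10, 10, 10], [10, 10, 200], [10, 10, 200]]
frame_data_set = [subject_coordinates(with_subject, background)]
print(frame_data_set)
```

Each later input frame would append one more entry to `frame_data_set`, giving the series of frame data sets that is then examined for a known gesture.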
58 Claims
1. A computer-implemented method of storing and recognizing gestures made by a moving subject within an image, the method including:
a) building a background model by obtaining at least one frame of an image;
b) obtaining a data frame containing a subject performing part of a gesture;
c) analyzing the data frame thereby determining particular coordinates of the subject at a particular time while the subject is performing the gesture;
d) adding the particular coordinates to a frame data set;
e) examining the particular coordinates such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein a recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture;
f) repeating b through e for a plurality of data frames; and
g) determining whether the plurality of the data frames when examined in a particular sequence, conveys a subject gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer.
Dependent claims: 2 to 19.
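Steps e) through g) of claim 1 can be pictured with a short sketch: the accumulated frame data sets are compared, in sequence, against stored positional data for each known gesture, and a sufficiently close match triggers an operation. The template format, the index-based resampling, the mean Euclidean distance, and the `max_distance` cutoff are all assumptions chosen for illustration; the claim does not prescribe a particular comparison measure.

```python
import math


def resample(points, n):
    """Pick n points spaced evenly by index from a longer or shorter sequence."""
    if n == 1:
        return [points[0]]
    last = len(points) - 1
    return [points[round(i * last / (n - 1))] for i in range(n)]


def mean_distance(a, b):
    """Mean Euclidean distance between two equal-length point sequences."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def recognize(frame_data_sets, known_gestures, max_distance=1.0):
    """Return the name of the closest known gesture, or None if none is close."""
    best_name, best_score = None, float("inf")
    for name, template in known_gestures.items():
        candidate = resample(frame_data_sets, len(template))
        score = mean_distance(candidate, template)
        if score < best_score:
            best_name, best_score = name, score
    return best_name if best_score <= max_distance else None


# Hypothetical two-dimensional gestures and the operations tied to them.
known_gestures = {
    "raise": [(0, 0), (0, 5), (0, 10)],
    "swipe": [(0, 0), (5, 0), (10, 0)],
}
operations = {"raise": lambda: "volume up", "swipe": lambda: "next page"}

observed = [(0, 0), (0.2, 2.4), (0.1, 5.1), (0, 7.6), (0.1, 9.8)]
gesture = recognize(observed, known_gestures)
result = operations[gesture]() if gesture else None
print(gesture, result)
```

Adding a dimension to a gesture only changes the length of each point tuple, which is one way to read the claim's "at least one dimension" language.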
20. A computer-implemented system for storing and recognizing gestures made by a moving subject within an image, the system comprising:
an image modeller for creating a background model by examining a plurality of frames of an input image that does not contain a subject;
a frame capturer for obtaining a data frame containing the subject performing part of a subject gesture;
a frame analyzer for analyzing the data frame thereby determining relevant coordinates of the subject at a particular time while the subject is performing the subject gesture;
a data set creator for creating a frame data set by collecting the relevant coordinates;
a data set analyzer for examining the particular coordinates in the frame data set such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein each recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture; and
a gesture recognizer for determining whether a plurality of the data frames, wherein a data frame is represented by a frame data set, when examined in a particular sequence, conveys a gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer.
Dependent claims: 21 to 37.
38. A computer-implemented system for storing and recognizing gestures made by a moving subject within an image, the system comprising:
means for building a background model by obtaining at least one frame of an image;
means for obtaining a data frame containing a subject performing a part of a subject gesture;
means for analyzing the data frame thereby determining particular coordinates of the subject at a particular time while the subject is performing the subject gesture;
means for adding the particular coordinates to a frame data set;
means for examining the particular coordinates such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein a recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture; and
means for determining whether a plurality of data frames, where a data frame is represented by the frame data set, when examined in a particular sequence, conveys a subject gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer.
Dependent claims: 39 to 55.
56. A computer-implemented method of storing and recognizing gestures made by a moving subject within an image, the method including:
a) building a background model by obtaining at least one frame of an image including determining whether there is significant activity in the background image thereby restarting the process for building the background model;
b) obtaining a data frame containing a subject performing part of a gesture including separating the subject in the data frame into a plurality of identifiable parts wherein an identifiable part is assigned particular coordinates;
c) analyzing the data frame thereby determining particular coordinates of the subject at a particular time while the subject is performing the gesture;
d) adding the particular coordinates to a frame data set;
e) examining the particular coordinates such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein a recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture;
f) repeating b through e for a plurality of data frames;
g) determining whether the plurality of the data frames when examined in a particular sequence, conveys a subject gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer;
h) storing a plurality of samples of a subject gesture;
i) inputting a number of key points that fit in the subject gesture and a time value representing the time for the subject gesture to complete;
j) inputting a number of dimensions of the subject gesture;
k) determining locations of key points in a model representative of the subject gesture;
l) calculating a probability distribution for key points indicating the likelihood that a certain output will be observed; and
m) refining the model such that the plurality of samples of the subject gesture fit within the model.
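The training steps h) through m) of claim 56 might be sketched as follows. The sketch assumes each stored sample is a list of coordinate tuples, takes the key-point location as the per-dimension mean across samples, and uses an independent Gaussian per dimension as the probability distribution; that distribution, the `min_variance` floor, and all names are illustrative assumptions rather than the method the claim requires. It is a single-pass estimate, so the iterative refinement of step m) is not shown.

```python
import math
from statistics import mean, pvariance


def resample(points, n):
    """Pick n points spaced evenly by index from a sample."""
    if n == 1:
        return [points[0]]
    last = len(points) - 1
    return [points[round(i * last / (n - 1))] for i in range(n)]


def train_gesture_model(samples, num_key_points, num_dimensions,
                        duration, min_variance=1e-6):
    """Estimate a mean and variance per key point and per dimension."""
    aligned = [resample(s, num_key_points) for s in samples]
    key_points = []
    for k in range(num_key_points):
        means, variances = [], []
        for d in range(num_dimensions):
            values = [sample[k][d] for sample in aligned]
            means.append(mean(values))
            # Floor the variance so identical samples do not give zero.
            variances.append(max(pvariance(values), min_variance))
        key_points.append({"mean": tuple(means), "variance": tuple(variances)})
    return {"duration": duration, "key_points": key_points}


def log_likelihood(model, observation):
    """Gaussian log-likelihood of an observed point sequence under the model."""
    points = resample(observation, len(model["key_points"]))
    total = 0.0
    for key_point, point in zip(model["key_points"], points):
        for m, v, x in zip(key_point["mean"], key_point["variance"], point):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total


# Two hypothetical samples of the same two-dimensional gesture.
samples = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)],
    [(0.0, 0.2), (1.0, 1.2), (2.0, 0.2)],
]
model = train_gesture_model(samples, num_key_points=3, num_dimensions=2,
                            duration=1.5)
near = log_likelihood(model, [(0.0, 0.1), (1.0, 1.1), (2.0, 0.1)])
far = log_likelihood(model, [(0.0, 3.0), (1.0, 4.0), (2.0, 3.0)])
print(near > far)
```

An observation close to the stored samples scores a higher log-likelihood than a distant one, which is the property a recognizer would rely on when ranking candidate gestures.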
57. A computer-implemented system for storing and recognizing gestures made by a moving subject within an image, the system comprising:
an image modeller for creating a background model by examining a plurality of frames of an input image that does not contain a subject comprising an image initializer for initializing the input image that does not contain the subject;
a frame capturer for obtaining a data frame containing the subject performing part of a subject gesture comprising a frame separator for categorizing the subject represented in the data frame into a plurality of identifiable parts wherein an identifiable part is assigned particular coordinates;
a frame analyzer for analyzing the data frame thereby determining relevant coordinates of the subject at a particular time while the subject is performing the subject gesture;
a data set creator for creating a frame data set by collecting the relevant coordinates;
a data set analyzer for examining the particular coordinates in the frame data set such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein each recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture;
a gesture recognizer for determining whether a plurality of the data frames, wherein a data frame is represented by a frame data set, when examined in a particular sequence, conveys a gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer;
a sample receiver for storing a plurality of samples of a subject gesture;
a gesture data intaker for accepting a plurality of key points that fits in the subject gesture, a time value representing the time for the subject gesture to complete and a plurality of dimensions of the subject gesture;
a key point locator for determining locations of key points in a model representative of the subject gesture;
a probability evaluator for calculating a probability distribution at the key points indicating the likelihood of observing a particular output; and
a model refiner for refining the model such that the plurality of samples of the subject gesture fit within the model.
58. A computer-implemented system for storing and recognizing gestures made by a moving subject within an image, the system comprising:
means for building a background model by obtaining at least one frame of an image including means for determining whether there is significant activity in the background image thereby restarting the process for building the background model;
means for obtaining a data frame containing a subject performing a part of a subject gesture including means for separating the subject in the data frame into a plurality of identifiable parts wherein an identifiable part is assigned particular coordinates;
means for analyzing the data frame thereby determining particular coordinates of the subject at a particular time while the subject is performing the subject gesture;
means for adding the particular coordinates to a frame data set;
means for examining the particular coordinates such that the particular coordinates are compared to positional data making up a plurality of recognizable gestures, wherein a recognizable gesture is made up of at least one dimension such that the positional data describes dimensions of the recognized gesture;
means for determining whether a plurality of data frames, where a data frame is represented by the frame data set, when examined in a particular sequence, conveys a subject gesture by the subject that resembles a recognizable gesture, thereby causing an operation based on a predetermined meaning of the recognizable gesture be performed by a computer;
means for storing a plurality of samples of a subject gesture;
means for inputting a number of key points that fit in the gesture and a time value representing the time for the subject gesture to complete;
means for inputting a number of dimensions of the subject gesture;
means for determining locations of key points in a model representative of the subject gesture;
means for calculating a probability distribution for key points indicating the likelihood of observing a particular output; and
means for refining the model such that the plurality of samples of the subject gesture fit within the model.
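The "significant activity" check that restarts background-model building, recited in claims 56 and 58, could look something like the sketch below. It assumes flattened grayscale frames, compares each new frame with the previously accepted one, and treats a change in more than a set fraction of pixels as significant activity; the thresholds, the number of quiet frames required, and the function name are hypothetical.

```python
def build_background_with_restart(frames, needed=3, pixel_threshold=20,
                                  activity_fraction=0.1):
    """Average `needed` consecutive quiet frames into a background model.

    Returns (background, restarts). background is None if the frame
    source runs out before enough quiet frames have been collected.
    """
    accumulated = []
    restarts = 0
    for frame in frames:
        if accumulated:
            reference = accumulated[-1]
            changed = sum(1 for a, b in zip(frame, reference)
                          if abs(a - b) > pixel_threshold)
            if changed / len(frame) > activity_fraction:
                # Significant activity: discard progress and start over,
                # skipping the active frame itself.
                accumulated = []
                restarts += 1
                continue
        accumulated.append(frame)
        if len(accumulated) == needed:
            background = [sum(px) / needed for px in zip(*accumulated)]
            return background, restarts
    return None, restarts


quiet = [10] * 10
active = [200] * 5 + [10] * 5   # half of the pixels change sharply
background, restarts = build_background_with_restart(
    [quiet, quiet, active, quiet, quiet, quiet])
print(background, restarts)
```

Skipping the active frame, rather than keeping it as the first frame of the new attempt, avoids a second spurious restart when the scene returns to rest.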
Specification