Method and system for gesture category recognition and training using a feature vector
Abstract
A computer implemented method and system for gesture category recognition and training. Generally, a gesture is a hand or body initiated movement of a cursor directing device to outline a particular pattern in particular directions done in particular periods of time. The present invention allows a computer system to accept input data, originating from a user, in the form of gesture data made using the cursor directing device. In one embodiment, a mouse device is used, but the present invention is equally well suited for use with other cursor directing devices (e.g., a track ball, a finger pad, an electronic stylus, etc.). In one embodiment, gesture data is accepted by pressing a key on the keyboard and then moving the mouse (with mouse button pressed) to trace out the gesture. Mouse position information and time stamps are recorded. The present invention then determines a multi-dimensional feature vector based on the gesture data. The feature vector is then passed through a gesture category recognition engine that, in one implementation, uses a radial basis function neural network to associate the feature vector to a pre-existing gesture category. Once identified, a set of user commands that are associated with the gesture category are applied to the computer system. The user commands can originate from an automatic process that extracts commands that are associated with the menu items of a particular application program. The present invention also allows user training so that user-defined gestures, and the computer commands associated therewith, can be programmed into the computer system.
26 Claims
-
1. In an electronic system having a processor, a memory unit, an alphanumeric input device and a cursor directing device, a method of providing a user interface comprising the computer implemented steps of:
-
a) accessing gesture data representing a gesture formed by tracking movement of a cursor moved by a user with said cursor directing device, said gesture data comprising coordinate positions and timing information and having one or more individual strokes;
b) generating a multi-dimensional feature vector based on said gesture data;
c) providing said multi-dimensional feature vector to a radial basis function neural network for recognition, said radial basis function neural network associating said multi-dimensional feature vector with a gesture category from a predefined plurality of gesture categories and supplying said gesture category as an output value; and
d) applying a set of predetermined commands to said electronic system, said set of predetermined commands being associated with said gesture category output from said radial basis function neural network.
-
2. A method as described in claim 1 wherein said step a) comprises the steps of:
-
a1) referencing received coordinate positions and timing information with a current stroke while a gesture key is pressed and while a button on said cursor directing device is pressed;
a2) referencing received coordinate positions and timing information with a next stroke after said button is released and while said gesture key is pressed and while said button is pressed again; and
a3) terminating receiving said gesture data upon said gesture key being released.
-
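The pipeline of claim 1 (capture strokes, featurize, recognize, dispatch commands) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: `Point`, the toy `extract_features`, and the nearest-centroid stub standing in for the radial basis function network are all invented names.

```python
from dataclasses import dataclass
from math import dist

@dataclass
class Point:
    x: float
    y: float
    t: float  # timestamp, per the claimed "coordinate positions and timing information"

def extract_features(strokes):
    # Toy featurizer: stroke count plus each stroke's net displacement.
    # (Claims 3-6 spell out the actual multi-dimensional feature vector.)
    vec = [float(len(strokes))]
    for s in strokes:
        vec += [s[-1].x - s[0].x, s[-1].y - s[0].y]
    return vec

class NearestCentroidStub:
    """Stand-in for the claimed RBF network: nearest stored prototype wins."""
    def __init__(self, prototypes):
        self.prototypes = prototypes  # {category name: feature vector}

    def recognize(self, vec):
        return min(self.prototypes, key=lambda c: dist(vec, self.prototypes[c]))

def run_gesture(strokes, classifier, command_table):
    """Claim 1: featurize (step b), recognize (step c), apply bound commands (step d)."""
    category = classifier.recognize(extract_features(strokes))
    return [cmd() for cmd in command_table.get(category, [])]
```

A rightward mouse drag would thus resolve to whichever category's prototype lies closest in feature space, and the commands bound to that category run.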
3. A method as described in claim 1 wherein said step b) comprises the steps of:
-
b1) normalizing said gesture data;
b2) dividing each stroke of said gesture data into a plurality of segments, N;
b3) determining first feature elements for each stroke of said gesture data based on an end point of a respective stroke and a start point of a next stroke;
b4) determining second feature elements for each segment of each stroke of said gesture data based on an orientation of each segment with respect to a reference line, wherein said multi-dimensional feature vector comprises:
a value indicating the number of strokes within the gesture data;
said first feature elements; and
said second feature elements.
-
4. A method as described in claim 3 wherein said value of N is inversely related to the number of strokes of said gesture data.
-
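The inverse relation in claim 4 keeps the feature vector's overall length roughly constant regardless of how many strokes the gesture contains. A one-line sketch, assuming a hypothetical fixed segment budget of 16 (the budget value is not from the patent):

```python
def segments_per_stroke(num_strokes, budget=16):
    # Claim 4: more strokes -> fewer segments per stroke (N), so the
    # total segment count, and hence the feature vector length, stays
    # roughly fixed. The budget of 16 is an illustrative assumption.
    return max(1, budget // num_strokes)
```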
5. A method as described in claim 3 wherein said step b3) is performed for each stroke of said gesture data and comprises the steps of:
-
determining a coordinate position of said end point of said current stroke;
determining a coordinate position of said start point of said next stroke; and
determining said first feature elements based on the difference in x-coordinate positions and the difference in y-coordinate positions of said end point and said start point.
-
6. A method as described in claim 3 wherein said step b4) is performed for each segment of each stroke of said gesture data and comprises the steps of:
-
determining a start point and an end point of a respective stroke segment; and
determining said stroke feature elements for said respective stroke segment according to the sine and cosine of the directed angle between a straight line between said start point and said end point of said respective stroke segment and a horizontal reference.
-
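Claims 3, 5 and 6 together describe how the feature vector is built. A sketch under simplifying assumptions: timestamps are dropped and strokes are plain (x, y) tuples, and the particular normalization (unit box) and segmentation (equal runs of points) rules are illustrative choices, since the claims only require normalizing and dividing into N segments.

```python
from math import atan2, sin, cos

def normalize(strokes):
    # Translate/scale all points into a unit box, preserving aspect ratio
    # (one common normalization; claim 3 step b1 only requires "normalizing").
    xs = [x for s in strokes for x, y in s]
    ys = [y for s in strokes for x, y in s]
    span = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    x0, y0 = min(xs), min(ys)
    return [[((x - x0) / span, (y - y0) / span) for x, y in s] for s in strokes]

def segments(stroke, n):
    # Split a stroke's points into up to n consecutive runs (step b2).
    step = max(1, len(stroke) // n)
    return [stroke[i:i + step + 1] for i in range(0, len(stroke) - 1, step)][:n]

def feature_vector(strokes, n):
    strokes = normalize(strokes)
    vec = [float(len(strokes))]                          # stroke count
    for a, b in zip(strokes, strokes[1:]):               # first feature elements
        vec += [b[0][0] - a[-1][0], b[0][1] - a[-1][1]]  # (claim 5: dx, dy between
                                                         # stroke end and next start)
    for s in strokes:                                    # second feature elements
        for seg in segments(s, n):                       # (claim 6: sine/cosine of each
            a = atan2(seg[-1][1] - seg[0][1], seg[-1][0] - seg[0][0])
            vec += [sin(a), cos(a)]                      # segment chord's directed angle
    return vec                                           # vs. the horizontal reference)
```

Note how the sine/cosine pair encodes a segment's orientation without the discontinuity a raw angle would have at the ±180° wraparound, which is presumably why the claim uses both.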
7. A method as described in claim 3 further comprising the step of e) adding a new gesture category to said radial basis function neural network by performing the steps of:
-
e1) receiving a new gesture category name;
e2) receiving first gesture data;
e3) generating a first multi-dimensional feature vector based on said first gesture data;
e4) creating a bounded area within a predefined space associated with said radial basis function neural network according to said first multi-dimensional feature vector; and
e5) associating said bounded area with said new gesture category within said predefined plurality of gesture categories and associating a function to be performed with said new gesture category.
-
8. A method as described in claim 7 further comprising the step of f) modifying an existing gesture category of said radial basis function neural network by performing the steps of:
-
f1) receiving an existing gesture category name;
f2) receiving second gesture data;
f3) generating a second multi-dimensional feature vector based on said second gesture data; and
f4) modifying a pre-existing bounded area within said predefined space associated with said radial basis function neural network that is associated with said existing gesture category according to said second multi-dimensional feature vector.
-
9. A method as described in claim 8 wherein said second gesture data corresponds to a positive gesture example.
-
10. A method as described in claim 8 wherein said second gesture data corresponds to a negative gesture example.
-
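Claims 7 through 10 describe creating and refining a category's "bounded area" in the network's feature space. A minimal sketch, assuming each bounded area is a hypersphere (center plus radius) that grows toward positive examples and shrinks past negative ones, in the style of RCE/RBF-type classifiers; the initial radius and the shrink factor are invented for illustration and are not the patent's values:

```python
from math import dist

class RbfCategory:
    """One 'bounded area': a hypersphere (center, radius) in feature space.
    The default radius is an illustrative assumption."""
    def __init__(self, name, center, radius=0.5):
        self.name, self.center, self.radius = name, list(center), radius

    def contains(self, vec):
        return dist(vec, self.center) <= self.radius

def add_category(categories, name, vec):
    # Claim 7, steps e4)-e5): a first example creates a new bounded area.
    categories.append(RbfCategory(name, vec))

def train_example(categories, name, vec, positive=True):
    # Claims 8-10: a positive example grows the category's area to cover it;
    # a negative example shrinks the area so the example falls outside.
    for cat in categories:
        if cat.name == name and positive:
            cat.radius = max(cat.radius, dist(vec, cat.center))
        elif cat.name == name and not positive and cat.contains(vec):
            cat.radius = dist(vec, cat.center) * 0.99  # illustrative shrink rule
```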
11. A method as described in claim 1 further comprising the steps of:
-
e) generating a first portion of said predefined plurality of gesture categories by automatically extracting menu items from a selected first application and storing said first portion into said memory unit upon said user invoking said first application; and
f) generating a second portion of said predefined plurality of gesture categories by automatically extracting menu items from a selected second application and storing said second portion into said memory unit upon said user invoking said second application.
-
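Claim 11's idea of populating gesture categories from an application's menus can be sketched as follows. Everything here is hypothetical: the nested-dict menu structure stands in for a real application's menu tree, and the function name is invented.

```python
# Hypothetical sketch: when the user invokes an application, walk its menu
# tree and register each menu item as a gesture category whose command set
# simply invokes that item.

def extract_menu_categories(app_name, menus):
    """menus: {menu name: {item name: callable}} -> {category name: [command]}"""
    categories = {}
    for menu_name, items in menus.items():
        for item_name, action in items.items():
            categories[f"{app_name}/{menu_name}/{item_name}"] = [action]
    return categories
```

Running this once per invoked application yields the per-application "portions" of the gesture category set that the claim describes.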
12. In an electronic system having a processor, a memory unit, an alphanumeric input device and a cursor directing device, a method of determining a feature vector representing a gesture comprising the computer implemented steps of:
-
a) accessing gesture data representing a gesture formed by tracking movement of a cursor moved by a user with said cursor directing device, said gesture data comprising coordinate positions and timing information and having one or more individual strokes;
b) normalizing said gesture data;
c) dividing each stroke of said gesture data into a plurality of segments, N;
d) determining first feature elements for a respective stroke of said gesture data based on an end point of said respective stroke and a start point of a next stroke, wherein step d) is performed for each stroke of said gesture data; and
e) determining second feature elements for each segment of each stroke of said gesture data based on an orientation of each segment with respect to a reference line, wherein said feature vector comprises:
a value indicating the number of strokes of said gesture data;
said first feature elements for each stroke; and
said second feature elements for each segment of each stroke.
-
13. A method as described in claim 12 wherein said step e) comprises the steps of:
-
e1) determining a start point and an end point of a respective stroke segment;
e2) determining said stroke feature elements for said respective stroke segment according to the sine and cosine of the directed angle between a straight line between said start point and said end point of said respective stroke segment and a horizontal reference; and
e3) performing steps e1) and e2) for each segment of each stroke of said gesture data.
-
16. In an electronic system having a processor, a memory unit, an alphanumeric input device and a cursor directing device, a method of training said system to recognize gestures comprising the computer implemented steps of:
-
a) defining a new gesture category name;
b) accessing gesture data representing a gesture formed by tracking movement of a cursor moved by a user with said cursor directing device, said gesture data comprising coordinate positions and timing information and having one or more individual strokes;
c) generating a feature vector based on said gesture data;
d) using said feature vector to generate a bounded area within a predefined space associated with a radial basis function neural network;
e) associating said bounded area with said new gesture category name within a predefined plurality of gesture categories stored in said memory unit; and
f) associating a set of predetermined commands to said new gesture category name, said set of predetermined commands for application to said electronic system upon said new gesture category name being recognized by said radial basis function neural network.
-
17. A method as described in claim 16 wherein said step c) comprises the steps of:
-
c1) normalizing said gesture data;
c2) dividing each stroke of said gesture data into a plurality of segments, N;
c3) determining first feature elements for a respective stroke of said gesture data based on an end point of said respective stroke and a start point of a next stroke, said step c3) performed for each stroke of said gesture data; and
c4) determining second feature elements for each segment of each stroke of said gesture data based on an orientation of each segment with respect to a reference line, wherein said feature vector comprises:
a value indicating the number of strokes of said gesture data;
said first feature elements for each stroke; and
said second feature elements for each segment of each stroke.
-
18. A method as described in claim 17 wherein said step c4) is performed for each segment of each stroke and comprises the steps of:
-
determining a start point and an end point of a respective stroke segment; and
determining said stroke feature elements for said respective stroke segment according to the sine and cosine of the directed angle between a straight line between said start point and said end point of said respective stroke segment and a horizontal reference.
-
19. A method as described in claim 17 wherein said value of N is inversely related to the number of strokes of said gesture data.
-
20. A method as described in claim 17 wherein said step c3) comprises the step of determining said first feature elements based on the difference in x-coordinate positions and the difference in y-coordinate positions of said start point and said end point.
-
21. In an electronic system having a processor, a memory unit, an alphanumeric input device and a cursor directing device, a method of training said system to recognize gestures comprising the computer implemented steps of:
-
a) identifying an existing gesture category name within a predefined set of gesture categories stored within said memory unit;
b) accessing gesture data representing a gesture formed by tracking movement of a cursor moved by a user with said cursor directing device, said gesture data comprising coordinate positions and timing information and having one or more individual strokes;
c) generating a feature vector based on said gesture data; and
d) using said feature vector to modify a pre-existing bounded area within a predefined space associated with a radial basis function neural network, said pre-existing bounded area being associated with said existing gesture category.
-
22. A method as described in claim 21 wherein said step c) comprises the steps of:
-
c1) normalizing said gesture data;
c2) dividing each stroke of said gesture data into a plurality of segments, N;
c3) determining first feature elements for a respective stroke of said gesture data based on an end point of said respective stroke and a start point of a next stroke, said step c3) performed for each stroke of said gesture data; and
c4) determining second feature elements for each segment of each stroke of said gesture data based on an orientation of each segment with respect to a reference line, wherein said feature vector comprises:
a value indicating the number of strokes of said gesture data;
said first feature elements for each stroke; and
said second feature elements for each segment of each stroke.
-
23. A method as described in claim 22 wherein said gesture data represents a positive gesture example which increases the size of said pre-existing bounded area within said predefined space associated with said radial basis function neural network.
-
24. A method as described in claim 22 wherein said gesture data represents a negative gesture example which decreases the size of said pre-existing bounded area within said predefined space associated with said radial basis function neural network.
-
25. An electronic system comprising:
-
a processor coupled to a bus;
an alphanumeric input device and a cursor directing device coupled to said bus; and
a memory unit coupled to said bus, said memory unit containing instructions that when executed implement a method of providing a user interface comprising the steps of:
a) accessing gesture data representing a gesture formed by tracking movement of a cursor moved by a user with said cursor directing device, said gesture data comprising coordinate positions and timing information and having one or more individual strokes;
b) generating a multi-dimensional feature vector based on said gesture data;
c) providing said multi-dimensional feature vector to a radial basis function neural network for recognition, said radial basis function neural network associating said multi-dimensional feature vector with a gesture category from a predefined plurality of gesture categories and supplying said gesture category as an output value; and
d) applying a set of predetermined commands to said electronic system, said set of predetermined commands being associated with said gesture category output from said radial basis function neural network.
-
26. An electronic system as described in claim 25 wherein said step b) comprises the steps of:
-
b1) normalizing said gesture data;
b2) dividing each stroke of said gesture data into a plurality of segments, N;
b3) determining first feature elements for each stroke of said gesture data based on an end point of a respective stroke and a start point of a next stroke;
b4) determining second feature elements for each segment of each stroke of said gesture data based on an orientation of each segment with respect to a reference line, wherein said multi-dimensional feature vector comprises:
a value indicating the number of strokes within the gesture data;
said first feature elements; and
said second feature elements.
-
Specification