Techniques for using gesture recognition to effectuate character selection
Abstract
This disclosure provides a pose- or gesture-based recognition system that processes images of the human hand, downconverts degrees of freedom of the human hand to lower-dimensional space, and then maps the downconverted space to a character set. In one embodiment, the system is implemented in a smart phone or as a computer-input device that uses a virtual keyboard. As the user moves his or her hand, the smart phone or computer provides simulated vocal feedback, permitting the user to adjust hand position or motion to arrive at any desired character; this is particularly useful for embodiments which use a phonetic character set. Software that performs character selection can be implemented in a manner that is language/region agnostic, with a contextual dictionary being used to interpret a phonetic character set according to a specific language or region.
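The principal-components downconversion described in the abstract can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names (`learn_projection`, `downconvert`) and the toy 22-degree joint-angle data are assumptions introduced here for clarity.

```python
import numpy as np

def learn_projection(poses, m):
    """Learn an n-to-m PCA projection (the adaptively learned "dictionary")
    from sample hand poses.

    poses: (num_samples, n) array of joint-angle vectors.
    Returns the mean pose and the top-m principal directions.
    """
    mean = poses.mean(axis=0)
    centered = poses - mean
    # Eigen-decomposition of the covariance yields the principal components.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]      # largest variance first
    components = eigvecs[:, order[:m]]     # shape (n, m)
    return mean, components

def downconvert(pose, mean, components):
    """Map one n-degree pose vector to a position in m-degree space."""
    return (pose - mean) @ components

# Toy data: 100 samples of a hypothetical 22-degree hand model whose
# variation actually lives in 2 dimensions plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 22))
poses = latent @ mixing + 0.01 * rng.normal(size=(100, 22))

mean, components = learn_projection(poses, m=2)
reduced = downconvert(poses[0], mean, components)
print(reduced.shape)  # (2,)
```

Downstream character selection then operates on the low-dimensional `reduced` position rather than on the full joint-angle vector.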
43 Citations
17 Claims
1. A method, comprising:

receiving with a computing device digital data representing a string of images, the images representing at least part of a user;

processing the data using software to detect therefrom a string of gestures of the user, each gesture represented as a vector;

mapping the vectors to a string of phonetic elements; and

identifying one or more words in a written language, the one or more words corresponding to the string of the phonetic elements;

wherein the written language is a one of a plurality of regional languages, mapping the vectors to the string of phonetic elements comprises selecting each phonetic element in the string of phonetic elements in a manner that is agnostic to any one of the plurality of regional languages, and identifying the one or more words comprises selecting a contextual dictionary specific to a selected one of the plurality of regional languages, the selected one corresponding to the written language, and translating the string of phonetic elements to the selected language; and

wherein further each vector comprises a position in n-degree space, and wherein mapping further comprises translating the position in n-degree space to a position in m-degree space, where n>m, and selecting a phonetic element uniquely associated with the position in m-degree space, translating includes accessing a dictionary to map positions in n-degree space to corresponding positions in m-degree space, and the method further comprises using principal components analysis to adaptively learn the dictionary.

Dependent Claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
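The claim's step of "selecting a phonetic element uniquely associated with the position in m-degree space" can be sketched as a nearest-prototype lookup. The prototype positions and phonetic symbols in `PHONEME_DICT` below are illustrative assumptions, not values taken from the patent.

```python
import math

# Hypothetical dictionary: prototype positions in 2-degree space, each
# uniquely associated with a language-agnostic phonetic element.
PHONEME_DICT = {
    (0.0, 1.0): "a",
    (1.0, 0.0): "k",
    (-1.0, 0.0): "t",
    (0.0, -1.0): "i",
}

def select_phoneme(position):
    """Pick the phonetic element whose prototype is nearest in m-space."""
    return min(PHONEME_DICT.items(),
               key=lambda kv: math.dist(kv[0], position))[1]

# A string of downconverted gesture positions becomes a phonetic string.
gesture_positions = [(0.1, 0.9), (0.9, -0.1), (0.1, -1.1)]
phonetic_string = "".join(select_phoneme(p) for p in gesture_positions)
print(phonetic_string)  # prints "aki"
```

Because the lookup is purely geometric, the same phonetic string is produced regardless of which regional language the user ultimately intends, matching the claim's language-agnostic mapping step.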
13. An apparatus comprising instructions stored on non-transitory machine-readable media, the instructions when executed are to cause at least one processor to:

receive data representing a string of images, the images representing at least part of a user;

process the data to detect therefrom a string of gestures, each gesture represented by a vector;

map the vectors to a string of phonetic elements; and

automatically identify words in a written language, the words corresponding to the string of the phonetic elements;

wherein the written language is a one of a plurality of regional languages, the instructions when executed are to cause the at least one processor to map the vectors to the string of phonetic elements by selecting each element in the string of phonetic elements in a manner that is agnostic to any one of the plurality of regional languages, and the instructions when executed are to cause the at least one processor to automatically identify the words in the written language by selecting a contextual dictionary specific to a selected one of the plurality of regional languages, the selected one corresponding to the written language, and translating the string of phonetic elements to the selected language; and

wherein further each vector comprises a position in n-degree space, the instructions when executed are to cause the at least one processor to map the vectors by translating the position in n-degree space to a position in m-degree space, where n>m, and selecting a phonetic element uniquely associated with the position in m-degree space, the translating is performed by accessing a dictionary to map positions in n-degree space to corresponding positions in m-degree space, and the instructions when executed are further to cause the at least one processor to use principal components analysis to adaptively learn the dictionary.
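The final step of the claims, identifying words via "a contextual dictionary specific to a selected one of the plurality of regional languages", can be sketched as a per-language lookup over the language-agnostic phonetic string. The dictionaries and entries below are invented for illustration; the patent does not specify these mappings.

```python
# Hypothetical contextual dictionaries, one per regional language, keyed by
# the language-agnostic phonetic string produced by the gesture mapping.
CONTEXTUAL_DICTS = {
    "en": {"kat": "cat", "nait": "night"},
    "de": {"kat": "Katze", "naxt": "Nacht"},
}

def identify_words(phonetic_elements, language):
    """Translate a string of phonetic elements into a word of the selected
    regional language using that language's contextual dictionary."""
    dictionary = CONTEXTUAL_DICTS[language]
    return dictionary.get("".join(phonetic_elements))

# The same phonetic input yields different written words per language.
print(identify_words(["k", "a", "t"], "en"))  # prints "cat"
print(identify_words(["k", "a", "t"], "de"))  # prints "Katze"
```

Only this last stage is language-specific, which is what lets the upstream gesture-to-phoneme mapping stay regional-language agnostic.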
14. An apparatus, comprising:
an input device to receive data representing a string of images, the string of images representing at least part of a user; and

at least one processor to process the data to detect a string of gestures of the user from the images, each gesture corresponding to a vector, map the vectors to a string of phonetic elements, and automatically identify words in a written language, the identified words corresponding to the string of the phonetic elements;

wherein the written language is a one of a plurality of regional languages, the at least one processor is to map the vectors to the string of phonetic elements by selecting each element in the string of phonetic elements in a manner that is agnostic to any one of the plurality of regional languages, and the at least one processor is to automatically identify the words in the written language by selecting a contextual dictionary specific to a selected one of the plurality of regional languages, the selected one corresponding to the written language, and translating the string of phonetic elements to the selected language; and

wherein further each vector comprises a position in n-degree space, the at least one processor is to map the vectors by translating the position in n-degree space to a position in m-degree space, where n>m, and selecting a phonetic element uniquely associated with the position in m-degree space, the translating is performed by accessing a dictionary to map positions in n-degree space to corresponding positions in m-degree space, and the at least one processor is further to use principal components analysis to adaptively learn the dictionary.

Dependent Claims: 15, 16, 17
Specification