Techniques for using gesture recognition to effectuate character selection
Abstract
This disclosure provides a pose- or gesture-based recognition system that processes images of the human hand, downconverts degrees of freedom of the human hand to lower-dimensional space, and then maps the downconverted space to a character set. In one embodiment, the system is implemented in a smart phone or as a computer-input device that uses a virtual keyboard. As the user moves his or her hand, the smart phone or computer provides simulated vocal feedback, permitting the user to adjust hand position or motion to arrive at any desired character; this is particularly useful for embodiments which use a phonetic character set. Software that performs character selection can be implemented in a manner that is language/region agnostic, with a contextual dictionary being used to interpret a phonetic character set according to a specific language or region.
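The principal-components downconversion described in the abstract can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names (`learn_projection`, `downconvert`) and the toy 22-degree joint-angle data are assumptions introduced here for clarity.

```python
import numpy as np

def learn_projection(poses, m):
    """Learn an n-to-m PCA projection (the adaptively learned "dictionary")
    from sample hand poses.

    poses: (num_samples, n) array of joint-angle vectors.
    Returns the mean pose and the top-m principal directions.
    """
    mean = poses.mean(axis=0)
    centered = poses - mean
    # Eigen-decomposition of the covariance yields the principal components.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]      # largest variance first
    components = eigvecs[:, order[:m]]     # shape (n, m)
    return mean, components

def downconvert(pose, mean, components):
    """Map one n-degree pose vector to a position in m-degree space."""
    return (pose - mean) @ components

# Toy data: 100 samples of a hypothetical 22-degree hand model whose
# variation actually lives in 2 dimensions plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 22))
poses = latent @ mixing + 0.01 * rng.normal(size=(100, 22))

mean, components = learn_projection(poses, m=2)
reduced = downconvert(poses[0], mean, components)
print(reduced.shape)  # (2,)
```

Downstream character selection then operates on the low-dimensional `reduced` position rather than on the full joint-angle vector.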
43 Citations
17 Claims
1. A method, comprising:

receiving with a computing device digital data representing a string of images, the images representing at least part of a user;

processing the data using software to detect therefrom a string of gestures of the user, each gesture represented as a vector;

mapping the vectors to a string of phonetic elements; and

identifying one or more words in a written language, the one or more words corresponding to the string of the phonetic elements;

wherein the written language is a one of a plurality of regional languages, mapping the vectors to the string of phonetic elements comprises selecting each phonetic element in the string of phonetic elements in a manner that is agnostic to any one of the plurality of regional languages, and identifying the one or more words comprises selecting a contextual dictionary specific to a selected one of the plurality of regional languages, the selected one corresponding to the written language, and translating the string of phonetic elements to the selected language; and

wherein further each vector comprises a position in n-degree space, and wherein mapping further comprises translating the position in n-degree space to a position in m-degree space, where n>m, and selecting a phonetic element uniquely associated with the position in m-degree space, translating includes accessing a dictionary to map positions in n-degree space to corresponding positions in m-degree space, and the method further comprises using principal components analysis to adaptively learn the dictionary.

Dependent Claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
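The claim's step of "selecting a phonetic element uniquely associated with the position in m-degree space" can be sketched as a nearest-prototype lookup. The prototype positions and phonetic symbols in `PHONEME_DICT` below are illustrative assumptions, not values taken from the patent.

```python
import math

# Hypothetical dictionary: prototype positions in 2-degree space, each
# uniquely associated with a language-agnostic phonetic element.
PHONEME_DICT = {
    (0.0, 1.0): "a",
    (1.0, 0.0): "k",
    (-1.0, 0.0): "t",
    (0.0, -1.0): "i",
}

def select_phoneme(position):
    """Pick the phonetic element whose prototype is nearest in m-space."""
    return min(PHONEME_DICT.items(),
               key=lambda kv: math.dist(kv[0], position))[1]

# A string of downconverted gesture positions becomes a phonetic string.
gesture_positions = [(0.1, 0.9), (0.9, -0.1), (0.1, -1.1)]
phonetic_string = "".join(select_phoneme(p) for p in gesture_positions)
print(phonetic_string)  # prints "aki"
```

Because the lookup is purely geometric, the same phonetic string is produced regardless of which regional language the user ultimately intends, matching the claim's language-agnostic mapping step.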
13. An apparatus comprising instructions stored on non-transitory machine-readable media, the instructions when executed are to cause at least one processor to:

receive data representing a string of images, the images representing at least part of a user;

process the data to detect therefrom a string of gestures, each gesture represented by a vector;

map the vectors to a string of phonetic elements; and

automatically identify words in a written language, the words corresponding to the string of the phonetic elements;

wherein the written language is a one of a plurality of regional languages, the instructions when executed are to cause the at least one processor to map the vectors to the string of phonetic elements by selecting each element in the string of phonetic elements in a manner that is agnostic to any one of the plurality of regional languages, and the instructions when executed are to cause the at least one processor to automatically identify the words in the written language by selecting a contextual dictionary specific to a selected one of the plurality of regional languages, the selected one corresponding to the written language, and translating the string of phonetic elements to the selected language; and

wherein further each vector comprises a position in n-degree space, the instructions when executed are to cause the at least one processor to map the vectors by translating the position in n-degree space to a position in m-degree space, where n>m, and selecting a phonetic element uniquely associated with the position in m-degree space, the translating is performed by accessing a dictionary to map positions in n-degree space to corresponding positions in m-degree space, and the instructions when executed are further to cause the at least one processor to use principal components analysis to adaptively learn the dictionary.
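The final step of the claims, identifying words via "a contextual dictionary specific to a selected one of the plurality of regional languages", can be sketched as a per-language lookup over the language-agnostic phonetic string. The dictionaries and entries below are invented for illustration; the patent does not specify these mappings.

```python
# Hypothetical contextual dictionaries, one per regional language, keyed by
# the language-agnostic phonetic string produced by the gesture mapping.
CONTEXTUAL_DICTS = {
    "en": {"kat": "cat", "nait": "night"},
    "de": {"kat": "Katze", "naxt": "Nacht"},
}

def identify_words(phonetic_elements, language):
    """Translate a string of phonetic elements into a word of the selected
    regional language using that language's contextual dictionary."""
    dictionary = CONTEXTUAL_DICTS[language]
    return dictionary.get("".join(phonetic_elements))

# The same phonetic input yields different written words per language.
print(identify_words(["k", "a", "t"], "en"))  # prints "cat"
print(identify_words(["k", "a", "t"], "de"))  # prints "Katze"
```

Only this last stage is language-specific, which is what lets the upstream gesture-to-phoneme mapping stay regional-language agnostic.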
14. An apparatus, comprising:
an input device to receive data representing a string of images, the string of images representing at least part of a user; and

at least one processor to process the data to detect a string of gestures of the user from the images, each gesture corresponding to a vector, map the vectors to a string of phonetic elements, and automatically identify words in a written language, the identified words corresponding to the string of the phonetic elements;

wherein the written language is a one of a plurality of regional languages, the at least one processor is to map the vectors to the string of phonetic elements by selecting each element in the string of phonetic elements in a manner that is agnostic to any one of the plurality of regional languages, and the at least one processor is to automatically identify the words in the written language by selecting a contextual dictionary specific to a selected one of the plurality of regional languages, the selected one corresponding to the written language, and translating the string of phonetic elements to the selected language; and

wherein further each vector comprises a position in n-degree space, the at least one processor is to map the vectors by translating the position in n-degree space to a position in m-degree space, where n>m, and selecting a phonetic element uniquely associated with the position in m-degree space, the translating is performed by accessing a dictionary to map positions in n-degree space to corresponding positions in m-degree space, and the at least one processor is further to use principal components analysis to adaptively learn the dictionary.

Dependent Claims: 15, 16, 17
Specification