Controlling objects via gesturing
Abstract
The present invention is directed toward a system and process that control a group of networked electronic components using a multimodal integration scheme in which inputs from a speech recognition subsystem, a gesture recognition subsystem employing a wireless pointing device, and a pointing analysis subsystem also employing the pointing device are combined to determine which component a user wants to control and what control action is desired. In this multimodal integration scheme, the desired action concerning an electronic component is decomposed into a command and a referent pair. The referent can be identified by using the pointing device to point at the component or at an object associated with it, by using speech recognition, or both. The command may be specified by pressing a button on the pointing device, by a gesture performed with the pointing device, by a speech recognition event, or by any combination of these inputs.
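The abstract's decomposition of a desired action into a command and a referent pair can be sketched as a small fusion routine. The following is a minimal illustrative sketch, not the patent's implementation; the names `InputEvent`, `resolve_action`, `"turn_on"`, and `"lamp"` are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class InputEvent:
    modality: str            # "speech", "gesture", or "pointing"
    command: Optional[str]   # e.g. "turn_on" (illustrative)
    referent: Optional[str]  # e.g. "lamp" (illustrative)

def resolve_action(events: list[InputEvent]) -> Optional[tuple[str, str]]:
    """Combine multimodal inputs into a (command, referent) pair.

    Any modality may supply either half of the pair; the action is
    dispatched only once both halves are known.
    """
    command = referent = None
    for ev in events:
        command = command or ev.command
        referent = referent or ev.referent
    if command and referent:
        return (command, referent)
    return None  # incomplete pair: wait for more input

# Pointing identifies the referent; speech supplies the command.
events = [
    InputEvent("pointing", None, "lamp"),
    InputEvent("speech", "turn_on", None),
]
```

Note that the pair is filled in regardless of the order in which the modalities arrive, which is what lets a button press, a gesture, or a speech event each stand in for the command.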
Claims (24)
1. A system for controlling an object via gesturing comprising:

at least one camera for receiving light from a device;

a host computer coupled to the at least one camera for processing the received light to produce image data to determine a 3-dimensional location of a gesture, wherein receiving light comprises the at least one camera periodically capturing images associated with the light, and wherein the host computer processes the image data against a coordinate system to determine the 3-dimensional location of the gesture; and

a storage device coupled to the host computer for storing a plurality of prototype sequences, wherein the host computer processes the image data to determine that the gesture has been performed by matching the image data against the plurality of prototype sequences, and wherein, if the image data matches one of the prototype sequences, a control action corresponding to the gesture is performed on the object, wherein the storage device further comprises a list that includes one or more modified prototype sequences corresponding to each of the plurality of prototype sequences, wherein the one or more modified prototype sequences are different versions of a corresponding prototype sequence.

Dependent claims: 2-9.
10. A method for controlling an object via gesturing comprising:

receiving light with at least one camera;

capturing images caused by the light with the at least one camera;

producing image data from the light, said image data comprising a user interacting with the object in a scene;

generating a list that includes one or more modified prototype sequences for each of a plurality of prototype sequences, wherein the modified prototype sequences are different versions of their corresponding prototype sequence;

matching the image data against the plurality of prototype sequences and their corresponding modified prototype sequences;

determining whether the image data matches one of the prototype sequences;

determining that a gesture has been performed by the user, if the image data matches the one of the prototype sequences, said gesture comprising a movement performed by the user in a defined direction to interact with the object; and

performing a control action on the object corresponding to the gesture.

Dependent claims: 11-17.
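The matching recited in claims 1 and 10, comparing captured image data against stored prototype sequences and their modified versions, can be illustrated with a simple nearest-template matcher. This is a minimal sketch assuming fixed-length sequences of 3-D points and a Euclidean distance; the claims do not specify the distance measure or the sequence representation, so both are assumptions here:

```python
import math

def seq_distance(a, b):
    """Euclidean distance between two equal-length sequences of 3-D points."""
    return math.sqrt(sum((p - q) ** 2
                         for pa, pb in zip(a, b)
                         for p, q in zip(pa, pb)))

def match_gesture(observed, prototypes, threshold=1.0):
    """Match an observed sequence against prototypes and their variants.

    `prototypes` maps a gesture name to a list of sequences: the canonical
    prototype plus its modified versions (e.g. scaled or shifted copies,
    per the claimed list of modified prototype sequences).
    Returns the best-matching gesture name, or None if nothing is close
    enough, in which case no control action is performed.
    """
    best_name, best_dist = None, threshold
    for name, variants in prototypes.items():
        for proto in variants:
            d = seq_distance(observed, proto)
            if d < best_dist:
                best_name, best_dist = name, d
    return best_name

# Hypothetical prototype library: each gesture has a canonical sequence
# plus one modified version.
protos = {
    "swipe_right": [[(0, 0, 0), (1, 0, 0), (2, 0, 0)],
                    [(0, 0, 0), (0.9, 0, 0), (1.8, 0, 0)]],
    "swipe_up":    [[(0, 0, 0), (0, 1, 0), (0, 2, 0)]],
}
```

Keeping modified versions alongside each canonical prototype is what lets slightly faster, slower, or offset performances of the same gesture still fall within the match threshold.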
18. One or more computer-readable storage media in a computing device having computer-useable instructions embodied thereon for performing a method for controlling an object, the method comprising:

receiving infrared light with at least one infrared camera;

capturing images associated with the infrared light with the at least one infrared camera;

producing infrared image data from the infrared light, said infrared image data comprising a user interacting with the object in a scene;

generating a list that includes one or more modified prototype sequences for each of a plurality of prototype sequences, wherein the modified prototype sequences are different versions of their corresponding prototype sequence;

matching the infrared image data against the plurality of prototype sequences and their corresponding modified prototype sequences;

determining whether the infrared image data matches one of the prototype sequences;

determining that a gesture has been performed by the user, if the infrared image data matches the one of the prototype sequences, said gesture comprising at least one of a movement performed by the user in a defined direction to interact with the object or speech uttered by the user to interact with the object; and

performing a control action on the object corresponding to the gesture.

Dependent claims: 19-24.
Specification