Video-based image control system

US 7,227,526 B2
Filed: 07/23/2001
Issued: 06/05/2007
Est. Priority Date: 07/24/2000
Status: Expired due to Term

3 Assignments

0 Petitions

Abstract

A method of using stereo vision to interface with a computer is provided. The method includes capturing a stereo image, and processing the stereo image to determine position information of an object in the stereo image. The object is controlled by a user. The method also includes communicating the position information to the computer to allow the user to interact with a computer application.

848 Citations

24 Claims

1. A stereo vision system for interfacing with an application program running on a computer, the stereo vision system comprising:
- first and second video cameras arranged in an adjacent configuration and operable to produce at least first and second stereo video images; and
- a processor operable to receive the first and second stereo video images and detect objects appearing in an intersecting field of view of the cameras, the processor executing a process to:
  - define an object detection region in three-dimensional coordinates relative to a position of the first and second video cameras;
  - divide the first and second stereo video images into features;
  - pair features of the first stereo video image with features of the second stereo video image;
  - generate a depth description map, the depth description map describing the position and disparity of paired features relative to the first and second stereo video images;
  - generate a scene description based upon the depth description map, the scene description defining a three-dimensional position for each feature;
  - cluster adjacent features;
  - crop clustered feature based upon predefined thresholds;
  - analyze the three-dimensional position of each clustered feature within the object detection region to determine position information of a control object; and
  - map the position information of the control object to a position indicator associated with an application program as the control object moves within the object detection region.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 17):
  - 2. The stereo vision system of claim 1 wherein the process selects as the control object a detected object appearing closest to the video cameras and within the object detection region.
  - 3. The stereo vision system of claim 1 wherein the control object is a human hand.
  - 4. The stereo vision system of claim 1 wherein a horizontal position of the control object relative to the video cameras is mapped to an x-axis screen coordinate of the position indicator.
  - 5. The stereo vision system of claim 1 wherein a vertical position of the control object relative to the video cameras is mapped to a y-axis screen coordinate of the position indicator.
  - 6. The stereo vision system of claim 1 wherein the processor is configured to:
    - map a horizontal position of the control object relative to the video cameras to a x-axis screen coordinate of the position indicator;
    - map a vertical position of the control object relative to the video cameras to a y-axis screen coordinate of the position indicator; and
    - emulate a mouse function using the combined x-axis and y-axis screen coordinates provided to the application program.
  - 7. The stereo vision system of claim 6 wherein the processor is further configured to emulate buttons of a mouse using gestures derived from the motion of the object position.
  - 8. The stereo vision system of claim 6 wherein the processor is further configured to emulate buttons of a mouse based upon a sustained position of the control object in any position within the object detection region for a predetermined time period.
  - 9. The stereo vision system of claim 6 wherein the processor is further configured to emulate buttons of a mouse based upon a position of the position indicator being sustained within the bounds of an interactive display region for a predetermined time period.
  - 10. The stereo vision system of claim 1 wherein the processor is further configured to map a z-axis depth position of the control object relative to the video cameras to a virtual z-axis screen coordinate of the position indicator.
  - 11. The stereo vision system of claim 1 wherein the processor is further configured to:
    - map a x-axis position of the control object relative to the video cameras to an x-axis screen coordinate of the position indicator;
    - map a y-axis position of the control object relative to the video cameras to a y-axis screen coordinate of the position indicator; and
    - map a z-axis depth position of the control object relative to the video cameras to a virtual z-axis screen coordinate of the position indicator.
  - 12. The stereo vision system of claim 11 wherein a position of the position indicator being within the bounds of an interactive display region triggers an action within the application program.
  - 13. The stereo vision system of claim 1 wherein movement of the control object along a z-axis depth position that covers a predetermined distance within a predetermined time period triggers a selection action within the application program.
  - 14. The stereo vision system of claim 1 wherein a position of the control object being sustained in any position within the object detection region for a predetermined time period triggers a selection action within the application program.
  - 17. The method of claim 3 further comprising:
    - determining an application state of the computer application; and
    - using the application state in recognizing the gesture.
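
Claims 4 to 6, 10 and 11 describe mapping the control object's camera-relative position to screen coordinates, and claims 8 and 14 describe triggering a selection when the position is sustained for a predetermined time. The following is a minimal Python sketch of those two ideas, not the patented implementation. The names `Region`, `to_screen` and `DwellDetector`, the box-shaped detection region, the clamping behaviour and the `radius` tolerance are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Region:
    """Hypothetical axis-aligned detection region in camera-relative units."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    z_min: float
    z_max: float


def _normalize(value, low, high):
    """Scale value into [0, 1], clamping positions outside the region."""
    return min(1.0, max(0.0, (value - low) / (high - low)))


def to_screen(pos, region, width, height):
    """Map a 3D position (x, y, z) to (screen_x, screen_y, virtual_z)."""
    x, y, z = pos
    sx = round(_normalize(x, region.x_min, region.x_max) * (width - 1))
    # Screen y grows downward, so a higher hand maps to a smaller y (assumed).
    sy = round((1.0 - _normalize(y, region.y_min, region.y_max)) * (height - 1))
    vz = _normalize(z, region.z_min, region.z_max)
    return sx, sy, vz


class DwellDetector:
    """Report a selection once the position stays within `radius` for `hold` seconds."""

    def __init__(self, radius, hold):
        self.radius = radius
        self.hold = hold
        self._anchor = None
        self._since = None

    def update(self, pos, t):
        """Feed one position sample taken at time t; return True on selection."""
        if self._anchor is None or dist(pos, self._anchor) > self.radius:
            # The object moved (or this is the first sample): restart the timer.
            self._anchor, self._since = pos, t
            return False
        if t - self._since >= self.hold:
            # Fire once, then re-arm so the next dwell starts from scratch.
            self._anchor, self._since = None, None
            return True
        return False
```

In use, each tracked position would be passed through `to_screen` to move the position indicator and through `DwellDetector.update` to decide whether to emulate a button press. A z-axis push gesture (claim 13) could be detected in a similar way by comparing depth samples over a time window.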

15. A method of using computer vision to interface with a computer, the method comprising:
- capturing at least first and second images of a scene;
- dividing the first and second images into features;
- pairing features of the first image with features of the second image;
- generating a depth description map, the depth description map describing the position and disparity of paired features relative to the first and second images;
- generating a scene description based upon the depth description map, the scene description defining a three-dimensional position for each feature;
- clustering adjacent features;
- cropping clustered features based upon predefined thresholds;
- defining an object detection region;
- analyzing the three-dimensional position of each clustered feature within the object detection region to determine position information of an object; and
- using the position information to control a computer application.
- Dependent claims (16, 18, 19, 20, 21, 22, 23, 24):
  - 16. The method of claim 15 further comprising:
    - recognizing a gesture associated with the object by analyzing changes in the position information of the object, and
    - controlling the computer application based on the recognized gesture.
  - 18. The method of claim 15 wherein the object is the user.
  - 19. The method of claim 15 wherein the object is a part of the user.
  - 20. The method of claim 18 further comprising providing feedback to the user relative to the computer application.
  - 21. The method of claim 15 further comprising mapping the position information from position coordinates associated with the object to screen coordinates associated with the computer application.
  - 22. The method of claim 15 further comprising:
    - analyzing the scene description to identify a change in position of the object; and
    - mapping the change in position of the object.
  - 23. The method of claim 15 wherein generating the scene description comprises generating the scene description from stereo images.
  - 24. The method of claim 15 wherein:
    - generating a scene description comprises generating a scene description that includes an indication of a three-dimensional position of a feature included in a scene and an indication a shape of the feature; and
    - analyzing the scene description comprises analyzing the scene description including the indication of the three-dimensional position of the feature and the indication of the shape of the feature to determine position information of an object.
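
The steps of claim 15 (disparity of paired features, a three-dimensional position per feature, clustering, cropping, and analysis within a detection region) can be illustrated with a short Python sketch. This is one possible reading and not the patented implementation. It assumes rectified pinhole cameras, so that depth follows Z = f * B / d for focal length f in pixels, baseline B and disparity d. It also interprets "cropping" as discarding points outside the detection region and clusters smaller than `min_size`, and it picks the cluster nearest the cameras, a rule borrowed from claim 2. All function and parameter names are hypothetical.

```python
from math import dist


def triangulate(x, y, disparity, focal_px, baseline, cx, cy):
    """Convert a paired feature at pixel (x, y) with the given disparity to 3D.

    Assumes rectified pinhole stereo with principal point (cx, cy).
    Returns None when the disparity is not positive (no usable depth).
    """
    if disparity <= 0:
        return None
    z = focal_px * baseline / disparity
    return ((x - cx) * z / focal_px, (y - cy) * z / focal_px, z)


def cluster(points, max_gap):
    """Group 3D points so each point lies within max_gap of another in its group."""
    clusters = []
    for p in points:
        near, far = [], []
        for c in clusters:
            (near if any(dist(p, q) <= max_gap for q in c) else far).append(c)
        # Merge the new point with every cluster it touches.
        merged = [p] + [q for c in near for q in c]
        clusters = far + [merged]
    return clusters


def control_position(clusters, in_region, min_size):
    """Return the centroid of the nearest cluster that survives cropping.

    Points outside the detection region are dropped, clusters left with
    fewer than min_size points are discarded, and the remaining cluster
    with the smallest depth (z) is chosen. Returns None if nothing remains.
    """
    best = None
    for c in clusters:
        kept = [p for p in c if in_region(p)]
        if len(kept) < min_size:
            continue
        centroid = tuple(sum(axis) / len(kept) for axis in zip(*kept))
        if best is None or centroid[2] < best[2]:
            best = centroid
    return best
```

A caller would triangulate every paired feature, cluster the resulting points, and pass a predicate such as `lambda p: 0.3 <= p[2] <= 1.5` as `in_region` to obtain the position information used to control the application.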

Current Assignee: Qualcomm, Inc.
Original Assignee: GestureTek Incorporated (Winning Brands Corp.)
Inventors: MacDougall, Francis; Hildreth, Evan
Primary Examiner(s): Shalwala, Bipin
Assistant Examiner(s): Dharia, Prabodh M.

Application Number: US09/909,857
Publication Number: US 20020041327A1
Time in Patent Office: 2,143 Days
Field of Search: 345/156-158, 345/127, 345/757, 345/356, 345/161, 345/475, 345/7, 345/473, 345/418, 345/419-422, 345/2.1, 382/103, 382/154, 382/209, 382/281, 382/218, 382/254-260, 382/262, 701/1, 340/901, 348/143, 348/169, 348/219.1
US Class Current: 345/156
CPC Class Codes

A63F 2300/1093   using visible light

A63F 2300/69   Involving elements of the r...

A63F 2300/8082   Virtual reality

G06F 3/011   Arrangements for interactio...

G06F 3/017   Gesture based interaction, ...

G06T 19/006   Mixed reality object pose d...

G06T 2207/10012   Stereo images

G06T 2207/10021   Stereoscopic video; Stereos...

G06T 2207/20021   Dividing image into blocks,...

G06T 2207/30196   Human being; Person

G06T 7/593   from stereo images

G06T 7/74   involving reference images ...

G06V 40/107   Static hand or arm

H04N 13/239   using two 2D image sensors ...

H04N 13/246   Calibration of cameras

H04N 2013/0081   Depth or disparity estimati...
