Two-dimensional method and system enabling three-dimensional user interaction with a device
First Claim
1. A method to enable at least one user object interaction, user-made in a three-dimensional hover zone, with a device, said interaction creating a detectable event useable by said device, where at least a portion of said user object is representable by at least one landmark, the method including the following steps:
(a) disposing a first camera having a first FOV and disposing a second camera having a second FOV such that intersecting said first FOV and second FOV define said three-dimensional hover zone, said first camera having a first pixel sensor array with x-columns and y-rows and a first resolution, and said second camera having a second pixel sensor array with a second resolution substantially equal to said first resolution;
(b) obtaining a first two-dimensional image from said first camera of at least a portion of said user object within said three-dimensional hover zone, and obtaining a second two-dimensional image from said second camera of at least a portion of said user object in said three-dimensional hover zone, wherein said first image comprises a first set of pixel data, and said second image comprises a second set of pixel data, where a number N represents a maximum total data points acquired by said first and by said second set of pixel data;
(c) analyzing said first and second set of pixel data obtained at step (b) to identify in said pixel data potential two-dimensional locations of landmark data points on said user object, such that data reduction to less than about 1% N occurs;
(d) determining for at least some said two-dimensional locations identified in step (c) three-dimensional locations of potential landmarks on said user object, wherein remaining data is less than about 0.1% N;
(e) using three-dimensional locations determined at step (d) and using dynamic information for data remaining after step (d) to further reduce a number of three-dimensional locations of potential landmarks on said user object to less than about 0.01% N, each said potential landmark location being characterizable by confidence probabilities as being a type of user object;
wherein three-dimensional data following step (e) is outputtable to said device to affect at least one device parameter responsive to detected user interaction.
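Steps (b) and (c) of the claim describe a data-reduction stage: two full pixel arrays (N total data points) are collapsed to a sparse set of candidate landmark locations well under 1% of N. A minimal sketch in Python with numpy follows; the thresholding "detector" and all function names are illustrative stand-ins, not the patent's actual method.

```python
import numpy as np

def detect_landmark_candidates(image, threshold=200):
    """Step (c) stand-in: reduce a full pixel array to a sparse set of 2-D
    candidate landmark locations (here, simply pixels above a brightness
    threshold; a real system would use a feature detector)."""
    ys, xs = np.nonzero(image > threshold)
    return np.column_stack([xs, ys])          # (k, 2) array of (x, y) pixels

def data_reduction_pipeline(img1, img2):
    """Sketch of steps (b)-(c): acquire two images totalling N data points,
    then retain only candidate landmark pixels, a tiny fraction of N."""
    N = img1.size + img2.size                 # step (b): max total data points
    cands1 = detect_landmark_candidates(img1)
    cands2 = detect_landmark_candidates(img2)
    kept = len(cands1) + len(cands2)
    return cands1, cands2, kept / N           # fraction of N retained

# Two synthetic 100x100 "camera" frames, each with one bright landmark pixel
img1 = np.zeros((100, 100), dtype=np.uint8); img1[40, 60] = 255
img2 = np.zeros((100, 100), dtype=np.uint8); img2[40, 55] = 255
c1, c2, frac = data_reduction_pipeline(img1, img2)
print(c1, c2, frac)   # two candidates out of N = 20000, i.e. 0.01% of N
```

The retained fraction here (0.01% of N) illustrates how quickly the claimed pipeline discards pixel data before any three-dimensional work is done.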
Abstract
User interaction with a display is detected substantially simultaneously using at least two cameras whose intersecting FOVs define a three-dimensional hover zone within which user interactions can be imaged. Image data is analyzed, separately and collectively, to identify a relatively small number of user landmarks. A substantially unambiguous correspondence is established between the same landmark on each acquired image, and a three-dimensional reconstruction is made in a common coordinate system. Preferably the cameras are modeled as pinhole cameras, enabling rectified epipolar geometric analysis to facilitate more rapid disambiguation among potential landmark points. Consequently, processing overhead is substantially reduced, as are latency times. Landmark identification and position information is convertible into a command causing the display to respond appropriately to a user gesture. Advantageously, the size of the hover zone can far exceed the size of the display, making the invention usable with smart phones as well as large-size entertainment TVs.
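The abstract's pinhole-camera model with rectified epipolar geometry has a direct computational payoff: once the two images are rectified, a matched landmark lies on the same pixel row in both images, and its depth follows from the horizontal disparity alone. A minimal sketch, with hypothetical camera parameters (focal length, baseline, principal point) chosen for illustration:

```python
import numpy as np

def triangulate_rectified(p_left, p_right, focal_px, baseline, cx, cy):
    """Pinhole-model triangulation for a rectified stereo pair: the same
    landmark appears on the same pixel row in both images, so depth follows
    directly from the horizontal disparity (Z = f * b / d)."""
    xl, yl = p_left
    xr, yr = p_right
    assert abs(yl - yr) < 1.0, "rectified epipolar constraint: equal rows"
    disparity = xl - xr                    # pixels; positive for finite depth
    Z = focal_px * baseline / disparity    # depth along the optical axis
    X = (xl - cx) * Z / focal_px           # back-project through left pinhole
    Y = (yl - cy) * Z / focal_px
    return np.array([X, Y, Z])

# Hypothetical geometry: 500 px focal length, 10 cm baseline, 320x240 sensor
pt = triangulate_rectified((180.0, 120.0), (130.0, 120.0),
                           focal_px=500.0, baseline=0.10, cx=160.0, cy=120.0)
print(pt)   # landmark 1 m in front of the left camera, 4 cm to its right
```

This is why the abstract stresses the pinhole model: the reconstruction reduces to a handful of multiplications per landmark, supporting the low processing overhead and latency claimed.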
Claims (20)
1. (Set forth above as the First Claim.) Dependent claims: 2–8.
9. A system to enable at least one user object interaction, user-made in a three-dimensional hover zone, with a device, said interaction creating a detectable event useable by said device, where at least a portion of said user object is representable by at least one landmark, the system including:
at least a first camera having a first FOV and a second camera having a second FOV, said first and second camera disposed such that an intersection of said first FOV and second FOV defines said three-dimensional hover zone, said first camera having a first pixel sensor array with x-columns and y-rows and a first resolution, and said second camera having a second pixel sensor array with a second resolution substantially equal to said first resolution;

means for obtaining a first two-dimensional image from said first camera of at least a portion of said user object within said three-dimensional hover zone, and obtaining a second two-dimensional image from said second camera of at least a portion of said user object in said three-dimensional hover zone, wherein said first image comprises a first set of pixel data, and said second image comprises a second set of pixel data, where a number N represents a maximum total data points acquired by said first and by said second set of pixel data;

means for analyzing said first and second set of pixel data, obtained by said means for obtaining, to identify in said pixel data potential two-dimensional locations of landmark data points on said user object, such that data reduction to less than about 1% N occurs;

means for determining for at least some said two-dimensional locations, identified by said means for analyzing, three-dimensional locations of potential landmarks on said user object, wherein remaining data is less than about 0.1% N;

means for using three-dimensional locations, determined by said means for determining and using dynamic information for data remaining in said less than about 0.1% step, to further reduce a number of three-dimensional locations of potential landmarks on said user object to less than about 0.01% N, each said potential landmark location being characterizable by confidence probabilities as being a type of user object;

wherein three-dimensional data obtained from said means for using is outputtable to said device to affect at least one device parameter responsive to detected user interaction. Dependent claims: 10–16.
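The "means for determining" element relies on the correspondence step described in the abstract: after rectification, a candidate in one image can only match a candidate on the same pixel row of the other image, which turns disambiguation into a cheap one-dimensional search. A sketch under that assumption (candidate lists and tolerances are illustrative):

```python
def match_by_epipolar_row(cands1, cands2, row_tol=1):
    """After rectification, a true landmark correspondence must lie on the
    same pixel row in both images; candidates violating that constraint are
    discarded, which is what makes disambiguation a 1-D search per row."""
    matches = []
    for x1, y1 in cands1:
        same_row = [(x2, y2) for x2, y2 in cands2 if abs(y1 - y2) <= row_tol]
        if same_row:
            # among same-row candidates, pick the one nearest in disparity
            x2, y2 = min(same_row, key=lambda p: abs(x1 - p[0]))
            matches.append(((x1, y1), (x2, y2)))
    return matches

left  = [(60, 40), (10, 90)]       # two candidate landmarks in image 1
right = [(55, 40), (70, 12)]       # only the first has a same-row partner
print(match_by_epipolar_row(left, right))   # [((60, 40), (55, 40))]
```

Only matched pairs proceed to triangulation, which is consistent with the claim's successive reduction from about 1% N to about 0.1% N.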
17. A hand-holdable electronic device enabling at least one user interaction in a three-dimensional hover zone with an image presented on a display, said interaction creating a detectable event useable by said electronic device, where at least a portion of said user is representable by at least one landmark, the electronic device including:
a housing;

a processor-controller unit including a processor coupled to memory storing at least one routine executable by said processor, said processor-controller unit disposed in said housing;

a display having a display surface, coupled to said processor-controller unit, able to present user viewable images responsive to commands from said processor-controller unit, said display integrally joined to said housing;

at least a first camera having a first FOV and a second camera having a second FOV, said first and second camera disposed such that an intersection of said first FOV and second FOV defines said three-dimensional hover zone, said first camera having a first pixel sensor array with x-columns and y-rows and a first resolution, and said second camera having a second pixel sensor array with a second resolution substantially equal to said first resolution;

said processor-controller unit further including:

means for obtaining a first two-dimensional image from said first camera of at least a portion of said user object within said three-dimensional hover zone, and obtaining a second two-dimensional image from said second camera of at least a portion of said user object in said three-dimensional hover zone, wherein said first image comprises a first set of pixel data, and said second image comprises a second set of pixel data, where a number N represents a maximum total data points acquired by said first and by said second set of pixel data;

means for analyzing said first and second set of pixel data, obtained by said means for obtaining, to identify in said pixel data potential two-dimensional locations of landmark data points on said user object, such that data reduction to less than about 1% N occurs;

means for determining for at least some said two-dimensional locations, identified by said means for analyzing, three-dimensional locations of potential landmarks on said user object, wherein remaining data is less than about 0.1% N;

means for using three-dimensional locations, determined by said means for determining and using dynamic information for data remaining in said less than about 0.1% step, to further reduce a number of three-dimensional locations of potential landmarks on said user object to less than about 0.01% N, each said potential landmark location being characterizable by confidence probabilities as being a type of user object;

wherein three-dimensional data obtained from said means for using is coupled to said electronic device to affect at least one device parameter selected from a group consisting of (i) causing said electronic device to alter at least a portion of an image presented on said display, (ii) causing said electronic device to issue an audible sound, (iii) causing said electronic device to alter a characteristic of said electronic device, and (iv) causing a change in orientation in said first camera and in said second camera; and

wherein said electronic device includes at least one device selected from a group consisting of (i) a smart phone, (ii) a tablet, (iii) a netbook, (iv) a laptop, (v) an e-book reader, (vi) a PC, (vii) a TV, and (viii) a set top box. Dependent claims: 18–20.
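Claim 17 enumerates four device responses, (i) through (iv), triggered by a recognized landmark whose type carries a confidence probability. A hypothetical dispatch sketch follows; the landmark types, commands, and the 0.8 threshold are all illustrative, not drawn from the patent.

```python
def dispatch(landmark_type, confidence, position):
    """Map a classified 3-D landmark to one of the device responses
    enumerated in claim 17, acting only when the landmark's confidence
    probability clears a (hypothetical) threshold."""
    if confidence < 0.8:
        return ("ignore", None)              # too ambiguous to act on
    commands = {
        "fingertip": ("alter_image", position),       # response (i)
        "open_palm": ("issue_sound", "chime"),        # response (ii)
        "fist":      ("alter_device", "mute"),        # response (iii)
        "wave":      ("reorient_cameras", position),  # response (iv)
    }
    return commands.get(landmark_type, ("ignore", None))

print(dispatch("fingertip", 0.93, (0.04, 0.0, 1.0)))  # response (i) fires
print(dispatch("fist", 0.55, None))   # below threshold: no device action
```

Gating on confidence reflects the claim language that each potential landmark is "characterizable by confidence probabilities as being a type of user object."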
Specification