Two-dimensional method and system enabling three-dimensional user interaction with a device
Abstract
User interaction with a display is detected using at least two cameras whose intersecting FOVs define a three-dimensional hover zone within which user interactions can be imaged. Each camera substantially simultaneously acquires, from its own vantage point, two-dimensional images of the user within the hover zone. Separately and collectively, the image data are analyzed to identify relatively few landmarks definable on the user. A substantially unambiguous correspondence is established between the same landmark in each acquired image, and for those landmarks a three-dimensional reconstruction is made in a common coordinate system. This landmark identification and position information can be converted into a command causing the display to respond appropriately to a gesture made by the user. Advantageously, the size of the hover zone can far exceed the size of the display, making the invention usable with smart phones as well as large-size entertainment TVs.
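The reconstruction summarized above — the same landmark found in two synchronized images, then located in three dimensions in a common coordinate system — is classic stereo triangulation. A minimal sketch using linear (DLT) triangulation with two idealized pinhole cameras; the camera matrices and the landmark position are made-up illustrative values, not parameters from the patent:

```python
import numpy as np

def triangulate(P1, P2, uv1, uv2):
    """Linear (DLT) triangulation: recover the 3-D point whose projections
    through 3x4 camera matrices P1 and P2 are image coordinates uv1, uv2."""
    A = np.array([uv1[0] * P1[2] - P1[0],
                  uv1[1] * P1[2] - P1[1],
                  uv2[0] * P2[2] - P2[0],
                  uv2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                 # null vector of A: the homogeneous 3-D point
    return X[:3] / X[3]        # de-homogenize

# Two identical unit-focal-length cameras; the second is shifted 0.2 along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-0.2], [0.0], [0.0]])])

# A hypothetical fingertip landmark at (0.1, 0.05, 0.5) in camera-1
# coordinates projects to these image points in the two views:
landmark = np.array([0.1, 0.05, 0.5])
uv1 = landmark[:2] / landmark[2]
uv2 = (landmark[:2] + np.array([-0.2, 0.0])) / landmark[2]

recovered = triangulate(P1, P2, uv1, uv2)   # ~ (0.1, 0.05, 0.5)
```

With noiseless correspondences, as here, the SVD null vector recovers the point exactly; with real pixel noise the same code returns the least-squares solution, which is why a system of this kind needs only a handful of well-matched landmarks rather than dense depth.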
23 Claims
1. A method to enable at least one user object interaction, in a three-dimensional hover zone, with an image presented on a display functionally coupled to a device, said interaction creating a detectable event useable by said device, where at least a portion of said user object is representable by at least one landmark, the method including the following steps:
(a) disposing a first camera having a first FOV and disposing a second camera having a second FOV such that intersecting said first FOV and second FOV define said three-dimensional hover zone;
(b) obtaining a first two-dimensional image from said first camera of at least a portion of said user object within said three-dimensional hover zone, and obtaining a second two-dimensional image from said second camera of at least a portion of said user object in said three-dimensional hover zone; wherein said first two-dimensional image and said second two-dimensional image are obtained within a timing tolerance that is the longer of (i) said first and said second two-dimensional images are obtained within about ±1.5 ms of each other, and (ii) said first and said second two-dimensional images each have an exposure duration of X ms, and said first and said second images are obtained within a tolerance of about ±10%·X;
(c) analyzing said first two-dimensional image and said second two-dimensional image to identify at least one said landmark and fewer than one hundred potential landmarks definable on said user object;
(d) establishing correspondence between said landmark in said first two-dimensional image and said same landmark in said second two-dimensional image to determine position of said landmark in three dimensions; and
(e) using three-dimensional position information determined for said landmark at step (d) to create at least one instruction usable by said device, in response to a detected said user object interaction;
wherein said user object interaction includes at least one interaction selected from a group consisting of (i) said user object physically touches a surface of said display, and (ii) a gesture made by said user object in a region of said three-dimensional hover zone without physically touching said surface of said display.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
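The two-part timing tolerance in step (b) — the longer of about ±1.5 ms and about ±10% of the exposure duration X — reduces to a one-line comparison. A minimal sketch; the function and parameter names are illustrative, not from the patent:

```python
def within_timing_tolerance(t_first_ms: float, t_second_ms: float,
                            exposure_ms: float) -> bool:
    """True if two frame timestamps satisfy the claimed tolerance:
    the longer of (i) about +/-1.5 ms and (ii) about +/-10% of the
    exposure duration X ms."""
    tolerance_ms = max(1.5, 0.10 * exposure_ms)
    return abs(t_first_ms - t_second_ms) <= tolerance_ms
```

With a 10 ms exposure the fixed ±1.5 ms bound governs (10% of 10 ms is only 1.0 ms); with a 33 ms exposure the ±3.3 ms bound governs, reflecting that longer exposures overlap in time and can tolerate more skew between the two shutters.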
10. A system to enable at least one user object interaction, in a three-dimensional hover zone, with an image presented on a display functionally coupled to a device, said interaction creating a detectable event useable by said device, where at least a portion of said user object is representable by at least one landmark, the system including:
at least a first camera having a first FOV and a second camera having a second FOV, said first camera and said second camera disposed such that intersecting said first FOV and said second FOV define said three-dimensional hover zone;
means for synchronizing, operatively coupled to at least said first camera and to said second camera, to obtain a first two-dimensional image from said first camera of at least a portion of said user object in said three-dimensional hover zone, and to obtain a second two-dimensional image from said second camera of at least a portion of said user object in said three-dimensional hover zone;
wherein at least one of said first camera and said second camera has at least one characteristic selected from a group consisting of (i) a two-dimensional array of pixel sensors that senses color spectra, (ii) a two-dimensional array of pixel sensors that senses monochrome spectra, (iii) a two-dimensional array of pixel sensors that senses IR spectra, (iv) said first camera has a two-dimensional array of pixel sensors having equal pixel (x,y) resolution to said two-dimensional array of pixel sensors in said second camera, (v) a camera exposure duration that starts and stops within a timing tolerance of about ±1.5 ms, (vi) a camera exposure duration of X ms that starts and stops within a timing tolerance of about ±10%·X, (vii) said display includes a bezel and said first camera and said second camera are mounted behind said bezel, (viii) said first camera and said second camera are disposed such that said three-dimensional hover zone is adjacent a surface of said display, (ix) said first camera and said second camera are disposed such that said three-dimensional hover zone is adjacent said surface of said display and includes at least a region of said surface of said display, (x) said first camera and said second camera are selected and disposed such that a cross-sectional dimension of said three-dimensional hover zone taken parallel to a surface of said display is larger than a diagonal dimension of said display, and (xi) at least said first camera has been previously calibrated and calibration information for said first camera is known a priori;
means for analyzing said first two-dimensional image and said second two-dimensional image to identify at least one said landmark and fewer than about one hundred potential landmarks definable on said user object, said means for analyzing coupled to said first camera and to said second camera;
means for establishing correspondence between said landmark in said first two-dimensional image and said same landmark in said second two-dimensional image to determine position of said landmark in three dimensions, said means for establishing coupled to said means for analyzing; and
means for creating at least one instruction usable by said device in response to a detected said user object interaction using three-dimensional position information determined for said landmark, said means for creating at least one instruction coupled to said means for establishing correspondence;
wherein said user object interaction includes at least one interaction selected from a group consisting of (i) said user object physically touches said surface of said display, and (ii) a gesture made by said user object in a region of said three-dimensional hover zone without physically touching said surface of said display.
Dependent claims: 11, 12, 13, 14, 15, 16.
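The "means for analyzing" recited above deliberately reduces each full image to a small set of landmarks (fewer than about one hundred) rather than a dense depth map. A toy illustration of that reduction, assuming a pre-segmented binary silhouette of the user object as input; the silhouette and the two landmark choices are hypothetical, not the patent's actual analysis method:

```python
def extract_landmarks(mask):
    """Reduce a binary silhouette (list of rows of 0/1, row 0 at top) to a
    short landmark list: the silhouette centroid and the topmost foreground
    pixel as a crude fingertip candidate. The result is a handful of points,
    far below the ~100-landmark bound recited in the claim."""
    pixels = [(x, y) for y, row in enumerate(mask)
                     for x, v in enumerate(row) if v]
    if not pixels:
        return {}
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    fingertip = min(pixels, key=lambda p: p[1])   # smallest row = topmost
    return {"centroid": (cx, cy), "fingertip": fingertip}

silhouette = [
    [0, 0, 1, 0, 0],   # lone top pixel: fingertip candidate at (2, 0)
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
]
landmarks = extract_landmarks(silhouette)
```

The same few named landmarks are extracted independently from each camera's image; the correspondence step then matches them by name across the two views before triangulation.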
17. A hand-holdable electronic device enabling at least one user interaction in a three-dimensional hover zone with an image presented on a display, said interaction creating a detectable event useable by said electronic device, where at least a portion of said user is representable by at least one landmark, the electronic device including:
a housing;
a processor-controller unit including a processor coupled to memory storing at least one routine executable by said processor, said processor-controller unit disposed in said housing;
a display having a surface, coupled to said processor-controller unit, able to present user-viewable images responsive to commands from said processor-controller unit, said display integrally joined to said housing;
at least a first camera having a first FOV and a second camera having a second FOV, said first camera and said second camera disposed relative to said housing such that intersecting said first FOV and second FOV define a three-dimensional hover zone adjacent said surface of said display, said first camera and said second camera integrally attached to said housing such that said three-dimensional hover zone projects outwardly relative to said surface of said display, wherein a transverse dimension of a cross-section of said three-dimensional hover zone in a plane parallel to said surface of said display is at least equal in size to a diagonal dimension of said display;
wherein said first camera and said second camera each include a two-dimensional array of pixel sensors sensing at least one of (i) color spectra, (ii) monochrome spectra, and (iii) IR spectra;
said processor-controller unit further including:
means for synchronizing, operatively coupled to at least said first camera and to said second camera, to obtain a first two-dimensional image from said first camera of at least a portion of said user in said three-dimensional hover zone, and to obtain a second two-dimensional image from said second camera of at least a portion of said user in said three-dimensional hover zone;
means for analyzing said first two-dimensional image and said second two-dimensional image to identify at least one said landmark and fewer than about one hundred potential landmarks definable on said user, said means for analyzing coupled to said first camera and to said second camera;
wherein an identified said landmark includes at least one landmark selected from a group consisting of (i) approximate centroid of a user's body, (ii) approximate centroid of a user's head, (iii) approximate centroid of a user's hand, (iv) approximate location of a user's fingertip, (v) approximate location of a user's shoulder joint, (vi) approximate location of a user's knee joint, and (vii) approximate location of a user's foot;
means for establishing correspondence between said landmark in said first two-dimensional image and said same landmark in said second two-dimensional image to determine position of said landmark in three dimensions, said means for establishing coupled to said means for analyzing; and
means for creating at least one instruction usable by said electronic device in response to a detected said user interaction using three-dimensional position information determined for said landmark, said means for creating at least one instruction coupled to said means for establishing correspondence;
wherein said instruction causes at least one action selected from a group consisting of (i) said instruction causes said electronic device to alter at least a portion of an image presented on said display, (ii) said instruction causes said electronic device to issue an audible sound, and (iii) said instruction causes said electronic device to alter a characteristic of said electronic device; and
wherein said user interaction includes at least one interaction selected from a group consisting of (i) said user physically touches said surface of said display, and (ii) a gesture made by said user in a region of said three-dimensional hover zone without physically touching said surface of said display.
Dependent claims: 18, 19, 20, 21, 22, 23.
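The two claimed interaction types — a physical touch of the display surface and a touch-free hover gesture — can be distinguished from a triangulated landmark's distance to the display. A hedged sketch; the coordinate frame (z measured outward from the display surface, in mm), the threshold value, and all names are assumptions for illustration, not values from the patent:

```python
def classify_interaction(landmark_xyz, touch_threshold_mm=2.0):
    """Map a triangulated landmark position to one of the two claimed
    interaction types: a physical touch of the display surface (z near 0)
    or a hover gesture within the three-dimensional hover zone."""
    x, y, z = landmark_xyz
    if z <= touch_threshold_mm:
        return ("touch", (x, y))          # maps onto a display coordinate
    return ("hover-gesture", (x, y, z))   # handed to a gesture recognizer
```

A touch event would typically be delivered to the device like a conventional touchscreen tap at (x, y), while a hover-gesture event carries the full 3-D trajectory so the device can, per the claim, alter the displayed image, emit a sound, or change a device characteristic.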
Specification