Two-dimensional method and system enabling three-dimensional user interaction with a device
First Claim
1. A method to enable at least one user object interaction, user-made in a three-dimensional hover zone, with a device, said interaction creating a detectable event useable by said device, where at least a portion of said user object is representable by at least one landmark, the method including the following steps:
(a) disposing a first camera having a first FOV and disposing a second camera having a second FOV such that intersecting said first FOV and second FOV define said three-dimensional hover zone, said first camera having a first pixel sensor array with x-columns and y-rows and a first resolution, and said second camera having a second pixel sensor array with a second resolution substantially equal to said first resolution;
(b) obtaining a first two-dimensional image from said first camera of at least a portion of said user object within said three-dimensional hover zone, and obtaining a second two-dimensional image from said second camera of at least a portion of said user object in said three-dimensional hover zone, wherein said first image comprises a first set of pixel data, and said second image comprises a second set of pixel data, where a number N represents a maximum total data points acquired by said first and by said second set of pixel data;
(c) analyzing said first and second set of pixel data obtained at step (b) to identify in said pixel data potential two-dimensional locations of landmark data points on said user object, such that data reduction to less than about 1% N occurs;
(d) determining for at least some said two-dimensional locations identified in step (c) three-dimensional locations of potential landmarks on said user object, wherein remaining data is less than about 0.1% N;
(e) using three-dimensional locations determined at step (d) and using dynamic information for data remaining after step (d) to further reduce a number of three-dimensional locations of potential landmarks on said user object to less than about 0.01% N, each said potential landmark location being characterizable by confidence probabilities as being a type of user object;
wherein three-dimensional data following step (e) is outputtable to said device to affect at least one device parameter responsive to detected user interaction.
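Steps (b) and (c) of the claim describe a data-reduction stage: two full pixel arrays (N total data points) are collapsed to a sparse set of candidate landmark locations well under 1% of N. A minimal sketch in Python with numpy follows; the thresholding "detector" and all function names are illustrative stand-ins, not the patent's actual method.

```python
import numpy as np

def detect_landmark_candidates(image, threshold=200):
    """Step (c) stand-in: reduce a full pixel array to a sparse set of 2-D
    candidate landmark locations (here, simply pixels above a brightness
    threshold; a real system would use a feature detector)."""
    ys, xs = np.nonzero(image > threshold)
    return np.column_stack([xs, ys])          # (k, 2) array of (x, y) pixels

def data_reduction_pipeline(img1, img2):
    """Sketch of steps (b)-(c): acquire two images totalling N data points,
    then retain only candidate landmark pixels, a tiny fraction of N."""
    N = img1.size + img2.size                 # step (b): max total data points
    cands1 = detect_landmark_candidates(img1)
    cands2 = detect_landmark_candidates(img2)
    kept = len(cands1) + len(cands2)
    return cands1, cands2, kept / N           # fraction of N retained

# Two synthetic 100x100 "camera" frames, each with one bright landmark pixel
img1 = np.zeros((100, 100), dtype=np.uint8); img1[40, 60] = 255
img2 = np.zeros((100, 100), dtype=np.uint8); img2[40, 55] = 255
c1, c2, frac = data_reduction_pipeline(img1, img2)
print(c1, c2, frac)   # two candidates out of N = 20000, i.e. 0.01% of N
```

The retained fraction here (0.01% of N) illustrates how quickly the claimed pipeline discards pixel data before any three-dimensional work is done.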
Abstract
User interaction with a display is detected substantially simultaneously using at least two cameras whose intersecting FOVs define a three-dimensional hover zone within which user interactions can be imaged. Image data is analyzed, separately and collectively, to identify a relatively small number of user landmarks. A substantially unambiguous correspondence is established between the same landmark on each acquired image, and a three-dimensional reconstruction is made in a common coordinate system. Preferably the cameras are modeled as pinhole cameras, enabling rectified epipolar geometric analysis to facilitate more rapid disambiguation among potential landmark points. Consequently, processing overhead is substantially reduced, as are latency times. Landmark identification and position information is convertible into a command causing the display to respond appropriately to a user gesture. Advantageously, the size of the hover zone can far exceed the size of the display, making the invention usable with smart phones as well as large-size entertainment TVs.
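The abstract's pinhole-camera model with rectified epipolar geometry has a direct computational payoff: once the two images are rectified, a matched landmark lies on the same pixel row in both images, and its depth follows from the horizontal disparity alone. A minimal sketch, with hypothetical camera parameters (focal length, baseline, principal point) chosen for illustration:

```python
import numpy as np

def triangulate_rectified(p_left, p_right, focal_px, baseline, cx, cy):
    """Pinhole-model triangulation for a rectified stereo pair: the same
    landmark appears on the same pixel row in both images, so depth follows
    directly from the horizontal disparity (Z = f * b / d)."""
    xl, yl = p_left
    xr, yr = p_right
    assert abs(yl - yr) < 1.0, "rectified epipolar constraint: equal rows"
    disparity = xl - xr                    # pixels; positive for finite depth
    Z = focal_px * baseline / disparity    # depth along the optical axis
    X = (xl - cx) * Z / focal_px           # back-project through left pinhole
    Y = (yl - cy) * Z / focal_px
    return np.array([X, Y, Z])

# Hypothetical geometry: 500 px focal length, 10 cm baseline, 320x240 sensor
pt = triangulate_rectified((180.0, 120.0), (130.0, 120.0),
                           focal_px=500.0, baseline=0.10, cx=160.0, cy=120.0)
print(pt)   # landmark 1 m in front of the left camera, 4 cm to its right
```

This is why the abstract stresses the pinhole model: the reconstruction reduces to a handful of multiplications per landmark, supporting the low processing overhead and latency claimed.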
Claims (20)
1. (Set forth above as the First Claim.) Dependent claims: 2–8.
9. A system to enable at least one user object interaction, user-made in a three-dimensional hover zone, with a device, said interaction creating a detectable event useable by said device, where at least a portion of said user object is representable by at least one landmark, the system including:
at least a first camera having a first FOV and a second camera having a second FOV, said first and second camera disposed such that an intersection of said first FOV and second FOV defines said three-dimensional hover zone, said first camera having a first pixel sensor array with x-columns and y-rows and a first resolution, and said second camera having a second pixel sensor array with a second resolution substantially equal to said first resolution;

means for obtaining a first two-dimensional image from said first camera of at least a portion of said user object within said three-dimensional hover zone, and obtaining a second two-dimensional image from said second camera of at least a portion of said user object in said three-dimensional hover zone, wherein said first image comprises a first set of pixel data, and said second image comprises a second set of pixel data, where a number N represents a maximum total data points acquired by said first and by said second set of pixel data;

means for analyzing said first and second set of pixel data, obtained by said means for obtaining, to identify in said pixel data potential two-dimensional locations of landmark data points on said user object, such that data reduction to less than about 1% N occurs;

means for determining for at least some said two-dimensional locations, identified by said means for analyzing, three-dimensional locations of potential landmarks on said user object, wherein remaining data is less than about 0.1% N;

means for using three-dimensional locations, determined by said means for determining and using dynamic information for data remaining in said less than about 0.1% step, to further reduce a number of three-dimensional locations of potential landmarks on said user object to less than about 0.01% N, each said potential landmark location being characterizable by confidence probabilities as being a type of user object;

wherein three-dimensional data obtained from said means for using is outputtable to said device to affect at least one device parameter responsive to detected user interaction. Dependent claims: 10–16.
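The "means for determining" element relies on the correspondence step described in the abstract: after rectification, a candidate in one image can only match a candidate on the same pixel row of the other image, which turns disambiguation into a cheap one-dimensional search. A sketch under that assumption (candidate lists and tolerances are illustrative):

```python
def match_by_epipolar_row(cands1, cands2, row_tol=1):
    """After rectification, a true landmark correspondence must lie on the
    same pixel row in both images; candidates violating that constraint are
    discarded, which is what makes disambiguation a 1-D search per row."""
    matches = []
    for x1, y1 in cands1:
        same_row = [(x2, y2) for x2, y2 in cands2 if abs(y1 - y2) <= row_tol]
        if same_row:
            # among same-row candidates, pick the one nearest in disparity
            x2, y2 = min(same_row, key=lambda p: abs(x1 - p[0]))
            matches.append(((x1, y1), (x2, y2)))
    return matches

left  = [(60, 40), (10, 90)]       # two candidate landmarks in image 1
right = [(55, 40), (70, 12)]       # only the first has a same-row partner
print(match_by_epipolar_row(left, right))   # [((60, 40), (55, 40))]
```

Only matched pairs proceed to triangulation, which is consistent with the claim's successive reduction from about 1% N to about 0.1% N.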
17. A hand-holdable electronic device enabling at least one user interaction in a three-dimensional hover zone with an image presented on a display, said interaction creating a detectable event useable by said electronic device, where at least a portion of said user is representable by at least one landmark, the electronic device including:
a housing;

a processor-controller unit including a processor coupled to memory storing at least one routine executable by said processor, said processor-controller unit disposed in said housing;

a display having a display surface, coupled to said processor-controller unit, able to present user viewable images responsive to commands from said processor-controller unit, said display integrally joined to said housing;

at least a first camera having a first FOV and a second camera having a second FOV, said first and second camera disposed such that an intersection of said first FOV and second FOV defines said three-dimensional hover zone, said first camera having a first pixel sensor array with x-columns and y-rows and a first resolution, and said second camera having a second pixel sensor array with a second resolution substantially equal to said first resolution;

said processor-controller unit further including:

means for obtaining a first two-dimensional image from said first camera of at least a portion of said user object within said three-dimensional hover zone, and obtaining a second two-dimensional image from said second camera of at least a portion of said user object in said three-dimensional hover zone, wherein said first image comprises a first set of pixel data, and said second image comprises a second set of pixel data, where a number N represents a maximum total data points acquired by said first and by said second set of pixel data;

means for analyzing said first and second set of pixel data, obtained by said means for obtaining, to identify in said pixel data potential two-dimensional locations of landmark data points on said user object, such that data reduction to less than about 1% N occurs;

means for determining for at least some said two-dimensional locations, identified by said means for analyzing, three-dimensional locations of potential landmarks on said user object, wherein remaining data is less than about 0.1% N;

means for using three-dimensional locations, determined by said means for determining and using dynamic information for data remaining in said less than about 0.1% step, to further reduce a number of three-dimensional locations of potential landmarks on said user object to less than about 0.01% N, each said potential landmark location being characterizable by confidence probabilities as being a type of user object;

wherein three-dimensional data obtained from said means for using is coupled to said electronic device to affect at least one device parameter selected from a group consisting of (i) causing said electronic device to alter at least a portion of an image presented on said display, (ii) causing said electronic device to issue an audible sound, (iii) causing said electronic device to alter a characteristic of said electronic device, and (iv) causing a change in orientation in said first camera and in said second camera; and

wherein said electronic device includes at least one device selected from a group consisting of (i) a smart phone, (ii) a tablet, (iii) a netbook, (iv) a laptop, (v) an e-book reader, (vi) a PC, (vii) a TV, and (viii) a set top box. Dependent claims: 18–20.
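Claim 17 enumerates four device responses, (i) through (iv), triggered by a recognized landmark whose type carries a confidence probability. A hypothetical dispatch sketch follows; the landmark types, commands, and the 0.8 threshold are all illustrative, not drawn from the patent.

```python
def dispatch(landmark_type, confidence, position):
    """Map a classified 3-D landmark to one of the device responses
    enumerated in claim 17, acting only when the landmark's confidence
    probability clears a (hypothetical) threshold."""
    if confidence < 0.8:
        return ("ignore", None)              # too ambiguous to act on
    commands = {
        "fingertip": ("alter_image", position),       # response (i)
        "open_palm": ("issue_sound", "chime"),        # response (ii)
        "fist":      ("alter_device", "mute"),        # response (iii)
        "wave":      ("reorient_cameras", position),  # response (iv)
    }
    return commands.get(landmark_type, ("ignore", None))

print(dispatch("fingertip", 0.93, (0.04, 0.0, 1.0)))  # response (i) fires
print(dispatch("fist", 0.55, None))   # below threshold: no device action
```

Gating on confidence reflects the claim language that each potential landmark is "characterizable by confidence probabilities as being a type of user object."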
Specification