Video-teleconferencing system with eye-gaze correction
Abstract
Correcting for eye-gaze in video communication devices is accomplished by blending information captured from a stereoscopic view of the conferee and generating a virtual image of the conferee. A personalized face model of the conferee is captured to track head position of the conferee. First and second video images representative of a first conferee taken from different views are concurrently captured. A head position of the first conferee is tracked from the first and second video images. Matching features and contours from the first and second video images are ascertained. The head position as well as the matching features and contours from the first and second video images are synthesized to generate a virtual image video stream of the first conferee that makes the first conferee appear to be making eye contact with a second conferee who is watching the virtual image video stream.
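The abstract's pipeline (capture two views concurrently, track the head, match features, synthesize a virtual view) can be hinted at with a deliberately naive sketch. The blending below is illustrative only and uses made-up names; the patent's actual synthesis is driven by the tracked head pose and matched features, not a plain average of the two views.

```python
import numpy as np

def synthesize_virtual_view(left_img, right_img, alpha=0.5):
    # Blend two concurrently captured views; alpha=0.5 approximates a
    # virtual camera midway between the two real ones (roughly where the
    # display, and thus the other conferee's gaze, sits). This naive
    # average is only a placeholder for the patented, match-driven
    # synthesis.
    left = left_img.astype(np.float64)
    right = right_img.astype(np.float64)
    virtual = (1.0 - alpha) * left + alpha * right
    return virtual.astype(left_img.dtype)

# toy 2x2 grayscale frames standing in for the two camera views
left = np.array([[10, 20], [30, 40]], dtype=np.uint8)
right = np.array([[30, 40], [50, 60]], dtype=np.uint8)
mid = synthesize_virtual_view(left, right)
```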
Claims
1. A method, comprising:
concurrently, capturing first and second video images representative of a first conferee taken from different views;
tracking a head position of the first conferee from the first and second video images;
ascertaining features and contours from the first video image that match features and contours from the second video image; and
synthesizing the head position as well as the features and contours from the first and second video images that match to generate a virtual image video stream of the first conferee that makes the first conferee appear to be making eye contact with a second conferee who is watching the virtual image video stream.
(Dependent claims 2-8 not shown.)
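The "ascertaining features ... that match" step of claim 1 is, at its core, a stereo correspondence search. A toy sum-of-squared-differences matcher over a single scanline sketches the idea; the function name, window size, and disparity range here are illustrative assumptions, not the patent's algorithm.

```python
import numpy as np

def match_feature(left_row, right_row, col, patch=1, max_disp=3):
    # Find the column in right_row whose neighborhood best matches the
    # patch around `col` in left_row, by sum of squared differences.
    # A minimal stand-in for the claim's feature/contour matching step.
    ref = left_row[col - patch: col + patch + 1]
    best_col, best_cost = col, np.inf
    for d in range(-max_disp, max_disp + 1):
        c = col + d
        if c - patch < 0 or c + patch + 1 > len(right_row):
            continue  # candidate window would fall off the image
        cand = right_row[c - patch: c + patch + 1]
        cost = float(np.sum((ref - cand) ** 2))
        if cost < best_cost:
            best_cost, best_col = cost, c
    return best_col

# a bright "feature" at column 2 in the left view, shifted to column 4
# in the right view
left_row = np.array([0, 0, 9, 0, 0, 0, 0], dtype=float)
right_row = np.array([0, 0, 0, 0, 9, 0, 0], dtype=float)
matched = match_feature(left_row, right_row, 2)
```

The column offset between the two matched positions is the disparity that later drives the view synthesis.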
9. A method, comprising:
storing a personalized face model of a first conferee;
concurrently, capturing first and second video images representative of the first conferee taken from different views;
evaluating the first and second video images with respect to the personalized face model of the first conferee to ascertain three dimensional information; and
synthesizing the three dimensional information to generate a virtual image video stream of the first conferee that makes the first conferee appear to be making eye contact with a second conferee who is watching the virtual image video stream.
(Dependent claims 10-15 not shown.)
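The "three dimensional information" of claim 9 is recoverable because the two views are captured concurrently from known positions. Under the standard rectified-stereo assumption (not stated in the claim itself), depth follows from disparity; this one-liner is a textbook sketch, with illustrative parameter values.

```python
def depth_from_disparity(focal_px, baseline_m, x_left, x_right):
    # Recover the depth Z of a face-model feature from its horizontal
    # positions in the two concurrently captured views. Assumes a
    # rectified stereo pair: focal length in pixels, baseline in metres.
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("feature must lie at a finite, positive depth")
    return focal_px * baseline_m / disparity

# e.g. f = 800 px, baseline = 0.2 m, feature seen at x = 420 (left)
# and x = 400 (right)
z = depth_from_disparity(800.0, 0.2, 420.0, 400.0)
```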
16. A system, comprising:
means for concurrently capturing first and second video images representative of a first conferee taken from different views;
means for tracking a head position of the first conferee from the first and second video images;
means for ascertaining features and contours from the first video image that match features and contours from the second video image; and
means for synthesizing the head position as well as the features and contours from the first and second video images that match to generate a virtual image video stream of the first conferee that makes the first conferee appear to be making eye contact with a second conferee who is watching the virtual image video stream.
(Dependent claims 17-22 not shown.)
23. A video-teleconferencing system, comprising:
a head pose tracking module, configured to receive first and second video images representative of a first conferee concurrently taken from different views and track head position of the first conferee;
a stereo module, configured to receive the first and second video images representative of the first conferee concurrently taken from different views and match non-rigid parts observed from the first and second video images; and
a view synthesis module, configured to synthesize the head position as well as the matching non-rigid parts from the first and second video images to generate a virtual image video stream of the first conferee that makes the first conferee appear to be making eye contact with a second conferee who is watching the virtual image video stream.
(Dependent claims 24-28 not shown.)
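Claim 23 names three cooperating modules. Their data flow can be sketched as a small pipeline in which each module is a pluggable callable; the class and parameter names below are hypothetical, and the lambdas stand in for real tracking, matching, and synthesis implementations.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class EyeGazePipeline:
    # Mirrors the claim's three modules: head pose tracking, stereo
    # matching, and view synthesis. The wiring is illustrative, not
    # the patent's architecture.
    track_head_pose: Callable[[Any, Any], Any]
    match_parts: Callable[[Any, Any], Any]
    synthesize_view: Callable[[Any, Any], Any]

    def process(self, left: Any, right: Any) -> Any:
        pose = self.track_head_pose(left, right)      # head pose tracking module
        matches = self.match_parts(left, right)       # stereo module
        return self.synthesize_view(pose, matches)    # view synthesis module

# toy stand-ins for each module, just to show the data flow
pipeline = EyeGazePipeline(
    track_head_pose=lambda left, right: "pose",
    match_parts=lambda left, right: "matches",
    synthesize_view=lambda pose, matches: (pose, matches),
)
result = pipeline.process(None, None)
```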
29. One or more computer-readable media having stored thereon computer-executable instructions that, when executed by one or more processors, cause the one or more processors of a computer system to:
concurrently, capture first and second video images representative of a first conferee taken from different views;
track a head position of the first conferee from the first and second video images;
ascertain features and contours from the first video image that match features and contours from the second video image; and
synthesize the head position as well as the features and contours from the first and second video images that match to generate a virtual image video stream of the first conferee that makes the first conferee appear to be making eye contact with a second conferee who is watching the virtual image video stream.
(Dependent claims 30-35 not shown.)
36. One or more computer-readable media having stored thereon computer-executable instructions that, when executed by one or more processors, cause the one or more processors of a computer system to:
store a personalized face model of a first conferee;
concurrently, capture first and second video images representative of the first conferee taken from different views;
evaluate the first and second video images with respect to the personalized face model of the first conferee to ascertain three dimensional information; and
synthesize the three dimensional information to generate a virtual image video stream of the first conferee that makes the first conferee appear to be making eye contact with a second conferee who is watching the virtual image video stream.
(Dependent claims 37-41 not shown.)
Specification