Systems and methods for providing personal video services

US 8,243,118 B2
Filed: 01/04/2008
Issued: 08/14/2012
Est. Priority Date: 01/23/2007
Status: Active Grant

Abstract

Systems and methods for processing video are provided. Video compression schemes are provided to reduce the number of bits required to store and transmit digital media in video conferencing or videoblogging applications. A photorealistic avatar representation of a video conference participant is created. The avatar representation can be based on portions of a video stream that depict the conference participant. A face detector is used to identify, track and classify the face. Object models including density, structure, deformation, appearance and illumination models are created based on the detected face. An object based video compression algorithm, which uses machine learning face detection techniques, creates the photorealistic avatar representation from parameters derived from the density, structure, deformation, appearance and illumination models.

31 Claims

1. A method of video conferencing, the method comprising the steps of:
- detecting a human face of a video conference participant depicted in portions of a video stream;
  
  creating by explicitly modeling one or more explicit object models to model the face of the video conference participant;
  
  generating one or more implicit object models relative to parameters obtained from the explicit object models to facilitate creation of a compact encoding of the video conference participant's face;
  
  using the implicit object model, creating a photorealistic avatar representation of the video conference participant;
  
  periodically checking to determine whether the implicit object model is working optimally; and
  
  responding to a determination that the implicit object model is not working by processing the step of detecting a human face of a video conference participant and in response to detecting a human face, searching for existing calibration information for the detected human face.
- Dependent claims (2, 3, 4, 5, 6, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18)
  - 2. A method for providing video conferencing as in claim 1 wherein the face of the video conference participant is detected and tracked using a Viola/Jones face detection algorithm.
  - 3. A method for providing video conferencing as in claim 1 wherein the implicit object models provide an implicit representation of the face of the video conference participant.
  - 4. A method for providing video conferencing as in claim 3 wherein the implicit representation of the video conference participant is a simulated representation of the face of the video conference participant.
  - 5. A method for providing video conferencing as in claim 3 wherein the detecting and tracking comprise using a Viola/Jones face detection algorithm further includes the steps of:
    - identifying corresponding elements of at least one object associated with the face in two or more video frames from the video stream; and
      
      tracking and classifying the corresponding elements to identify relationships between the corresponding elements based on previously calibrated and modeled faces.
  - 6. A method for providing video conferencing as in claim 1 wherein the explicit object models include one or more object models for structure, deformation, pose, motion, illumination, and appearance.
  - 8. A method for providing video conferencing as in claim 1 wherein the implicit object models are configured using parameters obtained from the explicit object models, such that the explicit object model parameters are used as a ground truth for estimating portions of the video stream with the implicit object models.
  - 9. A method for providing video conferencing as in claim 8 wherein the explicit object model parameters are used to define expectations about how lighting interacts with the structure of the face of the video conference participant.
  - 10. A method for providing video conferencing as in claim 8 wherein the explicit object model parameters are used to limit a search space to the face or portions thereof for the implicit object modeling.
  - 12. A method for providing video conferencing as in claim 1 wherein periodically checking to determine whether the implicit object modeling is working optimally further includes determining that the implicit object models, which are used to create the photorealistic avatar representation, are working optimally by:
    - determining that reprojection error is low in the photorealistic avatar representation; and
      
      determining that there is a significant amount of motion in the photorealistic avatar representation.
  - 13. A method for providing video conferencing as in claim 1 wherein the determination that the implicit object modeling is working optimally causes subsequent instances of the photorealistic avatar representation of the conference participant to be created without relying on the step of detecting a human face in the portions of the video stream.
  - 14. A method for providing video conferencing as in claim 1 wherein determining that the implicit object model is not working optimally by:
    - determining that processing of the photorealistic avatar representation uses a disproportional amount of transmission bandwidth;
      
      or determining that the implicit object modeling is not working optimally if reprojection error is high.
  - 15. A method for providing video conferencing as in claim 1 further includes responding to the determination that the implicit object modeling is not working by processing the step of detecting a human face of a video conference participant;
    - and in response to detecting a human face, searching for existing calibration information for the detected human face.
  - 16. A method for providing video conferencing as in claim 15 wherein if a human face is undetectable, using a Viola-Jones face detector to facilitate detection.
  - 17. A method for providing video conferencing as in claim 1 wherein creating a photorealistic avatar representation of the video conference participant further includes enabling the video conference participant to adjust a gaze of their respective photorealistic avatar representation.
  - 18. A method for providing video conferencing as in claim 1 wherein the gaze adjustment enables configuration of the gaze of the photorealistic avatar representation causes eyes of the photorealistic avatar representation to focus directly in the direction of a video camera.
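
The control flow recited in claim 1 (detect, model, check periodically, fall back to detection and look up existing calibration) can be sketched as a small controller. This is a minimal illustration, not the patented implementation: the class `AvatarSession`, its callables `detect_face`, `calibrate`, `build_models`, and `is_working`, and the `check_every` interval are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class AvatarSession:
    """Hypothetical controller for the detect / model / check / fall-back cycle."""
    detect_face: Callable[[Any], Optional[str]]  # returns a face id, or None
    calibrate: Callable[[Any], Any]              # builds calibration for a new face
    build_models: Callable[[Any, Any], Any]      # explicit, then implicit modeling
    is_working: Callable[[Any, Any], bool]       # periodic health check of the model
    check_every: int = 30                        # frames between checks (assumed value)
    calibrations: Dict[str, Any] = field(default_factory=dict)
    model: Any = None
    frames_seen: int = 0
    detections: int = 0
    calibrations_reused: int = 0

    def _redetect(self, frame: Any) -> None:
        """Run face detection, then reuse stored calibration if one exists."""
        face_id = self.detect_face(frame)
        self.detections += 1
        if face_id is None:
            self.model = None
            return
        if face_id in self.calibrations:
            self.calibrations_reused += 1
        else:
            self.calibrations[face_id] = self.calibrate(frame)
        self.model = self.build_models(frame, self.calibrations[face_id])

    def process(self, frame: Any) -> Any:
        """Handle one frame; detection runs only at start-up or after a failed check."""
        self.frames_seen += 1
        if self.model is None:
            self._redetect(frame)
        elif self.frames_seen % self.check_every == 0:
            if not self.is_working(self.model, frame):
                self._redetect(frame)
        return self.model
```

While the health check keeps passing, `process` never calls the detector again, which mirrors the behavior described in claim 13.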

7. A video conferencing system comprising:
- a face detector configured to detect a face of a video conference participant in a video stream;
  
  a calibrator configured to generate a calibration model calibrating the face of the video conference participant;
  
  an explicit object modeler configured to generate one or more explicit object models, in combination with the calibrator and face detector, the explicit object models modeling portions of the video stream depicting the face of the video conference participant based on the calibration model;
  
  an implicit object modeler configured to build one or more implicit object models relative to parameters from the explicit object models to facilitate creation of a compact encoding of the participant's face;
  
  the system operable to generate a photorealistic avatar representation of the video conference participant from the implicit object model;
  
  the system operable to periodically check to determine whether the implicit object model is working optimally; and
  
  the system operable to respond to a determination that the implicit object model is not working by using the face detector to detect a face of a video conference participant and responding to the face detector detecting a face by searching for an existing calibration model for the detected face.

11. A method of video conferencing, the method comprising the steps of:
- detecting a human face of a video conference participant depicted in portions of a video stream;
  
  creating by explicitly modeling one or more explicit object models to model the face of the video conference participant;
  
  generating one or more implicit object models relative to parameters obtained from the explicit object models to facilitate creation of a compact encoding of the video conference participant's face;
  
  using the implicit object model, creating a photorealistic avatar representation of the video conference participant;
  
  periodically checking to determine whether the implicit object model is working optimally, where the determination that the implicit object modeling is working optimally causes subsequent instances of the photorealistic avatar representation of the conference participant to be created without relying on the step of detecting a human face in the portions of the video stream.
- Dependent claims (20)
  - 20. A method for providing video conferencing as in claim 11 further includes periodically checking to determine whether the implicit object modeling is working optimally.

19. A method of video conferencing, the method comprising the steps of:
- generating explicit object models to model a human face of a video conference participant depicted in portions of a video stream;
  
  using parameters from the explicit object models, generating one or more implicit object models to create a photorealistic avatar representation of the video conference participant, where the explicit object model parameters are used to define expectations for the implicit object model regarding how lighting interacts with a structure of the face of the video conference participant;
  
  periodically checking to determine whether the implicit object model is working optimally; and
  
  responding to a determination that the implicit object model is not working by detecting a face of a video conference participant and responding to the detection by searching for existing calibration information for the detected face.

21. A video conferencing system comprising:
- a face detector configured to detect a face of a video conference participant in a video stream;
  
  a calibrator configured to generate a calibration model calibrating the face of the video conference participant;
  
  an explicit object modeler configured to generate one or more explicit object models, in combination with the calibrator and face detector, the explicit object models modeling portions of the video stream depicting the face of the video conference participant based on the calibration model;
  
  an implicit object modeler configured to build one or more implicit object models relative to parameters from the explicit object models to facilitate creation of a compact encoding of the participant's face;
  
  the system operable to generate a photorealistic avatar representation of the video conference participant from the implicit object model; and
  
  the system operable to periodically check to determine whether the implicit object model is working optimally, where the determination that the implicit object modeling is working optimally causes subsequent instances of the photorealistic avatar representation of the conference participant to be created without relying on the step of detecting a human face in the portions of the video stream.

22. A method of video conferencing, the method comprising the steps of:
- detecting a human face of a video conference participant depicted in portions of a video stream;
  
  creating by explicitly modeling one or more explicit object models to model the face of the video conference participant;
  
  generating one or more implicit object models relative to parameters obtained from the explicit object models to facilitate creation of a compact encoding of the video conference participant's face;
  
  using the implicit object model, creating a photorealistic avatar representation of the video conference participant; and
  
  periodically checking to determine whether the implicit object modeling is working optimally including determining that the implicit object models, which are used to create the photorealistic avatar representation, are working optimally by;
  
  determining that reprojection error is low in the photorealistic avatar representation; and
  
  determining that there is a significant amount of motion in the photorealistic avatar representation.
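
The two conditions in claim 22 (low reprojection error together with a significant amount of motion) can be illustrated with a simple check. The sketch below assumes that reprojection error means the mean squared difference between a source frame and the avatar frame rendered from the model, and that motion means the mean absolute change between consecutive avatar frames. Neither definition is given in the claims, and the thresholds `max_error` and `min_motion` are hypothetical values.

```python
from typing import Sequence

Frame = Sequence[float]  # a frame flattened to grayscale intensities (assumption)


def mean_squared_error(a: Frame, b: Frame) -> float:
    """Mean squared per-pixel difference between two equally sized frames."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def mean_abs_change(prev: Frame, curr: Frame) -> float:
    """Mean absolute per-pixel change between two equally sized frames."""
    if len(prev) != len(curr) or len(prev) == 0:
        raise ValueError("frames must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(prev, curr)) / len(prev)


def working_optimally(source: Frame, avatar: Frame, prev_avatar: Frame,
                      max_error: float = 25.0, min_motion: float = 2.0) -> bool:
    """True only when the error is low AND the avatar shows significant motion."""
    low_error = mean_squared_error(source, avatar) <= max_error
    significant_motion = mean_abs_change(prev_avatar, avatar) >= min_motion
    return low_error and significant_motion
```

Requiring motion as well as low error guards against a trivially low error on a static scene, where the check would say little about how well the model tracks the face.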

23. A video conferencing system comprising:
- a face detector configured to detect a face of a video conference participant in a video stream;
  
  a calibrator configured to generate a calibration model calibrating the face of the video conference participant;
  
  an explicit object modeler configured to generate one or more explicit object models, in combination with the calibrator and face detector, the explicit object models modeling portions of the video stream depicting the face of the video conference participant based on the calibration model;
  
  an implicit object modeler configured to build one or more implicit object models relative to parameters from the explicit object models to facilitate creation of a compact encoding of the participant's face;
  
  the system operable to generate a photorealistic avatar representation of the video conference participant from the implicit object model; and
  
  the system operable to periodically check to determine whether the implicit object modeling is working optimally such that the system is operable to determine that the implicit object models, which are used to create the photorealistic avatar representation, are working optimally by;
  
  determining that reprojection error is low in the photorealistic avatar representation; and
  
  determining that there is a significant amount of motion in the photorealistic avatar representation.

24. A method of video conferencing, the method comprising the steps of:
- detecting a human face of a video conference participant depicted in portions of a video stream;
  
  creating by explicitly modeling one or more explicit object models to model the face of the video conference participant;
  
  generating one or more implicit object models relative to parameters obtained from the explicit object models to facilitate creation of a compact encoding of the video conference participant's face;
  
  using the implicit object model, creating a photorealistic avatar representation of the video conference participant; and
  
  periodically checking to determine whether the implicit object modeling is working optimally including determining that the implicit object model is not working optimally by;
  
  determining that processing of the photorealistic avatar representation uses a disproportional amount of transmission bandwidth;
  
  or determining that the implicit object modeling is not working optimally if reprojection error is high.
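
Claim 24 names two alternative failure signals: a disproportional amount of transmission bandwidth, or high reprojection error. One way to read this is sketched below. The claims do not define "disproportional", so treating it as the avatar stream's share of a total bit budget is an assumption, and the function name and the thresholds `max_share` and `max_error` are hypothetical.

```python
from typing import List


def failure_reasons(avatar_bits: int, total_bits: int, reprojection_error: float,
                    max_share: float = 0.5, max_error: float = 25.0) -> List[str]:
    """Return the reasons the model is judged not to be working (empty if none)."""
    if total_bits <= 0:
        raise ValueError("total_bits must be positive")
    reasons = []
    # Either condition alone is enough, matching the "or" in the claim.
    if avatar_bits / total_bits > max_share:
        reasons.append("disproportionate bandwidth")
    if reprojection_error > max_error:
        reasons.append("high reprojection error")
    return reasons


def not_working_optimally(avatar_bits: int, total_bits: int,
                          reprojection_error: float) -> bool:
    """True when at least one failure signal is present."""
    return bool(failure_reasons(avatar_bits, total_bits, reprojection_error))
```

Returning the reasons rather than a bare boolean makes it easier to log why a fall-back to face detection was triggered.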

25. A video conferencing system comprising:
- a face detector configured to detect a face of a video conference participant in a video stream;
  
  a calibrator configured to generate a calibration model calibrating the face of the video conference participant;
  
  an explicit object modeler configured to generate one or more explicit object models, in combination with the calibrator and face detector, the explicit object models modeling portions of the video stream depicting the face of the video conference participant based on the calibration model;
  
  an implicit object modeler configured to build one or more implicit object models relative to parameters from the explicit object models to facilitate creation of a compact encoding of the participant's face;
  
  the system operable to generate a photorealistic avatar representation of the video conference participant from the implicit object model; and
  
  the system operable to periodically check to determine whether the implicit object modeling is working optimally including determining that the implicit object model is not working optimally by;
  
  determining that processing of the photorealistic avatar representation uses a disproportional amount of transmission bandwidth;
  
  or determining that the implicit object modeling is not working optimally if reprojection error is high.

26. A method of video conferencing, the method comprising the steps of:
- detecting a human face of a video conference participant depicted in portions of a video stream;
  
  creating by explicitly modeling one or more explicit object models to model the face of the video conference participant;
  
  generating one or more implicit object models relative to parameters obtained from the explicit object models to facilitate creation of a compact encoding of the video conference participant's face; and
  
  using the implicit object model, creating a photorealistic avatar representation of the video conference participant;
  
  wherein creating a photorealistic avatar representation of the video conference participant includes enabling the video conference participant to adjust a gaze of their respective photorealistic avatar representation.

27. A video conferencing system comprising:
- a face detector configured to detect a face of a video conference participant in a video stream;
  
  a calibrator configured to generate a calibration model calibrating the face of the video conference participant;
  
  an explicit object modeler configured to generate one or more explicit object models, in combination with the calibrator and face detector, the explicit object models modeling portions of the video stream depicting the face of the video conference participant based on the calibration model;
  
  an implicit object modeler configured to build one or more implicit object models relative to parameters from the explicit object models to facilitate creation of a compact encoding of the participant's face; and
  
  the system operable to generate a photorealistic avatar representation of the video conference participant from the implicit object model;
  
  wherein creating a photorealistic avatar representation of the video conference participant includes enabling the video conference participant to adjust a gaze of their respective photorealistic avatar representation.

28. A method of video conferencing, the method comprising the steps of:
- detecting a human face of a video conference participant depicted in portions of a video stream;
  
  creating by explicitly modeling one or more explicit object models to model the face of the video conference participant;
  
  generating one or more implicit object models relative to parameters obtained from the explicit object models to facilitate creation of a compact encoding of the video conference participant's face; and
  
  using the implicit object model, creating a photorealistic avatar representation of the video conference participant;
  
  wherein the implicit object models are configured using parameters obtained from the explicit object models, such that the explicit object model parameters are used as a ground truth for estimating portions of the video stream with the implicit object models;
  
  wherein the explicit object model parameters are used to define expectations about how lighting interacts with the structure of the face of the video conference participant.

29. A video conferencing system comprising:
- a face detector configured to detect a face of a video conference participant in a video stream;
  
  a calibrator configured to generate a calibration model calibrating the face of the video conference participant;
  
  an explicit object modeler configured to generate one or more explicit object models, in combination with the calibrator and face detector, the explicit object models modeling portions of the video stream depicting the face of the video conference participant based on the calibration model;
  
  an implicit object modeler configured to build one or more implicit object models relative to parameters from the explicit object models to facilitate creation of a compact encoding of the participant's face; and
  
  the system operable to generate a photorealistic avatar representation of the video conference participant from the implicit object model;
  
  wherein the implicit object model is configured using parameters obtained from the explicit object models, such that the explicit object model parameters are used as a ground truth for estimating portions of the video stream with the implicit object model;
  
  wherein the explicit object model parameters are used to define expectations about how lighting interacts with the structure of the face of the video conference participant.

30. A method of video conferencing, the method comprising the steps of:
- detecting a human face of a video conference participant depicted in portions of a video stream;
  
  creating by explicitly modeling one or more explicit object models to model the face of the video conference participant;
  
  generating one or more implicit object models relative to parameters obtained from the explicit object models to facilitate creation of a compact encoding of the video conference participant's face; and
  
  using the implicit object model, creating a photorealistic avatar representation of the video conference participant;
  
  wherein the step of detecting a face includes using a Viola/Jones face detector which is operable to;
  
  identify corresponding elements of at least one object associated with the face in two or more video frames from the video stream; and
  
  track and classify the corresponding elements to identify relationships between the corresponding elements based on previously calibrated and modeled faces.
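
The claims name the Viola/Jones detector without describing its internals. As background, that detector evaluates Haar-like rectangle features in constant time using an integral image (summed-area table). The sketch below shows only that building block, with hypothetical function names. The boosted cascade of classifiers that a full detector needs is not shown, and nothing here is taken from the patent's own implementation.

```python
from typing import List


def integral_image(img: List[List[int]]) -> List[List[int]]:
    """Summed-area table with an extra zero row and column at the top-left."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: List[List[int]], top: int, left: int,
             height: int, width: int) -> int:
    """Sum of pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii: List[List[int]], top: int, left: int,
                     height: int, width: int) -> int:
    """Haar-like feature: left half minus right half (width should be even)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum
```

Because every rectangle sum costs four lookups regardless of its size, a detector can evaluate many such features per window at many scales, which is what makes per-frame detection practical.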

31. A video conferencing system comprising:
- a face detector configured to detect a face of a video conference participant in a video stream;
  
  a calibrator configured to generate a calibration model calibrating the face of the video conference participant;
  
  an explicit object modeler configured to generate one or more explicit object models, in combination with the calibrator and face detector, the explicit object models modeling portions of the video stream depicting the face of the video conference participant based on the calibration model;
  
  an implicit object modeler configured to build one or more implicit object models relative to parameters from the explicit object models to facilitate creation of a compact encoding of the participant's face; and
  
  the system operable to generate a photorealistic avatar representation of the video conference participant from the implicit object model;
  
  wherein the implicit object model provides an implicit representation of the face of the video conference participant;
  
  wherein the face detector is configured to use a Viola/Jones face detector which is operable to;
  
  identify corresponding elements of at least one object associated with the face in two or more video frames from the video stream; and
  
  track and classify the corresponding elements to identify relationships between the corresponding elements based on previously calibrated and modeled faces.

Current Assignee: Euclid Discoveries LLC
Original Assignee: Euclid Discoveries LLC
Inventors: Pace, Charles P.
Primary Examiner(s): Kuntz, Curtis
Assistant Examiner(s): El-Zoobi, Maria
Application Number: US12/522,324
Publication Number: US 20100073458A1
Time in Patent Office: 1,684 Days
Field of Search: 348/14.01-14.08, 348/14.09, 348/557, 382/115, 382/254
US Class Current: 348/14.01
CPC Class Codes

G06V 10/7557   based on appearance, e.g. a...

G06V 40/167   using comparisons between t...

H04N 21/23412   for generating or manipulat...

H04N 21/4223   Cameras H04N23/00 takes pre...

H04N 21/44012   involving rendering scenes ...

H04N 21/4415   using biometric characteris...

H04N 21/4532   involving end-user characte...

H04N 21/4788   communicating with other us...

H04N 7/147   Communication arrangements,...

H04N 7/157   defining a virtual conferen...
