Immersive remote conferencing
Abstract
The subject disclosure is directed towards an immersive conference, in which participants in separate locations are brought together into a common virtual environment (scene), such that they appear to each other to be in a common space, with geometry, appearance, and real-time natural interaction (e.g., gestures) preserved. In one aspect, depth data and video data are processed to place remote participants in the common scene from the first-person point of view of a local participant. Sound data may be spatially controlled, and parallax computed to provide a realistic experience. The scene may be augmented with various data, videos, and other effects/animations.
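The abstract's processing of depth and video data to place participants in a shared scene can be illustrated with a standard pinhole-camera back-projection. This is a minimal sketch, not the patent's implementation: the function names, the intrinsics parameters (`fx`, `fy`, `cx`, `cy`), and the rigid-transform step are assumptions about how such a pipeline is commonly built.

```python
def backproject(depth_m, u, v, fx, fy, cx, cy):
    """Back-project one depth pixel (u, v) into a 3-D point in the
    capturing camera's coordinate frame (pinhole model).

    depth_m: depth at pixel (u, v) in meters.
    fx, fy:  focal lengths in pixels; cx, cy: principal point.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

def to_common_scene(point_cam, rotation, translation):
    """Rigidly transform a camera-space point into the shared scene's
    coordinate frame (rotation: 3x3 row-major nested list, translation: 3-tuple)."""
    return tuple(
        sum(rotation[i][j] * point_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

# A pixel at the principal point maps straight down the optical axis:
p = backproject(2.0, 320, 240, 500.0, 500.0, 320.0, 240.0)  # (0.0, 0.0, 2.0)
```

Each remote participant's colored point cloud, produced this way from their depth and video streams, could then be placed at a chosen seat position in the common scene via `to_common_scene`.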
19 Claims
1. A system comprising:

a service configured to receive video information and depth information corresponding to data captured by camera mechanisms of remote participants;

a view generator coupled to the service, the view generator configured to process data corresponding to the video information and depth information to place visible representations of remote participants into a common scene, wherein the common scene is rendered via a first person point of view;

a tracker using position tracking data to re-render the common scene to compensate for parallax as a user viewing the scene moves among different viewing angles; and

an audio output controller to provide spatial audio based upon the position of the user, or based upon a position of a visible representation of a remote participant placed in the common scene, or both.

- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
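The tracker element recited above re-renders the scene to compensate for parallax as the viewer moves. A hedged sketch of the underlying geometry, assuming a flat display at a fixed distance from a neutral head position (the function names and the screen-distance parameter are illustrative, not from the patent): a point rendered behind the screen plane appears to shift by less than the head's lateral movement, and the deficit grows as the point approaches the screen plane.

```python
def head_offset(head_pos, neutral_pos):
    """Lateral offset of the tracked head from its neutral position;
    the virtual camera is translated by this vector before re-rendering."""
    return tuple(h - n for h, n in zip(head_pos, neutral_pos))

def apparent_shift(point_depth_m, head_shift_m, screen_dist_m=0.6):
    """On-screen displacement of a point at depth point_depth_m (measured
    from the viewer) when the head moves laterally by head_shift_m.

    Points on the screen plane (depth == screen_dist_m) do not move;
    distant points shift by nearly the full head movement, which is what
    produces the motion-parallax cue.
    """
    return head_shift_m * (1.0 - screen_dist_m / point_depth_m)
```

Re-rendering the common scene from the offset camera each frame reproduces this depth-dependent shift for every point, which is the effect the claimed tracker compensates for.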
12. In a computing environment, a method performed at least in part on at least one processor, comprising:

receiving a plurality of sets of data, each set of data corresponding to video and depth data associated with a remote participant;

generating a photo-realistic representation of each remote participant based upon the video and depth data associated with that remote participant;

rendering a common scene via a first-person point of view with the photo-realistic representations of the remote participants placed into the common scene;

using position tracking data to re-render the common scene to compensate for parallax as a user viewing the scene moves among different viewing angles; and

controlling audio output to provide spatial audio based upon the position of the user, or based upon a position of a visible representation of a remote participant placed in the common scene, or both.

- View Dependent Claims (13, 14, 15, 16, 17)
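The spatial-audio step above ties each participant's voice to where their representation sits in the scene. One common way to realize this for stereo output is constant-power panning by azimuth; this is a generic sketch of that technique under assumed conventions (azimuth measured from straight ahead, positive to the right), not the audio output controller described in the specification.

```python
import math

def pan_gains(azimuth_rad):
    """Constant-power stereo gains (left, right) for a source at the
    given azimuth: -pi/2 = hard left, 0 = center, +pi/2 = hard right.

    Constant power means left**2 + right**2 == 1 at every angle, so the
    perceived loudness stays steady as a participant's position changes.
    """
    theta = (azimuth_rad + math.pi / 2) / 2.0  # map [-pi/2, pi/2] -> [0, pi/2]
    return math.cos(theta), math.sin(theta)

def distance_gain(dist_m, ref_m=1.0):
    """Simple inverse-distance attenuation, clamped at the reference
    distance so nearby participants are not amplified without bound."""
    return ref_m / max(dist_m, ref_m)
```

Multiplying a participant's mono voice stream by `distance_gain(...)` and then by the two `pan_gains(...)` values yields a stereo signal that appears to come from that participant's placement in the common scene; the same gains can be recomputed as the tracked user moves.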
18. One or more computer-readable media having computer-executable instructions, which when executed perform steps, comprising:

receiving a plurality of sets of data, each set of data corresponding to video and depth data associated with a remote participant;

generating a photo-realistic representation of each participant based upon the video and depth data associated with that remote participant;

rendering a common scene via a first-person point of view with the photo-realistic representations of the remote participants placed in the common scene;

using position tracking data to re-render the common scene to compensate for parallax as a user viewing the scene moves among different viewing angles; and

controlling audio output to provide spatial audio based upon the position of the user, or based upon a position of a visible representation of a remote participant placed in the common scene, or both.

- View Dependent Claims (19)
Specification