Synthesis of information from multiple audiovisual sources
First Claim
1. A method for synthesizing information for a scene from multiple sources, wherein the sources are capture devices, comprising:
a) receiving scene information from a first source and a second source, the first and second sources spatially separated from each other and the scene;
b) determining a position for each of the first and second sources from the scene information and one or more cues detected in common from the scene by the first and second sources;
c) creating a representation of the scene based on the positions of the first and second sources determined in said step b) and the scene information received from the first and second sources.
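Step b) of the claim — recovering each source's position from cues the two devices detect in common — can be illustrated with a standard rigid-alignment sketch. Assuming each device yields 3D coordinates for the same set of cue points in its own local frame, the Kabsch algorithm recovers the relative rotation and translation between the devices. The function name and conventions below are illustrative, not taken from the patent:

```python
import numpy as np

def relative_pose(points_a, points_b):
    """Estimate the rigid transform (R, t) mapping cue points seen by
    capture device B into device A's frame, via the Kabsch algorithm.
    points_a, points_b: (N, 3) arrays of the SAME physical cue points,
    expressed in each device's local coordinates, in matching order."""
    ca, cb = points_a.mean(axis=0), points_b.mean(axis=0)
    # Cross-covariance of the centered correspondences.
    H = (points_b - cb).T @ (points_a - ca)
    U, _, Vt = np.linalg.svd(H)
    # Sign correction guards against a reflection solution.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = ca - R @ cb
    return R, t
```

Given such a pose for each additional device relative to a reference device, every source's position is known in one common frame, which is what step c) needs to assemble the scene representation.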
Abstract
A system and method are disclosed for synthesizing information received from multiple audio and visual sources focused on a single scene. The system may determine the positions of capture devices based on a common set of cues identified in the image data of the capture devices. As a scene may often have users and objects moving into and out of the scene, data from the multiple capture devices may be time synchronized to ensure that data from the audio and visual sources are providing data of the same scene at the same time. Audio and/or visual data from the multiple sources may be reconciled and assimilated together to improve an ability of the system to interpret audio and/or visual aspects from the scene.
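The time synchronization the abstract describes — ensuring that data from the multiple capture devices depicts the same scene at the same time — can be sketched as nearest-timestamp pairing of the two frame streams. This assumes both devices already report timestamps on a common timebase; the function name and the 16 ms tolerance are illustrative assumptions, not details from the patent:

```python
def pair_frames(frames_a, frames_b, tolerance_ms=16):
    """Pair frames from two capture devices, matching each frame from
    device A with the nearest frame from device B within tolerance_ms.
    frames_a, frames_b: lists of (timestamp_ms, payload) tuples, each
    sorted by timestamp. Returns a list of (payload_a, payload_b) pairs."""
    pairs, j = [], 0
    for ts_a, data_a in frames_a:
        # Advance j while the next B frame is at least as close to ts_a.
        while j + 1 < len(frames_b) and \
                abs(frames_b[j + 1][0] - ts_a) <= abs(frames_b[j][0] - ts_a):
            j += 1
        ts_b, data_b = frames_b[j]
        if abs(ts_b - ts_a) <= tolerance_ms:
            pairs.append((data_a, data_b))
    return pairs
```

A frames with no B frame inside the tolerance are simply dropped, so only data captured at effectively the same instant is reconciled downstream.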
59 Citations
20 Claims
1. A method for synthesizing information for a scene from multiple sources, wherein the sources are capture devices, comprising:
a) receiving scene information from a first source and a second source, the first and second sources spatially separated from each other and the scene;
b) determining a position for each of the first and second sources from the scene information and one or more cues detected in common from the scene by the first and second sources;
c) creating a representation of the scene based on the positions of the first and second sources determined in said step b) and the scene information received from the first and second sources.
Dependent claims: 2-9.
10. A method for synthesizing information for a scene from multiple sources, wherein the sources are capture devices, comprising:
a) receiving scene information from a first source and a second source, an initial position of the first source being unknown with respect to the second source, the first and second sources spatially separated from each other and the scene, the scene information including at least one of image depth data and RGB data;
b) determining a position for each of the first and second sources from at least one of the image depth data and RGB data, together with the scene information shared in common from the scene by the first and second sources; and
c) creating a representation of the scene based on the positions of the first and second sources determined in said step b) and the scene information received from the first and second sources.
Dependent claims: 11-14.
15. A method for synthesizing information for a play space in a gaming application from multiple capture devices, capture devices in the multiple capture devices including a depth camera, an RGB camera and at least one microphone, comprising:
a) receiving image depth data and RGB data from a first capture device and a second capture device, the image depth data and the RGB data from the first and second capture devices being time synchronized together, the first and second capture devices spatially separated from each other and the play space;
b) determining a position and orientation for each of the first and second capture devices from a combination of the synchronized image depth data and RGB data, together with a plurality of cues detected in common from the play space by the first and second capture devices;
c) creating a representation of the play space based on the positions of the first and second capture devices determined in said step b) and the image depth data and RGB data received from the first and second capture devices;
d) stitching together a first portion of the play space representation from the first capture device with a second portion of the play space representation from the second capture device; and
e) rendering the representation of the play space on a display associated with the first and second capture devices.
Dependent claims: 16-20.
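The stitching step in claim 15 — merging each device's partial view of the play space into one representation — can be sketched as transforming one device's point cloud into the other's frame using the relative pose from the position-determination step, then thinning duplicate points in the overlap with a voxel grid. This is an illustrative sketch under those assumptions, not the patent's actual method; the voxel size is an arbitrary example value:

```python
import numpy as np

def stitch_clouds(cloud_a, cloud_b, R, t, voxel=0.01):
    """Stitch device B's partial point cloud into device A's frame using
    the relative pose (R, t) with p_a = R @ p_b + t, then thin duplicate
    points in the overlap by keeping one point per voxel.
    cloud_a, cloud_b: (N, 3) arrays; R: (3, 3); t: (3,)."""
    merged = np.vstack([cloud_a, cloud_b @ R.T + t])  # row-vector form of R @ p + t
    # Quantize to a voxel grid and keep the first point in each cell.
    keys = np.floor(merged / voxel).astype(np.int64)
    _, keep = np.unique(keys, axis=0, return_index=True)
    return merged[np.sort(keep)]
```

The stitched cloud is then what step e) would hand to the renderer for display.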
Specification