Speaker detection and tracking using audiovisual data
Abstract
A system and method facilitating object tracking is provided. The invention includes an audio model that receives at least two audio input signals and a video model that receives a video input. The audio model and the video model employ probabilistic generative models which are combined to facilitate object tracking. Expectation maximization can be employed to modify trainable parameters of the audio model and the video model.
26 Claims
1. An object tracker system, comprising:
an audio model that models an original audio signal of an object, a time delay between at least two audio input signals and a variability component of the original audio signal, the audio model employing a probabilistic generative model;
a video model that models a location of the object, an original image of the object and a variability component of the original image, the video model employing a probabilistic generative model, the video model receiving a video input; and
an audio video tracker that models the location of the object based, at least in part, upon the audio model and the video model, the audio video tracker providing an output associated with the location of the object.
(Dependent claims: 2-18)
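As an illustration of the "time delay between at least two audio input signals" element, the following is a minimal sketch of inferring a delay under a toy generative model. It is not the patent's model: it assumes the original signal is white Gaussian noise, that the second microphone receives a circularly shifted copy, and that both microphones add independent Gaussian noise of equal variance. Under those assumptions, with a uniform prior, the log posterior over the delay is proportional to the cross-correlation of the two inputs. The function name `delay_posterior` and all parameter values are hypothetical.

```python
import math
import random


def delay_posterior(x1, x2, max_delay, signal_var=1.0, noise_var=0.1):
    """Posterior over integer delays in [-max_delay, max_delay], uniform prior.

    Toy assumptions (not taken from the patent): x1[n] = a[n] + noise,
    x2[n] = a[n - tau] + noise, a is white Gaussian, shifts are circular.
    """
    n = len(x1)
    # Each aligned pair (x1[i], x2[i + tau]) is jointly Gaussian with
    # covariance [[s + v, s], [s, s + v]]; only the cross term depends on tau.
    det = (signal_var + noise_var) ** 2 - signal_var ** 2
    scale = signal_var / det
    delays = list(range(-max_delay, max_delay + 1))
    log_post = [
        scale * sum(x1[i] * x2[(i + tau) % n] for i in range(n))
        for tau in delays
    ]
    # Normalize in a numerically stable way (subtract the peak before exp).
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]
    total = sum(weights)
    return delays, [w / total for w in weights]


# Synthetic check: the second input lags the first by three samples.
rng = random.Random(0)
a = [rng.gauss(0.0, 1.0) for _ in range(2000)]
true_delay = 3
x1 = [s + rng.gauss(0.0, 0.3) for s in a]
x2 = [a[(i - true_delay) % len(a)] + rng.gauss(0.0, 0.3) for i in range(len(a))]

delays, post = delay_posterior(x1, x2, max_delay=8, noise_var=0.09)
best_delay = delays[post.index(max(post))]
```

In this synthetic setup the posterior should concentrate on the true delay. A real tracker would also have to handle non-white signals, attenuation, and non-circular shifts, none of which this sketch addresses.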
19. A method for object tracking, comprising:
updating a posterior distribution over unobserved variables of an audio model and a video model;
updating trainable parameters of the audio model and the video model; and
providing an output associated with a location of an object.
(Dependent claims: 20, 21)
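The three steps of the method claim follow the shape of an expectation maximization loop, which the abstract names as one way to fit the models. The sketch below shows that shape on a deliberately simple stand-in, not on the patent's audio and video models. It assumes one-dimensional position observations pooled from both modalities, each with a known noise standard deviation, plus uniformly distributed clutter. The unobserved variable is whether each observation came from the object; the trainable parameters are the object location `mu` and the object fraction `pi`. All names and numbers are hypothetical.

```python
import math
import random


def gauss_pdf(y, mean, std):
    return math.exp(-0.5 * ((y - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def em_track(observations, width, iterations=30):
    """observations: list of (position, std) pairs from both modalities."""
    mu = width / 2.0  # trainable: object location
    pi = 0.5          # trainable: fraction of observations from the object
    for _ in range(iterations):
        # E-step: posterior probability that each observation came from
        # the object rather than from uniform clutter over [0, width].
        resp = []
        for y, std in observations:
            obj = pi * gauss_pdf(y, mu, std)
            clutter = (1.0 - pi) / width
            resp.append(obj / (obj + clutter))
        # M-step: precision-weighted mean for the location, mean
        # responsibility for the object fraction.
        precision = [r / std ** 2 for r, (_, std) in zip(resp, observations)]
        mu = sum(p * y for p, (y, _) in zip(precision, observations)) / sum(precision)
        pi = sum(resp) / len(resp)
    # Output associated with the location of the object.
    return mu, pi


rng = random.Random(1)
true_location, width = 200.0, 320.0
observations = (
    [(rng.gauss(true_location, 5.0), 5.0) for _ in range(40)]      # video-like, precise
    + [(rng.gauss(true_location, 20.0), 20.0) for _ in range(40)]  # audio-like, coarse
    + [(rng.uniform(0.0, width), 20.0) for _ in range(20)]         # clutter
)
mu, pi = em_track(observations, width)
```

The coarse observations pull the estimate toward the object during the first iterations, after which the precise observations dominate the weighted mean. This division of labor is only a property of the toy model, though it mirrors the general motivation for combining a coarse cue with a precise one.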
22. A data packet transmitted between two or more computer components that facilitates object tracking, the data packet comprising:
a first data field comprising information associated with a horizontal location of an object; and
a second data field comprising information associated with a vertical location of the object, the horizontal location and the vertical location being based, at least in part, upon an object tracker system receiving at least two audio signal inputs and a video input signal.
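The data packet claim names only two fields and does not specify an encoding. As one possible illustration, the sketch below serializes a horizontal and a vertical location as two little-endian 32-bit floats. The class name `LocationPacket`, the field names, and the byte layout are all assumptions made for this example.

```python
import struct
from dataclasses import dataclass

# Hypothetical wire layout: two little-endian float32 values, 8 bytes total.
_LAYOUT = struct.Struct("<ff")


@dataclass(frozen=True)
class LocationPacket:
    horizontal: float  # first data field: horizontal location of the object
    vertical: float    # second data field: vertical location of the object

    def to_bytes(self) -> bytes:
        return _LAYOUT.pack(self.horizontal, self.vertical)

    @classmethod
    def from_bytes(cls, payload: bytes) -> "LocationPacket":
        horizontal, vertical = _LAYOUT.unpack(payload)
        return cls(horizontal, vertical)


# Round trip with values that are exactly representable in float32.
packet = LocationPacket(horizontal=120.5, vertical=64.0)
decoded = LocationPacket.from_bytes(packet.to_bytes())
```

A fixed-size binary layout keeps the example short; a real system could just as well carry the same two fields in any other format.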
23. A computer readable medium storing computer executable components of an object tracker system, comprising:
an audio model component that models an original audio signal of an object, a time delay between at least two audio input signals and a variability component of the original audio signal, the audio model employing a probabilistic generative model;
a video model component that models a location of the object, an original image of the object and a variability component of the original image, the video model employing a probabilistic generative model, the video model receiving a video input; and
an audio video tracker component that models the location of the object based, at least in part, upon the audio model and the video model, the audio video tracker providing an output associated with the location of the object.
24. An object tracker system, comprising:
means for modeling audio that models an original audio signal of an object, a time delay between at least two audio input signals and a variability component of the original audio signal, the means for modeling audio employing a probabilistic generative model;
means for modeling video that models a location of the object, an original image of the object and a variability component of the original image, the means for modeling video employing a probabilistic generative model; and
means for tracking the location of the object based, at least in part, upon the means for modeling audio and the means for modeling video, the means for tracking the location of the object providing an output associated with the location of the object.
25. An object tracker system, comprising:
an audio model that models an original audio signal of an object, a time delay between at least two audio input signals and a variability component of the original audio signal, the audio model employing a probabilistic generative model;
a video model that models a location of the object, an original image of the object, a variability component of the original image and a background image, the video model employing a probabilistic generative model, the video model receiving a video input; and
an audio video tracker that models the location of the object based, at least in part, upon the audio model and the video model, the audio video tracker providing an output associated with the location of the object.
26. An object tracker system, comprising:
an audio model that models an original audio signal of an object, a time delay between at least two audio input signals, a variability component of the original audio signal and a previous original audio signal of the object, the audio model employing a probabilistic generative model;
a video model that models a location of the object, an original image of the object and a variability component of the original image, the video model employing a probabilistic generative model, the video model receiving a video input; and
an audio video tracker that models the location of the object based, at least in part, upon the audio model, the video model and a previous location of the object, the audio video tracker providing an output associated with the location of the object.
Specification