Speaker detection and tracking using audiovisual data

  • US 6,940,540 B2
  • Filed: 06/27/2002
  • Issued: 09/06/2005
  • Est. Priority Date: 06/27/2002
  • Status: Active Grant
First Claim

1. An object tracker system, comprising:

  • an audio model that models an original audio signal of an object, a time delay between at least two audio input signals and a variability component of the original audio signal, the audio model employing a probabilistic generative model, and employing, at least in part, the following equations:

    p(r) = π_r,
    p(a | r) = N(a | 0, η_r),
    p(x_1 | a) = N(x_1 | λ_1 a, ν_1),
    p(x_2 | a) = N(x_2 | λ_2 L_τ a, ν_2),

    where r is variability component of the original audio signal, π is a prior probability parameter of r, a is the original audio signal of the object, x_1 is a first audio input signal, x_2 is a second audio input signal, τ is the time delay between x_1 and x_2, λ_1 is an attenuation parameter associated with x_1, λ_2 is an attenuation parameter associated with x_2, η_r is a precision matrix parameter associated with r, ν_1 is a precision matrix parameter associated with additive noise of x_1, ν_2 is a precision matrix parameter associated with additive noise of x_2, L_τ denotes a temporal shift operator;

  • a video model that models a location of the object, an original image of the object and a variability component of the original image, the video model employing a probabilistic generative model, the video model receiving a video input; and

  • an audio video tracker that models the location of the object based, at least in part, upon the audio model and the video model, the audio video tracker providing an output associated with the location of the object.
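
The audio model recited in the claim can be read as a recipe for generating the two microphone signals. The following Python sketch illustrates that reading only and is not the patented implementation. It assumes scalar (isotropic) precisions in place of full precision matrices and treats the temporal shift operator as a circular shift of a whole number of samples. The function names `sample_audio_model` and `estimate_delay` are hypothetical, and the cross-correlation search is a simple stand-in for delay inference, not the inference procedure of the patent.

```python
import numpy as np

def sample_audio_model(pi, eta, lam1, lam2, nu1, nu2, tau, n, rng):
    """Draw (r, a, x1, x2) from the generative audio model of the claim.

    Illustrative assumptions: every precision is a scalar, and the
    temporal shift operator is a circular shift by tau samples.
    """
    # p(r) = pi_r: choose the variability component.
    r = int(rng.choice(len(pi), p=pi))
    # p(a | r) = N(a | 0, eta_r): precision eta_r means variance 1 / eta_r.
    a = rng.normal(0.0, 1.0 / np.sqrt(eta[r]), n)
    # p(x1 | a) = N(x1 | lam1 * a, nu1): attenuated signal plus noise.
    x1 = lam1 * a + rng.normal(0.0, 1.0 / np.sqrt(nu1), n)
    # p(x2 | a) = N(x2 | lam2 * shift(a, tau), nu2): delayed, attenuated, noisy.
    x2 = lam2 * np.roll(a, tau) + rng.normal(0.0, 1.0 / np.sqrt(nu2), n)
    return r, a, x1, x2

def estimate_delay(x1, x2, max_tau):
    """Return the circular shift of x1 that best correlates with x2."""
    taus = range(-max_tau, max_tau + 1)
    scores = [float(np.dot(np.roll(x1, t), x2)) for t in taus]
    return taus[int(np.argmax(scores))]

rng = np.random.default_rng(0)
r, a, x1, x2 = sample_audio_model(
    pi=[0.5, 0.5], eta=[1.0, 4.0], lam1=1.0, lam2=0.8,
    nu1=100.0, nu2=100.0, tau=5, n=512, rng=rng,
)
tau_hat = estimate_delay(x1, x2, max_tau=20)
```

With low noise (high precisions `nu1` and `nu2`), the delay recovered from the two sampled signals should match the `tau` used to generate them. In a tracker of the kind claimed, this delay is the quantity the audio side contributes toward locating the object.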
