Apparatus and method performing audio-video sensor fusion for object localization, tracking, and separation

  • US 7,536,029 B2
  • Filed: 11/30/2004
  • Issued: 05/19/2009
  • Est. Priority Date: 09/30/2004
  • Status: Active Grant
First Claim

1. An apparatus for tracking and identifying objects using received sounds and video, comprising:

  • an audio likelihood module which determines corresponding audio likelihoods for each of a plurality of the sounds received from corresponding different directions based on a signal subspace and noise subspace approach, with a spatial covariance matrix that is updated only when target audio is absent, considering together a respective audio source vector, measurement noise vector, and a transform function matrix including predefined steering vectors representing attenuation and delay reflecting propagation of audio at respective directions to at least two audio sensors, each audio likelihood indicating a likelihood the sound is an object to be tracked;

    a video likelihood module which determines video likelihoods for each of a plurality of images disposed in corresponding different directions in the video, each video likelihood indicating a likelihood that the image in the video is an object to be tracked; and

    an identification and tracking module which:

      determines correspondences between the audio likelihoods and the video likelihoods,

      if a correspondence is determined to exist between one of the audio likelihoods and one of the video likelihoods, identifies and tracks a corresponding one of the objects using each determined pair of audio and video likelihoods, and

      if a correspondence does not exist between a corresponding one of the audio likelihoods and a corresponding one of the video likelihoods, identifies a source of the sound or image as not being an object to be tracked.
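
The identification and tracking step recited above can be illustrated with a minimal sketch. The claim does not say how a correspondence is decided, so everything specific here is an assumption made for illustration: the `Candidate` type, the `angular_gap` and `fuse` helpers, the rule that a pair corresponds when the two directions agree within `max_gap_deg` and the product of the two likelihoods reaches `min_joint`, and the default values of both parameters. It is not the patented method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for one audio or video detection."""
    direction_deg: float  # direction the sound or image was observed from
    likelihood: float     # likelihood that it is an object to be tracked


def angular_gap(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two bearings, in degrees."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def fuse(audio, video, max_gap_deg=10.0, min_joint=0.25):
    """Pair audio and video candidates by direction (assumed rule).

    Returns (tracked, rejected). `tracked` holds (audio, video) pairs for
    which a correspondence was found; `rejected` holds candidates with no
    correspondence, which are treated as not being objects to be tracked.
    """
    tracked, rejected = [], []
    unused_video = list(video)
    for a in audio:
        # Nearest-in-direction video candidate that is still unpaired.
        best = min(
            unused_video,
            key=lambda v: angular_gap(v.direction_deg, a.direction_deg),
            default=None,
        )
        corresponds = (
            best is not None
            and angular_gap(best.direction_deg, a.direction_deg) <= max_gap_deg
            and a.likelihood * best.likelihood >= min_joint
        )
        if corresponds:
            tracked.append((a, best))
            unused_video.remove(best)
        else:
            rejected.append(a)  # sound with no matching image
    rejected.extend(unused_video)  # images with no matching sound
    return tracked, rejected
```

For example, calling `fuse` with audio candidates at 30° and 200° and video candidates at 33° and 120° would pair the 30° sound with the 33° image and reject the other two, assuming their likelihoods are high enough to pass the `min_joint` check.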
