System and method for audio/video speaker detection

  • US 7,343,289 B2
  • Filed: 06/25/2003
  • Issued: 03/11/2008
  • Est. Priority Date: 06/25/2003
  • Status: Active Grant
First Claim
1. A computer-implemented process for detecting speech, comprising the process actions of:

  • inputting associated audio and video training data containing a person's face that is periodically speaking; and

    using said audio and video signals to train a time delay neural network to determine when a person is speaking, wherein said training comprises the following process actions;

    computing audio features from said audio training data wherein said audio feature is the energy over an audio frame;

    computing video features from said video training signals wherein said video feature is the degree to which said person's mouth is open or closed; and

    correlating said audio features and video features to determine when a person is speaking.
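
The feature steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the non-overlapping framing, the use of lip coordinates as a stand-in for mouth openness, and the window layout are all assumptions made for illustration. It computes per-frame audio energy, a per-frame mouth-openness value, and the time-delayed feature windows that a time delay neural network could take as input. Training the network itself is omitted.

```python
def frame_energy(samples, frame_len):
    """Audio feature: sum of squared samples over each non-overlapping frame."""
    n_frames = len(samples) // frame_len
    return [
        sum(s * s for s in samples[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ]


def mouth_openness(upper_lip_y, lower_lip_y):
    """Video feature (hypothetical proxy): vertical lip gap per video frame."""
    return [abs(low - up) for up, low in zip(upper_lip_y, lower_lip_y)]


def delayed_windows(audio_feat, video_feat, n_delays):
    """Pair each frame's features with those of the n_delays preceding frames.

    Each window is ordered oldest to newest and interleaves the audio and
    video values, so a network sees both signals across a short time span.
    """
    n = min(len(audio_feat), len(video_feat))
    windows = []
    for t in range(n_delays, n):
        window = []
        for d in range(n_delays, -1, -1):
            window.extend([audio_feat[t - d], video_feat[t - d]])
        windows.append(window)
    return windows


# Toy data: 3 audio frames of 2 samples each, and 3 video frames.
energy = frame_energy([0, 0, 3, 4, 1, 1], frame_len=2)
opening = mouth_openness([10, 10, 10], [10, 16, 12])
inputs = delayed_windows(energy, opening, n_delays=1)
```

Each entry of `inputs` would be paired with a speaking or not-speaking label during training. The sketch assumes the audio and video features are already aligned to the same frame rate.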

  • 2 Assignments