Factorial hidden Markov model for audiovisual speech recognition

  • US 7,209,883 B2
  • Filed: 05/09/2002
  • Issued: 04/24/2007
  • Est. Priority Date: 05/09/2002
  • Status: Expired due to Fees
First Claim

1. A speech recognition method for audiovisual data comprising:

    obtaining a first data stream of speech data and a second data stream of face image data while a speaker is speaking;

    extracting visual features from the second data stream by masking, resizing, rotating, and normalizing a mouth region in a face image, and by using a two-dimensional discrete cosine transform;

    constructing a factorial hidden Markov model for the first data stream and the second data stream, the factorial hidden Markov model including a plurality of hidden Markov models with each hidden Markov model having a plurality of discrete nodes and continuous observable nodes, wherein discrete nodes at a first time for each hidden Markov model are conditioned by discrete nodes at a second time of the plurality of hidden Markov models; and

    providing maximum likelihood training for the factorial hidden Markov model to identify words.
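
For illustration only, the visual feature step of the claim (normalizing a mouth region and applying a two-dimensional discrete cosine transform) might be sketched as below. This is not the patented implementation. It assumes the mouth region has already been masked, resized, and rotated into a small grayscale grid. The zero-mean, unit-variance normalization, the choice to keep the low-frequency `keep` x `keep` block of coefficients, and all function names are assumptions made for the example.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([
            scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        ])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def normalize(region):
    """Assumed normalization: zero mean and unit variance over the region."""
    values = [v for row in region for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # avoid dividing by zero on a flat region
    return [[(v - mean) / std for v in row] for row in region]


def dct2(region):
    """Two-dimensional DCT: transform the columns, then the rows."""
    h, w = len(region), len(region[0])
    return matmul(matmul(dct_matrix(h), region), transpose(dct_matrix(w)))


def visual_features(region, keep=3):
    """Hypothetical feature vector: the low-frequency keep x keep block."""
    coeffs = dct2(normalize(region))
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


# Toy 4 x 4 "mouth region" of pixel intensities.
mouth = [[float(4 * i + j) for j in range(4)] for i in range(4)]
features = visual_features(mouth, keep=2)
```

Because the region is normalized to zero mean before the transform, the first coefficient is approximately zero, so a real system would likely rely on the remaining low-frequency terms as the continuous observations of the visual stream.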
