System and method for rescoring N-best hypotheses of an automatic speech recognition system

  • US 7,761,296 B1
  • Filed: 04/02/1999
  • Issued: 07/20/2010
  • Est. Priority Date: 04/02/1999
  • Status: Expired due to Fees
First Claim

1. A computer readable medium storing a computer program to perform method steps for execution by a processor, the method steps comprising:

  • generating a synthetic waveform for each of N textual transcriptions of an original waveform, wherein N is greater than 1 and the N textual transcriptions are generated by a speech recognition system and represent N-best textual transcription hypotheses of the original waveform;

    for each synthetic waveform, time-aligning feature vectors of the synthetic waveform with feature vectors of the original waveform at a phoneme level;

    computing a mean of the feature vectors which align to each phoneme for the original waveform and the synthetic waveform;

    computing a distance measure between each phoneme mean of the original waveform and the synthetic waveform;

    summing the distance measures to generate an overall distance measure representing a distance between the original waveform and the synthetic waveform;

    comparing scores based on the overall distance measure between the synthetic waveform and the original waveform, an acoustic model score of a corresponding textual transcription of the synthetic waveform, and a language model score of the corresponding textual transcription to determine a corresponding one of the N-best textual transcriptions; and

    selecting for output the determined N-best textual transcription.
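The rescoring steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that synthesis, feature extraction, and phoneme-level time alignment have already been done, so each waveform arrives as a list of feature vectors plus a list of `(phoneme, start, end)` frame ranges, each covering at least one frame. The Euclidean distance, the weighted linear combination of scores, and all function names, dictionary keys, and weights (`phoneme_means`, `overall_distance`, `rescore`, `w_dist`, `w_am`, `w_lm`) are hypothetical choices made for illustration; the claim does not specify them.

```python
from math import sqrt


def phoneme_means(frames, alignment):
    """Return the mean feature vector of the frames aligned to each phoneme.

    frames: list of feature vectors (lists of floats).
    alignment: list of (phoneme, start, end) frame ranges, end exclusive.
    Assumes every phoneme is aligned to at least one frame.
    """
    means = []
    for _phoneme, start, end in alignment:
        segment = frames[start:end]
        dim = len(segment[0])
        means.append([sum(f[d] for f in segment) / len(segment) for d in range(dim)])
    return means


def euclidean(a, b):
    """Euclidean distance between two vectors (an assumed distance measure)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def overall_distance(orig_frames, orig_align, synth_frames, synth_align):
    """Sum the per-phoneme distances between original and synthetic means."""
    orig_means = phoneme_means(orig_frames, orig_align)
    synth_means = phoneme_means(synth_frames, synth_align)
    return sum(euclidean(o, s) for o, s in zip(orig_means, synth_means))


def rescore(hypotheses, w_dist=1.0, w_am=1.0, w_lm=1.0):
    """Select the hypothesis with the best combined score.

    hypotheses: list of dicts with keys "text", "distance", "am_score",
    and "lm_score". Higher acoustic and language model scores are treated
    as better and a larger distance as worse; this combination rule is an
    assumption, not taken from the claim.
    """
    def combined(h):
        return w_am * h["am_score"] + w_lm * h["lm_score"] - w_dist * h["distance"]

    return max(hypotheses, key=combined)["text"]
```

In use, `overall_distance` would be called once per synthetic waveform, and the resulting distances passed to `rescore` together with the recognizer's acoustic and language model scores for the N-best list. The weights would normally be tuned on held-out data.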
