MODEL TRAINING FOR AUTOMATIC SPEECH RECOGNITION FROM IMPERFECT TRANSCRIPTION DATA

  • US 20100318355A1
  • Filed: 06/10/2009
  • Published: 12/16/2010
  • Est. Priority Date: 06/10/2009
  • Status: Active Grant
First Claim

1. A computer-implemented method, comprising:

    a. aligning an utterance from a set of training data with a corresponding original transcription from the set of training data to produce a time-aligned transcription with time alignment information for each word in the utterance, wherein the set of training data includes transcription errors;

    b. decoding the same utterance with an incremental acoustic model and an incremental language model to produce a decoded transcription with time alignment information for each word;

    c. aligning the time-aligned and decoded transcriptions according to time alignment information;

    d. selecting all segments from the utterance having at least Q contiguous matching aligned words, where Q is a positive integer; and

    e. training the incremental acoustic model with the selected segments.
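
Steps c and d can be illustrated with a minimal Python sketch. It is not taken from the patent: the `Word` type and the functions `overlaps`, `find_match`, `select_segments`, and `segment_span` are hypothetical names, and the matching rule used here (identical word text plus overlapping time intervals, with consecutive positions in the decoded transcription) is one plausible reading of "matching aligned words", assumed for illustration only.

```python
from typing import List, NamedTuple, Optional, Tuple


class Word(NamedTuple):
    """One word with its time alignment (hypothetical type, times in seconds)."""
    text: str
    start: float
    end: float


def overlaps(a: Word, b: Word) -> bool:
    """True when the two words' time intervals intersect."""
    return a.start < b.end and b.start < a.end


def find_match(ref: Word, decoded: List[Word]) -> Optional[int]:
    """Index of the first decoded word with the same text and overlapping time.

    Assumption: text equality plus time overlap stands in for the claim's
    alignment "according to time alignment information".
    """
    for i, dec in enumerate(decoded):
        if dec.text == ref.text and overlaps(ref, dec):
            return i
    return None


def select_segments(reference: List[Word], decoded: List[Word], q: int) -> List[List[Word]]:
    """Return every run of at least q contiguous matching words.

    A run continues only while each reference word matches the decoded word
    immediately following the previous match, so a substitution, deletion,
    or insertion in the decoded transcription ends the run.
    """
    if q < 1:
        raise ValueError("q must be a positive integer")
    segments: List[List[Word]] = []
    run: List[Word] = []
    prev_idx: Optional[int] = None
    for ref in reference:
        idx = find_match(ref, decoded)
        if idx is not None and run and idx == prev_idx + 1:
            run.append(ref)  # extends the current run
        else:
            if len(run) >= q:
                segments.append(run)  # keep the finished run if long enough
            run = [ref] if idx is not None else []
        prev_idx = idx
    if len(run) >= q:
        segments.append(run)
    return segments


def segment_span(segment: List[Word]) -> Tuple[float, float]:
    """Start and end time of a selected segment, e.g. for cutting audio."""
    return segment[0].start, segment[-1].end
```

Under these assumptions, the spans returned by `segment_span` would identify the audio regions passed to the training step in e. A larger Q keeps only longer stretches of agreement between the original and decoded transcriptions, at the cost of discarding more training data.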
