Acoustic model training using corrected terms

  • US 10,019,986 B2
  • Filed: 07/29/2016
  • Issued: 07/10/2018
  • Est. Priority Date: 07/29/2016
  • Status: Active Grant
First Claim

1. A method comprising:

    receiving, from a client device and by a voice search system that includes (i) an automated speech recognizer that uses an acoustic model to transcribe utterances, (ii) a search engine, (iii) an acoustic model trainer that periodically retrains the acoustic model using portions of audio data that correspond to manually specified terms of first transcriptions, (iv) a user interface component, and (v) a correction classifier, first audio data corresponding to an utterance of a user;

    obtaining, by the automated speech recognizer of the voice search system, a first transcription of the first audio data;

    receiving, by the user interface component of the voice search system, data indicating (i) a selection of one or more terms of the first transcription and (ii) one or more of replacement terms that the user has manually specified as a replacement for the one or more terms;

    determining, by the correction classifier of the voice search system, a minimum edit distance between the one or more terms of the first transcription and the one or more replacement terms;

    determining, by the correction classifier of the voice search system and based at least on the minimum edit distance between the one or more terms of the first transcription and the one or more replacement terms that the user has manually specified as a replacement for the one or more terms, whether one or more of the replacement terms that the user has manually specified as a replacement for the one or more terms likely represent a correction of one or more of the one or more terms of the first transcription;

    in response to determining, based at least on the minimum edit distance between the one or more terms of the first transcription and the one or more replacement terms that the user has manually specified as a replacement for the one or more terms, whether the one or more of the replacement terms that the user has manually specified as a replacement for the one or more terms likely represent a correction of the one or more terms of the first transcription, selectively retraining, by the acoustic model trainer of the voice search system, the acoustic model, comprising (i) retraining the acoustic model of the automated speech recognizer using a first portion of the audio that is associated with the one or more terms of the first transcription when the correction classifier indicates that the replacement terms likely represent a correction, or (ii) bypassing retraining of the acoustic model of the automated speech recognizer using the first portion of the first audio data that is associated with the one or more terms of the first transcription when the correction classifier indicates that the replacement terms do not likely represent a correction;

    obtaining, by the automated speech recognizer of the voice search system and using the retrained acoustic model, a transcription of audio data corresponding to a subsequently received utterance; and

    providing, by the user interface component of the voice search system, a user interface that includes one or more search results that the search engine of the voice search system has identified in response to the transcription of the audio data corresponding to the subsequently received utterance.
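
The correction-classifier steps of claim 1 (computing a minimum edit distance, then deciding whether to retrain on or bypass the associated audio) can be illustrated with a minimal Python sketch. The claim does not say how the distance is computed or how it is turned into a decision, so everything here is an assumption for illustration: the character-level Levenshtein distance, the `max_ratio` threshold of 0.5, and the hypothetical names `min_edit_distance`, `likely_correction`, and `select_training_audio`.

```python
def min_edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def likely_correction(original: str, replacement: str, max_ratio: float = 0.5) -> bool:
    """Assumed rule (not from the claim): a small, non-zero relative edit
    suggests the user fixed a misrecognition rather than typed a new query."""
    longest = max(len(original), len(replacement))
    if longest == 0:
        return False
    distance = min_edit_distance(original.lower(), replacement.lower())
    return 0 < distance <= max_ratio * longest


def select_training_audio(corrections):
    """Keep audio portions whose replacement looks like a correction and
    bypass the rest. Each item is (original_terms, replacement_terms, audio)."""
    return [audio for original, replacement, audio in corrections
            if likely_correction(original, replacement)]
```

Under these assumptions, replacing "flour" with "flower" would count as a likely correction and its audio portion would be passed on for retraining, while replacing "pizza" with "weather" would be treated as a new query and bypassed.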
