Method and system for voice recognition employing multiple voice-recognition techniques

  • US 9,570,076 B2
  • Filed: 02/22/2013
  • Issued: 02/14/2017
  • Est. Priority Date: 10/30/2012
  • Status: Active Grant
First Claim

1. A computer-implemented method comprising:

    receiving audio data that encodes an utterance;

    obtaining, as a result of performing speech-to-text voice recognition on the audio data, a first transcription of the utterance;

    segmenting the first transcription into two or more discrete terms;

    determining that a first particular term from among the two or more discrete terms is included among a predefined set of terms that are associated with a word spotting process that involves determining whether an acoustic fingerprint of a given portion of audio data is an acoustic match with one or more given terms without performing speech-to-text voice recognition;

    determining that the two or more discrete terms other than the first particular term are included among an additional predefined set of terms that are associated with the predefined set of terms that are associated with a word spotting process;

    in response to determining that the two or more discrete terms other than the first particular term are included among the additional predefined set of terms that are associated with the predefined set of terms that are associated with the word spotting process, obtaining, as a result of performing the word spotting process on a portion of the audio data that corresponds to a second particular term from among the two or more discrete terms other than the first particular term without re-performing speech-to-text voice recognition on the portion of the audio data, an indication that an acoustic fingerprint associated with the portion of the audio data that corresponds to the second particular term is an acoustic match with one or more terms of the predefined set of terms that are associated with the word spotting process;

    obtaining, as a result of re-performing speech-to-text voice recognition on a portion of the audio data that does not correspond to the second particular term, a second transcription of the utterance using the portion of the audio data that does not correspond to the second particular term;

    generating a third transcription of the utterance based at least on (i) the second transcription of the utterance that was obtained as a result of re-performing speech-to-text voice recognition on the portion of the audio data that does not correspond to the second particular term, and (ii) the one or more terms of the predefined set of terms that are indicated, as a result of performing the word spotting process on the portion of the audio data that corresponds to the second particular term without re-performing speech-to-text voice recognition of the audio data, as an acoustic match with the portion of the audio data that corresponds to the second particular term; and

    providing the third transcription of the utterance for output.
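The control flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the names `transcribe`, `Term`, `spotting_terms`, and `companion_terms` are hypothetical, audio is modeled as a toy sequence of opaque frames, and both recognition engines are passed in as callables. The claim does not state how the second particular term is selected or what happens when a check fails, so the sketch assumes the first remaining term is selected and falls back to the first transcription otherwise.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Set

Audio = Sequence[str]  # toy stand-in: one opaque "frame" per item


@dataclass(frozen=True)
class Term:
    text: str
    start: int  # index of the first frame (inclusive)
    end: int    # index past the last frame (exclusive)


SpeechToText = Callable[[Audio], List[Term]]
WordSpotter = Callable[[Audio, Set[str]], Optional[str]]


def transcribe(audio: Audio, stt: SpeechToText, word_spot: WordSpotter,
               spotting_terms: Set[str], companion_terms: Set[str]) -> str:
    # First transcription, already segmented into discrete terms.
    terms = stt(audio)
    first_transcription = " ".join(t.text for t in terms)

    # A first particular term must belong to the word-spotting set.
    first = next((t for t in terms if t.text in spotting_terms), None)
    if first is None:
        return first_transcription  # assumed fallback

    # Every other term must belong to the additional (companion) set.
    others = [t for t in terms if t is not first]
    if not others or not all(t.text in companion_terms for t in others):
        return first_transcription  # assumed fallback

    # Assumption: the second particular term is the first remaining term.
    second = others[0]

    # Word spotting on that portion only, with no speech-to-text pass.
    match = word_spot(audio[second.start:second.end], spotting_terms)
    if match is None:
        return first_transcription  # assumed fallback

    # Re-run speech-to-text on the audio outside the second term's span.
    before, after = audio[:second.start], audio[second.end:]
    words: List[str] = []
    if len(before) > 0:
        words += [t.text for t in stt(before)]
    words.append(match)  # spotted term takes the second term's place
    if len(after) > 0:
        words += [t.text for t in stt(after)]

    # Third transcription: second transcription merged with the match.
    return " ".join(words)


# Toy engines, for illustration only.
MISHEARD = {"MOM~": "mum"}      # what the fake recognizer outputs
FINGERPRINTS = {"MOM~": "mom"}  # what the fake spotter matches


def fake_stt(audio: Audio) -> List[Term]:
    return [Term(MISHEARD.get(f, f), i, i + 1) for i, f in enumerate(audio)]


def fake_spot(segment: Audio, candidates: Set[str]) -> Optional[str]:
    hit = FINGERPRINTS.get(segment[0]) if len(segment) > 0 else None
    return hit if hit in candidates else None


if __name__ == "__main__":
    print(transcribe(["call", "MOM~", "now"], fake_stt, fake_spot,
                     {"call", "mom"}, {"mum", "now"}))
```

In this toy run the recognizer mishears the middle frame as "mum", the spotter matches the same frame to "mom", and the merged result replaces only that portion while the rest comes from the repeated speech-to-text pass.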
