Fast out-of-vocabulary search in automatic speech recognition systems

  • US 9,542,936 B2
  • Filed: 05/02/2013
  • Issued: 01/10/2017
  • Est. Priority Date: 12/29/2012
  • Status: Active Grant
First Claim

1. A method comprising:

  • receiving, on a computer system, a text search query, the query comprising one or more query words;

    generating, on the computer system, for each query word in the query, a set of one or more anchor segments from searching metadata corresponding to a plurality of speech recognition processed audio files, the metadata including representations of one or more words detected in the audio files, wherein, for each detected word, the metadata includes a reference to each audio file in which the word was detected, a temporal location of the detected word in the audio file, and a confidence measure for the word as detected within the audio file, where each anchor segment includes a query word, an identifier for an audio file, and a temporal location of the query word within the audio file, where generating anchor segments includes, for each query word:

    determining, on the computer system, if the query word is included in a vocabulary of a learning model for a speech recognizer engine of the computer system;

    on the computer system, when the query word is in the vocabulary, searching the metadata to identify one or more high confidence anchor segments corresponding to the query word; and

    on the computer system, when the query word is not in the vocabulary:

    generating a search list of one or more sub-words of the query word,

    searching the metadata to identify one or more audio files containing at least one of the one or more sub-words to identify one or more anchor segments corresponding to one or more of the sub-words;

    post-processing, on the computer system, the one or more anchor segments, the post-processing comprising:

    expanding the one or more anchor segments;

    sorting the one or more anchor segments; and

    merging overlapping ones of the one or more anchor segments; and

    performing, on the computer system, speech recognition on the post-processed one or more expanded anchor segments for instances of at least one of the one or more query words using a constrained grammar.
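
The anchor-generation and post-processing steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `Hit`, `Segment`, `split_subwords`, `anchor_segments`, and `post_process`, the dictionary-based metadata index, the 0.8 confidence threshold, the 1-second expansion, and the character n-gram splitter are all assumptions made for illustration; the claim does not specify any of them. The final constrained-grammar recognition pass is omitted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hit:
    """One detected word in the recognition metadata (hypothetical layout)."""
    word: str
    file_id: str
    start: float        # seconds from the start of the audio file
    end: float
    confidence: float   # assumed to lie in 0.0 .. 1.0


@dataclass
class Segment:
    """An anchor segment: audio file, time span, and the query words it covers."""
    file_id: str
    start: float
    end: float
    words: set = field(default_factory=set)


def split_subwords(word, size=3):
    """Placeholder sub-word splitter: overlapping character n-grams (assumption)."""
    if len(word) <= size:
        return [word]
    return [word[i:i + size] for i in range(len(word) - size + 1)]


def anchor_segments(query, index, vocabulary, min_conf=0.8):
    """Collect anchor segments for every query word.

    In-vocabulary words keep only high-confidence hits; out-of-vocabulary
    words fall back to hits for any of their sub-words.
    """
    anchors = []
    for word in query.lower().split():
        if word in vocabulary:
            hits = [h for h in index.get(word, []) if h.confidence >= min_conf]
        else:
            hits = [h for sub in split_subwords(word) for h in index.get(sub, [])]
        anchors.extend(Segment(h.file_id, h.start, h.end, {word}) for h in hits)
    return anchors


def post_process(anchors, pad=1.0):
    """Expand each anchor by `pad` seconds, sort, and merge overlaps per file."""
    expanded = [
        Segment(a.file_id, max(0.0, a.start - pad), a.end + pad, set(a.words))
        for a in anchors
    ]
    expanded.sort(key=lambda s: (s.file_id, s.start))
    merged = []
    for seg in expanded:
        last = merged[-1] if merged else None
        if last and last.file_id == seg.file_id and seg.start <= last.end:
            # Overlapping spans in the same file collapse into one segment.
            last.end = max(last.end, seg.end)
            last.words |= seg.words
        else:
            merged.append(seg)
    return merged
```

In this sketch, the merged segments returned by `post_process` are the audio spans that would be handed to a second recognition pass restricted to the query words.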

  • 9 Assignments