
System and method for automatic speech to text conversion

  • US 8,566,088 B2
  • Filed: 11/11/2009
  • Issued: 10/22/2013
  • Est. Priority Date: 11/12/2008
  • Status: Active Grant
First Claim

1. A system for recognizing speech that corresponds to a digital speech signal, the system comprising:

  • a speech recognition engine that has access to
      a training corpus of known-class digitized speech utterances,
      a plurality of weak classifiers, wherein each weak classifier comprises a decision function for determining the presence of an event within the training corpus, and
      an ensemble detector comprising a plurality of the weak classifiers, that together are better at determining the presence of a speech signal event than any of the constituent weak classifiers;

    wherein the speech recognition engine comprises an event extractor for extracting speech signal events and patterns of the speech signal events from the digital speech signal, wherein the speech signal events and patterns of the speech signal events are relevant in speech recognition,

    wherein the speech recognition engine comprises at least one processor that is configured to perform a plurality of operations, wherein the plurality of operations comprise
      detecting locations of relevant speech signal events in the digital speech signal, wherein each of the speech signal events comprise spectral information and temporal information,
      capturing spectral features of and temporal relationships between all of the speech signal events,
      segmenting the digital speech signal based on the detected locations of the detected speech signal events,
      analyzing the segmented digital speech signal, wherein the analysis is synchronized with the speech signal events,
      detecting patterns in the digital speech signal with the captured spectral information, the temporal relationships, and the analyzed digital speech signal,
      providing a list of perceptual alternatives for recognized speech data that corresponds to the detected patterns in the digital speech signal, and
      disambiguating between the perceptual alternatives for the recognized speech data based on the analysis of one or more of the speech signal events to improve the recognized speech data;

    wherein the at least one processor is configured to perform one or more of the operations using the ensemble detector; and

    a module coupled to the speech recognition engine, wherein the module is configured to output the improved recognized speech data.
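
As an illustration only, the following Python sketch shows one way an ensemble of weak classifiers can outperform each of its members, and how a signal can be segmented at detected event locations. The claim does not name a training procedure; AdaBoost with single-feature threshold classifiers is an assumption made here for concreteness, and the names `Stump`, `train_ensemble`, `detect`, and `segment_at_events` are hypothetical and do not come from the patent.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Stump:
    """Weak classifier: a threshold decision function on one feature (assumed form)."""
    feature: int
    threshold: float
    polarity: int  # +1: event when value >= threshold; -1: event when value < threshold

    def predict(self, x: Sequence[float]) -> int:
        fires = x[self.feature] >= self.threshold
        return self.polarity if fires else -self.polarity


Ensemble = List[Tuple[float, Stump]]


def train_ensemble(X: List[List[float]], y: List[int], rounds: int) -> Ensemble:
    """Train a weighted vote of stumps with AdaBoost (an assumption, not the claimed method).

    X holds known-class feature vectors; y holds labels +1 (event) or -1 (no event).
    """
    n = len(X)
    weights = [1.0 / n] * n
    ensemble: Ensemble = []
    for _ in range(rounds):
        best_err, best_stump = None, None
        # Exhaustively pick the stump with the lowest weighted error.
        for f in range(len(X[0])):
            for t in sorted({x[f] for x in X}):
                for p in (1, -1):
                    stump = Stump(f, t, p)
                    err = sum(w for w, xi, yi in zip(weights, X, y)
                              if stump.predict(xi) != yi)
                    if best_err is None or err < best_err:
                        best_err, best_stump = err, stump
        # Clamp the error so the logarithm below stays finite.
        err = min(max(best_err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, best_stump))
        # Raise the weight of misclassified examples, lower the rest, renormalize.
        weights = [w * math.exp(-alpha * yi * best_stump.predict(xi))
                   for w, xi, yi in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def detect(ensemble: Ensemble, x: Sequence[float]) -> bool:
    """Ensemble detector: report an event when the weighted vote is positive."""
    return sum(alpha * stump.predict(x) for alpha, stump in ensemble) > 0


def segment_at_events(n_frames: int, event_frames: Sequence[int]) -> List[Tuple[int, int]]:
    """Split frames [0, n_frames) into half-open segments bounded by event locations."""
    cuts = sorted({e for e in event_frames if 0 < e < n_frames})
    bounds = [0] + cuts + [n_frames]
    return list(zip(bounds[:-1], bounds[1:]))
```

In this sketch each stump plays the role of a weak classifier's decision function and `detect` plays the role of the ensemble detector. On toy data where the event class occupies two separate ranges of a single feature, no single threshold can separate the classes, but a weighted vote of a few stumps can, which is the property the claim attributes to the ensemble.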

  • 1 Assignment