Method and system for post-processing speech recognition results

  • US 8,682,660 B1
  • Filed: 05/16/2009
  • Issued: 03/25/2014
  • Est. Priority Date: 05/21/2008
  • Status: Active Grant
First Claim

1. In a computer-based system for post-processing of recognition results from an Automatic Speech Recognition (ASR) System, said ASR System comprising

    (a) an input component configured to accept a spoken utterance, Wint,

    (b) a processing component comprising one or more Statistical Language Models (SLMs), said processing component configured to process any said spoken utterance, Wint, and

    (c) an output component configured to output a result of the processing done by the processing component as a recognition result, said recognition result comprising a recognized sequence of words, Wrec, and a recognized semantic interpretation, Crec,

said computer-based system for post-processing comprising one or more Conditional Language Models (CLMs), which are statistical models for computing conditional probabilities, said CLMs trained using a training method comprising the steps of

    (i) using said ASR System to recognize a plurality of said spoken utterances, Wint, and to obtain for each of said spoken utterances an output value of said recognition result, Wrec and Crec,

    (ii) determining intended semantic interpretation, Cint, for each of the spoken utterances, Wint, that were input to said ASR system,

    (iii) using said intended semantic interpretations, Cint, to partition all recognized sequences of words, Wrec, and recognized semantic interpretations, Crec, that were output by said ASR system at step (i) into a plurality of training data subsets, so that each said training data subset resulting from said partitioning is associated with (1) exactly one value of said intended semantic interpretation, Cint, and (2) one or more values of said recognized semantic interpretations, Crec, finally

    (iv) using each said training data subset to train one or more of said CLMs, so that each trained CLM becomes associated with exactly one value of said intended semantic interpretation, Cint, and one or more values of said recognized semantic interpretations, Crec, that are associated with the training data subsets used to train said CLM;

  • a computer-implemented method for post-processing of said recognition result, said recognition result outputted from said ASR System as result of using the ASR System to process an input comprising said spoken utterance, Wint, the outputted recognition result comprising said recognized sequence of words, Wrec, and said recognized semantic interpretation, Crec, said computer-implemented post-processing method comprising the steps of:

    selecting those trained CLMs which are associated with the recognized semantic interpretation, Crec, that is comprised in the said recognition result outputted when using said ASR System,

    for each of the selected CLMs, CLM(k), computing a kth decision factor value, DF(k), from kth data, the kth data comprising:

      (I) the kth intended semantic interpretation, Cint(k), as associated with the selected CLM(k),

      (II) a kth conditional probability, P(Wrec|Crec,Cint(k)), for the recognized sequence of words, Wrec, given both the recognized semantic interpretation, Crec, and the kth intended semantic interpretation, Cint(k), said kth conditional probability computed from said CLM(k) given the recognized semantic interpretation Crec and the recognized sequence of words Wrec, both Crec and Wrec comprised in the said recognition result outputted when using said ASR System,

      (III) a kth a-priori probability, P(Cint(k),Crec), of a pair of the kth intended semantic interpretation, Cint(k), and the recognized semantic interpretation, Crec, said kth a-priori probability determined from prior ASR use data, the prior ASR use data obtained while previously using the ASR system to recognize a plurality of prior spoken utterances, said prior ASR use data comprising (A) intended semantic interpretations for said plurality of prior spoken utterances input into the ASR System and (B) corresponding recognized semantic interpretations comprised in the recognition results output from the ASR System as result of processing said prior spoken utterances,

    selecting such kth intended semantic interpretation, Cint(k), associated with the kth selected CLM, CLM(k), for which the computed decision factor value, DF(k), is maximal among said computed decision factor values and declaring said selected kth intended semantic interpretation, Cint(k), an optimal semantic interpretation,

    outputting the optimal semantic interpretation via computer-writable media.
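
The training partition in steps (i) to (iv) and the selection step of the claimed method can be illustrated with a minimal Python sketch. The claim does not fix the model family or the way DF(k) combines its inputs, so the sketch makes several assumptions: each CLM is an add-alpha smoothed unigram model, DF(k) is taken to be the product of P(Wrec|Crec,Cint(k)) and P(Cint(k),Crec) (computed in log space), and the recognized interpretation Crec is returned unchanged when no CLM is associated with it. The function names, the `UNK_SLOTS` constant, and the toy utterances are hypothetical and do not come from the patent.

```python
import math
from collections import Counter, defaultdict

UNK_SLOTS = 1  # assumption: reserve one vocabulary slot for unseen words


def train(samples):
    """samples: list of (w_rec, c_rec, c_int) tuples gathered in steps (i)-(ii)."""
    # Step (iii): partition recognition results by intended interpretation Cint.
    subsets = defaultdict(list)
    for w_rec, c_rec, c_int in samples:
        subsets[c_int].append((w_rec, c_rec))

    # Step (iv): one CLM per Cint; here a CLM is just word counts per Crec.
    clms = {}
    for c_int, subset in subsets.items():
        counts = defaultdict(Counter)
        for w_rec, c_rec in subset:
            counts[c_rec].update(w_rec)
        clms[c_int] = dict(counts)

    # A-priori P(Cint, Crec), estimated as the relative frequency of each pair.
    pair_counts = Counter((c_int, c_rec) for _, c_rec, c_int in samples)
    priors = {pair: n / len(samples) for pair, n in pair_counts.items()}

    vocab_size = len({w for w_rec, _, _ in samples for w in w_rec}) + UNK_SLOTS
    return clms, priors, vocab_size


def log_p_words(counts, w_rec, vocab_size, alpha=1.0):
    """log P(Wrec | Crec, Cint) under an add-alpha unigram model (assumed)."""
    total = sum(counts.values())
    return sum(
        math.log((counts[w] + alpha) / (total + alpha * vocab_size)) for w in w_rec
    )


def post_process(w_rec, c_rec, clms, priors, vocab_size):
    """Return the Cint(k) whose decision factor DF(k) is maximal."""
    best_c_int, best_df = None, -math.inf
    for c_int, per_c_rec in clms.items():
        if c_rec not in per_c_rec:
            continue  # select only CLMs associated with the recognized Crec
        # Assumed form: log DF(k) = log P(Wrec|Crec,Cint(k)) + log P(Cint(k),Crec)
        df = log_p_words(per_c_rec[c_rec], w_rec, vocab_size)
        df += math.log(priors[(c_int, c_rec)])
        if df > best_df:
            best_c_int, best_df = c_int, df
    # Assumption: keep the ASR interpretation when no CLM was selected.
    return best_c_int if best_c_int is not None else c_rec


# Toy data (hypothetical): "pay my bill" is sometimes misheard as "play my bill".
samples = [
    (["pay", "my", "bill"], "pay_bill", "pay_bill"),
    (["play", "my", "bill"], "play_music", "pay_bill"),
    (["play", "my", "bill"], "play_music", "pay_bill"),
    (["play", "some", "music"], "play_music", "play_music"),
    (["play", "some", "jazz"], "play_music", "play_music"),
]
clms, priors, vocab_size = train(samples)
corrected = post_process(["play", "my", "bill"], "play_music", clms, priors, vocab_size)
print(corrected)
```

In this toy run the ASR output carries the interpretation `play_music`, but the CLM trained on utterances whose intended interpretation was `pay_bill` assigns a higher decision factor to the word sequence, so the post-processor returns `pay_bill`.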
