Semantic co-occurrence filtering for speech recognition and signal transcription applications
Abstract
A system and method for automatically transcribing an input question from a form convenient for user input into a form suitable for use by a computer. The question is a sequence of words represented in a form convenient for the user, such as a spoken utterance or a handwritten phrase. The question is transduced into a signal that is converted into a sequence of symbols. A set of hypotheses is generated from the sequence of symbols. The hypotheses are sequences of words represented in a form suitable for use by the computer, such as text. One or more information retrieval queries are constructed and executed to retrieve documents from a corpus (database). Retrieved documents are analyzed to produce an evaluation of the hypotheses of the set and to select one or more preferred hypotheses from the set. The preferred hypotheses are output to a display, speech synthesizer, or applications program. Additionally, retrieved documents relevant to the preferred hypotheses can be selected and output.
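As a rough illustration of the flow described in the abstract, the following Python sketch expands per-word candidate lists into hypotheses, counts the documents of a small in-memory corpus in which the hypothesized words co-occur, and picks the hypothesis with the most support. The corpus, the candidate words, and the function names (`generate_hypotheses`, `cooccurrence_count`, `select_preferred`) are illustrative assumptions and are not taken from the patent, whose own retrieval and scoring may differ.

```python
from itertools import product


def generate_hypotheses(candidates):
    """Expand per-position candidate words into full word-sequence hypotheses."""
    return [tuple(words) for words in product(*candidates)]


def cooccurrence_count(hypothesis, corpus):
    """Count the documents that contain every word of the hypothesis."""
    count = 0
    for doc in corpus:
        tokens = set(doc.lower().split())
        if all(word in tokens for word in hypothesis):
            count += 1
    return count


def select_preferred(hypotheses, corpus):
    """Return the hypothesis supported by the largest number of documents."""
    return max(hypotheses, key=lambda h: cooccurrence_count(h, corpus))


# Hypothetical data: two uncertain words, each with two candidate readings.
CORPUS = [
    "president kennedy was shot in dallas",
    "kennedy served as president of the united states",
    "a canopy covered the present",
]
CANDIDATES = [["president", "present"], ["kennedy", "canopy"]]

hypotheses = generate_hypotheses(CANDIDATES)
preferred = select_preferred(hypotheses, CORPUS)
print(preferred)
```

This sketch relies on corpus evidence alone; the claims additionally require an initial evaluation of the hypotheses to be taken into account when selecting.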
22 Claims
1. An automated transcription disambiguation method comprising the steps of:
providing an input question having first and second words to a processor in a form subject to misinterpretation by the processor;
generating a plurality of hypotheses with the processor, the hypotheses including alternative interpretations of at least one of the first and second words due to possible misinterpretations of the input question by the processor;
producing with the processor an initial evaluation of the hypotheses;
gathering confirming evidence for the hypotheses by searching with the processor in a text corpus for co-occurrences of hypothesized first and second words for the hypotheses;
automatically and explicitly selecting with the processor from among the plurality of hypotheses a preferred hypothesis as to both of the first and second words based at least in part on the initial evaluation and at least in part on the gathered confirming evidence; and
outputting a transcription result from the processor, the transcription result representing the selected preferred hypothesis.

2. In the operation of a system comprising a processor, an input transducer, an output facility, and a corpus comprising at least one document comprising words represented in a first form, a method for transcribing an input question by transforming the input question from a sequence of words represented in a second form, subject to misinterpretation by the processor, into a sequence of words represented in the first form, the method comprising the steps of:
accepting the input question into the system, the question comprising a sequence of words represented in the second form;
converting the input question into a signal with the input transducer;
converting the signal into a sequence of symbols with the processor;
generating a set of hypotheses from the sequence of symbols with the processor, the hypotheses of the set comprising sequences of words represented in the first form, the set of hypotheses including alternative interpretations of at least one of the words to account for possible misinterpretation of the input question;
producing with the processor an initial evaluation of the hypotheses;
automatically constructing a query from hypotheses of the set with the processor;
executing the constructed query by searching with the processor in the corpus for co-occurrences of hypothesized words for the hypotheses;
analyzing the co-occurrences and the initial evaluation with the processor to produce a revised evaluation of the hypotheses of the set;
automatically and explicitly selecting a preferred hypothesis from the set with the processor responsively to the revised evaluation, the preferred hypothesis comprising a preferred sequence of words in the first form and thus a preferred transcription of the sequence of words of the input question; and
outputting the preferred hypothesis with the output facility.
Dependent claims: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17

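Claim 2 recites a revised evaluation produced from the co-occurrences and the initial evaluation, but does not state how the two are combined. The Python sketch below shows one possible combination, assumed here purely for illustration: the initial score plus a weighted logarithm of the co-occurrence count. The function names, the `weight` parameter, and all scores and counts are hypothetical.

```python
import math


def revised_evaluation(initial_scores, cooccurrence_counts, weight=1.0):
    """Combine each initial score with corpus evidence (assumed rule, not the patent's)."""
    revised = {}
    for hypothesis, score in initial_scores.items():
        # log1p damps large counts; a hypothesis with no co-occurrences gains nothing.
        evidence = math.log1p(cooccurrence_counts.get(hypothesis, 0))
        revised[hypothesis] = score + weight * evidence
    return revised


def pick_preferred(revised):
    """Select the hypothesis with the highest revised evaluation."""
    return max(revised, key=revised.get)


# Hypothetical initial scores (e.g. from the recognizer) and corpus counts.
initial = {
    ("present", "kennedy"): 0.6,
    ("president", "kennedy"): 0.5,
    ("president", "canopy"): 0.4,
}
counts = {("president", "kennedy"): 12, ("present", "kennedy"): 0}

revised = revised_evaluation(initial, counts)
preferred = pick_preferred(revised)
print(preferred)
```

In this made-up data the hypothesis with the best initial score is overtaken once corpus evidence is added, which is the kind of correction the revised evaluation is meant to allow.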
18. In a system comprising a processor, a method for processing an input utterance comprising speech, the method comprising the steps of:
accepting the input utterance into the system;
producing a phonetic transcription of the input utterance with the processor;
responsively to the phonetic transcription, generating with the processor a set of hypotheses, the hypotheses of the set being hypotheses as to a first word contained in the input utterance and further as to a second word contained in the input utterance, the set of hypotheses including alternative interpretations of at least one of the words to account for the error-prone nature of speech analysis;
determining with the processor an initial evaluation measurement for each hypothesis;
automatically constructing an information retrieval query with the processor, the query comprising the set of hypotheses and a proximity constraint;
executing the constructed query in conjunction with an information retrieval subsystem comprising a text corpus; and
responsively to the results of the executed query with respect to each hypothesis of the set of hypotheses, and taking into consideration the initial evaluation measurements of the hypotheses, automatically and explicitly selecting with the processor from among the hypotheses of the set a preferred hypothesis, the preferred hypothesis including the first and second words.
Dependent claims: 19

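The query in claim 18 carries a proximity constraint. A minimal Python sketch of such a query is given below, assuming the constraint means "the two hypothesized words occur within a fixed number of tokens of each other"; the claim itself does not define the constraint this narrowly. The corpus, the `max_distance` parameter, and the function names are hypothetical.

```python
from itertools import product


def positions(tokens, word):
    """Return every index at which the word occurs in the token list."""
    return [i for i, token in enumerate(tokens) if token == word]


def matches_within(tokens, first, second, max_distance):
    """True if some occurrence of first lies within max_distance tokens of second."""
    return any(
        abs(i - j) <= max_distance
        for i, j in product(positions(tokens, first), positions(tokens, second))
    )


def execute_proximity_query(hypotheses, corpus, max_distance):
    """For each (first, second) hypothesis, count documents meeting the constraint."""
    hits = {}
    for first, second in hypotheses:
        hits[(first, second)] = sum(
            matches_within(doc.lower().split(), first, second, max_distance)
            for doc in corpus
        )
    return hits


# Hypothetical corpus and hypotheses.
CORPUS = [
    "president john kennedy addressed congress",
    "the president spoke for an hour before kennedy airport closed",
    "the present was hidden under the canopy",
]
HYPOTHESES = [("president", "kennedy"), ("present", "canopy")]

tight = execute_proximity_query(HYPOTHESES, CORPUS, max_distance=3)
loose = execute_proximity_query(HYPOTHESES, CORPUS, max_distance=6)
print(tight, loose)
```

A tighter window admits fewer documents, so it tends to favor word pairs that are used together rather than pairs that merely appear in the same document.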
20. In a system comprising a processor, an error-prone input facility, and an information retrieval subsystem, said information retrieval subsystem comprising a natural-language text corpus, a method for accessing documents of the corpus, the method comprising the steps of:
transcribing a question with the error-prone input facility and the processor, the question comprising a sequence of words;
selecting a subset of words of the sequence with the processor;
forming with the processor a plurality of hypotheses about the selected subset of words, the hypotheses of the plurality representing possible alternative transcriptions of the question to account for the error-prone nature of the input facility;
producing with the processor an initial evaluation of the hypotheses;
automatically constructing a co-occurrence query with the processor, the co-occurrence query being based on hypotheses of the plurality;
executing the co-occurrence query in conjunction with the information retrieval subsystem to retrieve a set of documents;
analyzing the initial evaluation and documents of the retrieved set with the processor to produce a revised evaluation of the hypotheses;
responsively to the revised evaluation, automatically and explicitly selecting with the processor a preferred hypothesis representing a preferred transcription of the sequence of words of the question;
evaluating documents of the retrieved set with the processor with respect to the selected hypothesis to determine a relevant document; and
outputting from the system the relevant document thus determined.

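The final steps of claim 20 evaluate the retrieved documents against the selected hypothesis to determine a relevant document. The claim does not say how relevance is measured; the Python sketch below assumes a simple measure (number of distinct hypothesis words present, with total occurrences as a tie-breaker). The `relevance` and `most_relevant` functions and the sample documents are hypothetical.

```python
def relevance(doc, hypothesis):
    """Score a document as (distinct hypothesis words present, total occurrences)."""
    tokens = doc.lower().split()
    distinct = sum(1 for word in hypothesis if word in tokens)
    total = sum(tokens.count(word) for word in hypothesis)
    return (distinct, total)


def most_relevant(retrieved, hypothesis):
    """Return the retrieved document with the highest relevance score."""
    return max(retrieved, key=lambda doc: relevance(doc, hypothesis))


# Hypothetical retrieved set and selected hypothesis.
RETRIEVED = [
    "kennedy airport is in new york",
    "president kennedy met the president of france",
    "the president signed the bill",
]
SELECTED = ("president", "kennedy")

best = most_relevant(RETRIEVED, SELECTED)
print(best)
```

Comparing tuples means a document containing both words always outranks one containing a single word, however often that single word repeats.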
21. An automated system for producing a preferred transcription of a question presented in a form prone to erroneous transcription, comprising:
a processor;
an input transducer, coupled to the processor, for accepting an input question and producing a signal therefrom;
converter means, coupled to the input transducer, for converting the signal to a string comprising a sequence of symbols;
hypothesis generation means, coupled to the converter means, for developing a set of hypotheses from the string, each hypothesis of the set comprising a sequence of word representations, the set of hypotheses representing a set of possible alternative transcriptions of the input question to account for the likelihood of erroneous transcription;
initial scoring means, coupled to the hypothesis generation means, for determining an initial score for each hypothesis;
query construction means, coupled to the hypothesis generation means, for automatically constructing at least one information retrieval query using hypotheses of the set;
a corpus comprising documents, each document comprising word representations;
query execution means, coupled to the query construction means and to the corpus, for retrieving from the corpus documents responsive to said at least one query;
analysis means, coupled to the query execution means, for generating an analysis of the retrieved documents and evaluating the hypotheses of the set based on the initial scores and the analysis to determine a preferred hypothesis from among the hypotheses of the set, the preferred hypothesis representing a preferred transcription of the sequence of words of the input question; and
output means, coupled to the analysis means, for outputting the preferred hypothesis.

22. A speech processing apparatus comprising:
input means for transducing a spoken utterance into an audio signal;
means for converting the audio signal into a sequence of phones;
means for analyzing the sequence of phones to generate a plurality of hypotheses comprising sequences of words, the hypotheses representing possible alternative transcriptions of the spoken utterance to account for the error-prone nature of speech analysis;
means for determining an initial evaluation measurement for each hypothesis;
means for automatically constructing a query using the hypotheses of the plurality;
information retrieval means, coupled to a corpus of documents and to the constructing means, for retrieving documents of the corpus relevant to the constructed query;
means for automatically and explicitly ranking the hypotheses of the plurality according to confirming evidence found in the retrieved documents and further according to the initial evaluation measurements previously determined; and
means for outputting a subset of the hypotheses thus ranked, each hypothesis of the subset comprising a sequence of words representing a possible transcription of the spoken utterance.

Specification