Structured models of repetition for speech recognition
Abstract
Described is a technology by which a structured model of repetition is used to determine the words spoken by a user, and/or a corresponding database entry, based in part on a prior utterance. For a repeated utterance, a joint probability analysis is performed on at least some of the corresponding word sequences (as recognized by one or more recognizers) and associated acoustic data. For example, a generative probabilistic model or a maximum entropy model may be used in the analysis. The second utterance may be a repetition of the first utterance using the exact words, or another structural transformation thereof relative to the first utterance, such as an extension that adds one or more words, a truncation that removes one or more words, or a whole or partial spelling of one or more words.
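The structural transformations named in the abstract (exact repetition, extension, truncation, whole or partial spelling) can be illustrated with a small token-level classifier. This is only a sketch under simplifying assumptions: it works on already-recognized text rather than acoustic data, and the names `contains_run` and `classify_repetition`, as well as the two-letter minimum for spellings, are hypothetical choices made for illustration, not part of the described technology.

```python
def contains_run(short, long):
    """Return True if the token list `short` occurs contiguously inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def classify_repetition(first, second):
    """Label how `second` relates structurally to the earlier utterance `first`.

    Illustrative heuristic only; a real system would score these relations
    probabilistically rather than with hard string rules.
    """
    a, b = first.lower().split(), second.lower().split()
    if not a or not b:
        return "unrelated"
    if a == b:
        return "exact"
    # Extension: the earlier words reappear with one or more words added.
    if len(b) > len(a) and contains_run(a, b):
        return "extension"
    # Truncation: the later utterance keeps only part of the earlier words.
    if len(b) < len(a) and contains_run(b, a):
        return "truncation"
    # Spelling: single-letter tokens spell all or the start of an earlier word.
    # Assumption: require at least two letters to avoid matching a stray "a" or "i".
    letters = "".join(t for t in b if len(t) == 1 and t.isalpha())
    if len(letters) >= 2 and any(w.startswith(letters) for w in a):
        return "spelling"
    return "unrelated"
```

For instance, under these rules `classify_repetition("police", "seattle police department")` is labeled an extension, and `classify_repetition("bellevue", "b e l l")` is labeled a (partial) spelling.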
19 Claims
- 1. In a computing environment, a method, comprising, receiving two or more adjacent utterances, in which a later utterance is structurally related to an earlier utterance by repetition, using a structured model of repetition to determine an intention associated with at least one of the utterances, recognizing the utterances as separate sets of word sequences, and wherein using the structured model of repetition comprises performing a joint probability analysis on the word sequences and associated acoustic data, and using word sequences common to the sets of word sequences to select only a subset of the word sequences for the joint probability analysis.
- 14. In a computing environment, a system comprising, at least one processor, a memory communicatively coupled to the at least one processor and including components comprising, a repeat analysis mechanism that processes speech recognition results differently based on whether input speech is an initial input, or is repeated input speech that includes a structural transformation of the initial input, and, when the input speech is the repeated input speech, the repeat analysis mechanism configured to combine recognition data corresponding to the repeated input speech with recognition data corresponding to the prior input speech to provide a recognition result for that repeated input speech, the recognition result based upon one or more structural features corresponding to the repeated input speech in relation to the prior input speech, wherein the repeat analysis mechanism dynamically limits the recognition data corresponding to the repeated input speech that is combined with the recognition data corresponding to the prior input speech.
- 17. One or more computer-readable storage media having computer-executable instructions, which when executed perform steps, comprising, receiving an utterance, determining if the utterance is a structural transformation comprising at least one of an extension, a truncation, or at least a partial spelling of a prior utterance from a same speaker as the utterance, and if so, using word sequence data corresponding to recognition of the prior utterance in combination with word sequence data corresponding to recognition of the utterance to select a recognition result for the utterance comprising performing a joint probability analysis on the word sequence data corresponding to recognition of the utterance and associated acoustic data and using the word sequence data corresponding to recognition of the prior utterance and the word sequence data corresponding to recognition of the utterance to select a subset of the word sequences for the joint probability analysis and wherein at least one speech recognizer that is different from a speech recognizer used in recognizing the prior utterance.
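The independent claims above share one idea: hypotheses from the earlier and later utterance are scored jointly, and only a subset of hypothesis pairs, chosen by what the word sequences have in common, enters that analysis. The following is a minimal sketch of that idea, not the claimed method itself. It assumes each recognizer returns an n-best list of `(text, log_score)` pairs, treats the joint score as a plain sum of log scores (a crude stand-in for a generative or maximum entropy model), and uses hypothetical names (`joint_rescore`, `structurally_consistent`).

```python
from itertools import product


def contains_run(short, long):
    """Return True if the token list `short` occurs contiguously inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def structurally_consistent(h1, h2):
    """Assumed filter: one hypothesis contains the other's word sequence.

    This covers exact repetition, extension, and truncation; spelling is
    left out to keep the sketch short.
    """
    a, b = h1.split(), h2.split()
    return contains_run(a, b) or contains_run(b, a)


def joint_rescore(nbest_first, nbest_second):
    """Pick the best (first, second) hypothesis pair under a joint log score.

    Each n-best list holds (text, log_score) tuples. Only pairs that pass the
    structural filter are scored jointly; if none pass, fall back to the
    independent top hypothesis of each list.
    """
    candidates = [
        (h1, h2, s1 + s2)
        for (h1, s1), (h2, s2) in product(nbest_first, nbest_second)
        if structurally_consistent(h1, h2)
    ]
    if not candidates:
        h1, s1 = max(nbest_first, key=lambda item: item[1])
        h2, s2 = max(nbest_second, key=lambda item: item[1])
        return h1, h2, s1 + s2
    return max(candidates, key=lambda item: item[2])


if __name__ == "__main__":
    first = [("police station", -1.0), ("please station", -1.6)]
    second = [("bellevue please station", -2.0), ("bellevue police station", -2.2)]
    print(joint_rescore(first, second))
```

In this made-up example the second utterance's independent top hypothesis is "bellevue please station", but the joint analysis prefers "bellevue police station" because it pairs consistently with the stronger hypothesis for the first utterance.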
Specification