STRUCTURED MODELS OF REPITITION FOR SPEECH RECOGNITION

US 20100076765A1
Filed: 09/19/2008
Published: 03/25/2010
Est. Priority Date: 09/19/2008
Status: Active Grant

Abstract

Described is a technology by which a structured model of repetition is used to determine the words spoken by a user, and/or a corresponding database entry, based in part on a prior utterance. For a repeated utterance, a joint probability analysis is performed on (at least some of) the corresponding word sequences (as recognized by one or more recognizers) and associated acoustic data. For example, a generative probabilistic model or a maximum entropy model may be used in the analysis. The second utterance may be a repetition of the first utterance using the exact words, or another structural transformation thereof relative to the first utterance, such as an extension that adds one or more words, a truncation that removes one or more words, or a whole or partial spelling of one or more words.
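The joint analysis described in the abstract can be illustrated with a minimal sketch. It assumes the simplest case, an exact repetition, and assumes each recognizer returns an n-best list that maps a word sequence to a posterior-like score. The function name `joint_rescore`, the `floor` value for hypotheses missing from one list, and the example scores are all hypothetical and are not taken from the patent.

```python
import math

def joint_rescore(nbest_first, nbest_second, floor=1e-6):
    """Combine two n-best lists under an exact-repetition assumption.

    Each argument maps a hypothesized word sequence to a score in (0, 1].
    A hypothesis absent from one list is given the small score `floor`
    (an assumption of this sketch) so that it is penalized, not excluded.
    """
    candidates = set(nbest_first) | set(nbest_second)
    joint = {
        hyp: math.log(nbest_first.get(hyp, floor))
        + math.log(nbest_second.get(hyp, floor))
        for hyp in candidates
    }
    best = max(joint, key=joint.get)
    return best, joint

# Hypothetical recognition results for two adjacent utterances.
first = {"pete's a hut": 0.45, "pizza hut": 0.40, "piece of hut": 0.15}
second = {"pizza hut": 0.50, "pizza hat": 0.35, "piece of hut": 0.15}

best, scores = joint_rescore(first, second)
print(best)
```

In this made-up example the top hypothesis for the first utterance alone is wrong, but the hypothesis supported by both utterances wins once the scores are combined. A fuller model would also weigh non-exact relations such as extension, truncation, and spelling.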

20 Claims

1. In a computing environment, a method, comprising, receiving two or more adjacent utterances, in which a later utterance is related to an earlier utterance by repetition, and using a structured model of repetition to determine an intention associated with at least one of the utterances.
  - 2. The method of claim 1 wherein using the structured model of repetition to determine the intention comprises attempting to determine exact words spoken by a user, or selecting at least one entry from among a fixed set of database entries, or both attempting to determine exact words spoken by a user and selecting at least one entry from among a fixed set of database entries.
  - 3. The method of claim 1 further comprising, recognizing the utterances as separate sets of word sequences, and wherein using the structured model of repetition comprises performing a joint probability analysis on the word sequences and associated acoustic data.
  - 4. The method of claim 3 further comprising, using word sequences common to the sets of word sequences to select only a subset of the word sequences for the joint probability analysis.
  - 5. The method of claim 3 further comprising, using phonetic similarity to select only a subset of the word sequences for the joint probability analysis.
  - 6. The method of claim 5 further comprising, using a transduction process that takes phonemes as input and produces words as output to determine the subset.
  - 7. The method of claim 6 wherein the transduction process uses a language model on the output, in which the language model is built from a set of listings, transcribed utterances, or decoded utterances, or any combination of a set of listings, transcribed utterances, or decoded utterances.
  - 8. The method of claim 1 further comprising, using a statistical user model that makes one or more inferences about at least one guess corresponding to a later utterance that was made regarding a misrecognized previous utterance.
  - 9. The method of claim 1 wherein recognizing the second utterance comprises using at least one speech recognizer that is different from a speech recognizer used in recognizing the first utterance.
  - 10. The method of claim 1 wherein using the structured model comprises determining that the second utterance is an attempt at exact repetition of the first utterance.
  - 11. The method of claim 1 wherein using the structured model comprises determining that the second utterance is an extension of the first utterance, including that the second utterance adds at least one word before the first utterance, or adds at least one word after the first utterance, or both adds at least one word before the first utterance and adds at least one word after the first utterance.
  - 12. The method of claim 1 wherein using the structured model comprises determining that the second utterance is a truncation of the first utterance, including that the second utterance has removed at least one word before the first utterance, or removed at least one word after the first utterance, or both removed at least one word before the first utterance and removed at least one word after the first utterance.
  - 13. The method of claim 1 wherein using the structured model comprises determining that the second utterance spells at least part of one word that was spoken in the first utterance.
  - 14. The method of claim 1 wherein structured model of repetition comprises a set of one or more features used in a generative probabilistic model, or a set of one or more features used in a maximum entropy model.
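The structural relations named in claims 10 through 13 (exact repetition, extension, truncation, spelling) can be sketched as a simple classifier over two already-recognized word sequences. This is only an illustration under strong assumptions: it works on plain text and makes hard decisions, whereas the claims describe a probabilistic model over recognition hypotheses. The function names and labels below are hypothetical.

```python
def _contains(longer, shorter):
    """True if `shorter` occurs as a contiguous run strictly inside `longer`."""
    n = len(shorter)
    if n == 0 or n >= len(longer):
        return False
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))

def _is_spelling(first_words, second_words):
    """True if the second utterance is single letters spelling the start of a word."""
    if len(second_words) < 2 or any(len(tok) != 1 for tok in second_words):
        return False
    spelled = "".join(second_words)
    return any(word.startswith(spelled) for word in first_words)

def classify_repetition(first, second):
    """Label how `second` relates structurally to `first` (hypothetical labels)."""
    a, b = first.lower().split(), second.lower().split()
    if a == b:
        return "exact"
    if _contains(b, a):   # second adds words before and/or after the first
        return "extension"
    if _contains(a, b):   # second drops words before and/or after
        return "truncation"
    if _is_spelling(a, b):
        return "spelling"
    return "other"

print(classify_repetition("pizza hut", "pizza hut seattle"))
```

The partial-spelling case in claim 13 is covered by the prefix test in `_is_spelling`, so a second utterance of "p i z" would also be labeled as spelling for a first utterance containing "pizza".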

15. In a computing environment, a system comprising, a repeat analysis mechanism that processes speech recognition results differently based on whether input speech is an initial input, or is repeated input speech related to prior input speech that includes the initial input, and, when the input speech is repeated input speech, the repeat analysis mechanism configured to combine recognition data corresponding to the repeated input speech with recognition data corresponding to the prior input speech to provide a recognition result for that repeated input speech.
  - 16. The system of claim 15 wherein the repeat analysis mechanism is coupled to an automatic speech recognizer that provides recognition data for the initial input, and a different automatic speech recognizer that provides recognition data corresponding to the repeated input speech.
  - 17. The system of claim 15 wherein the repeat analysis mechanism dynamically limits the recognition data corresponding to the repeated input speech that is combined with the recognition data corresponding to the prior input speech.
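As a rough illustration of claims 15 and 17, the sketch below keeps the recognition data from prior input in a session, handles initial and repeated input differently, and limits how much of the repeated input's data is combined. The class name `RepeatAnalyzer`, the `top_k` limit, the `floor` score, and the caller-supplied `is_repeat` flag are assumptions of this sketch; the claims do not specify how repetition is detected or how the data is limited.

```python
import math

class RepeatAnalyzer:
    """Hypothetical repeat analysis mechanism working on n-best score dicts."""

    def __init__(self, top_k=3, floor=1e-6):
        self.top_k = top_k    # how many repeated-input hypotheses to keep
        self.floor = floor    # score for a hypothesis missing from one list
        self.prior = None     # combined recognition data from prior input

    def process(self, nbest, is_repeat):
        # Initial input: use the recognizer's own top result and remember it.
        if not is_repeat or self.prior is None:
            self.prior = dict(nbest)
            return max(nbest, key=nbest.get)

        # Repeated input: limit its data, then combine it with the prior data.
        ranked = sorted(nbest.items(), key=lambda kv: kv[1], reverse=True)
        limited = dict(ranked[: self.top_k])
        combined = {
            hyp: math.log(self.prior.get(hyp, self.floor))
            + math.log(limited.get(hyp, self.floor))
            for hyp in set(self.prior) | set(limited)
        }

        # Renormalize so the stored data can serve as the prior for a later repeat.
        total = sum(math.exp(v) for v in combined.values())
        self.prior = {hyp: math.exp(v) / total for hyp, v in combined.items()}
        return max(combined, key=combined.get)

analyzer = RepeatAnalyzer(top_k=2)
r1 = analyzer.process(
    {"pete's a hut": 0.45, "pizza hut": 0.40, "piece of hut": 0.15}, is_repeat=False
)
r2 = analyzer.process(
    {"pizza hut": 0.50, "pizza hat": 0.35, "piece of hut": 0.15}, is_repeat=True
)
print(r1, "->", r2)
```

Storing the renormalized combination lets a third or later repetition build on everything said so far, which loosely matches the claim's reference to prior input speech that includes the initial input.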

18. One or more computer-readable media having computer-executable instructions, which when executed perform steps, comprising, receiving an utterance, determining if the utterance is a repeated utterance relative to a prior utterance, and if so, using word sequence data corresponding to recognition of the prior utterance in combination with word sequence data corresponding to recognition of the repeated utterance to select a recognition result for the repeated utterance.
  - 19. The one or more computer-readable media of claim 18 wherein selecting a recognition result comprises selecting at least one listing from a finite set of listings, or selecting at least one most probable set of one or more words corresponding to the second utterance.
  - 20. The one or more computer-readable media of claim 18 wherein the utterance is a repeated utterance relative to a prior utterance that occurs in a session having two or more utterances, and having computer-executable instructions comprising, using data from a separate session as part of selecting the recognition result for the repeated utterance.

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Acero, Alejandro; Horvitz, Eric J.; Li, Xiao; Zweig, Geoffrey G.; Bohus, Dan
Granted Patent: US 8,965,765 B2
US Class Current: 704/255
CPC Class Codes: G10L 15/1822 Parsing for meaning underst...
