Speech recognition application or server using iterative recognition constraints

US 7,809,567 B2
Filed: 07/23/2004
Issued: 10/05/2010
Est. Priority Date: 07/23/2004
Status: Active Grant

Abstract

A speech recognition application including a recognition module configured to receive input utterances and an application module configured to select a recognition from the speech recognition module using output from a first iteration to select a recognition result for a second iteration. In one embodiment, the application module eliminates a previous rejected recognition result or results from the N-Best list for recognition. In another embodiment, the application module rescores N-Best entries based upon N-Best lists or information from another iteration. In another illustrated embodiment, the application module uses a limited grammar from a current N-Best list for subsequent recognition, for example for rerecognition using a recorded input from a previous iteration.

18 Claims

1. A speech recognition system comprising:
- a computer storage media and a processing unit to implement instructions stored on the non-transitory computer storage media;
  
  a contacts list or directory including contacts having a plurality of contact attributes stored on the non-transitory computer storage media;
  
  a recognition module including instructions stored on the non-transitory computer storage media and executable by the processing unit to receive a first input utterance corresponding to a first contact attribute and provide a first N-Best list of one or more data entries in a first iteration for the first input utterance using a first grammar comprising grammars associated with contact records of the contacts list or directory;
  
  an application module including instructions stored on the non-transitory computer storage media and executable by the processing unit to utilize the contact records associated with the first N-Best list to provide a second subset grammar or grammars for a second contact attribute limited to only the contacts corresponding to the N-Best entries of the first N-Best list and the recognition module is configured to receive a second input utterance corresponding to the second contact attribute and process the second input utterance using the second subset grammar or grammars to recognize the second contact attribute to provide a second N-Best list; and
  
  an application component including instructions stored on the non-transitory computer storage media and executable by the processing unit to select or order the one or more N-Best entries from the first or second N-Best lists using information associated with the first or second iterations.
- Dependent claims (2, 3, 4, 5, 6, 7)
  - 2. The speech recognition system of claim 1 wherein the application component retrieves misrecognitions from the first N-Best list of the first iteration and eliminates the misrecognitions from the second N-Best list of the second iteration.
  - 3. The speech recognition system of claim 1 wherein the first and second contact attributes are different attributes.
  - 4. The speech recognition system of claim 1 wherein the application component rescores one of the first or second N-Best lists based upon the other of the first and second N-Best lists.
  - 5. The speech recognition system of claim 1 wherein the second input utterance comprises a rerecognition of a previous input utterance.
  - 6. The speech recognition system of claim 1 wherein the first input utterance is a voice input of a name spelling and the second input utterance is a voice input of a full name.
  - 7. The speech recognition system of claim 1 wherein the first and second contact attributes correspond to a name and a spelling of the name.
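
As a rough illustration of the subset-grammar element of claim 1, the sketch below narrows the grammar for a second attribute to the contacts that appeared in a first N-Best list. The `Contact` class, the `build_subset_grammar` helper, and the sample directory are hypothetical names invented for this example. The claims do not prescribe a data layout, and a real grammar would be compiled for a speech engine rather than held as a Python list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    """Hypothetical contact record with two attributes (assumed layout)."""
    contact_id: int
    full_name: str
    spelling: str


def build_subset_grammar(contacts, n_best_ids, attribute):
    """Return (phrase, contact_id) pairs for `attribute`, limited to the
    contacts whose ids appear in the first N-Best list."""
    allowed = set(n_best_ids)
    return [
        (getattr(c, attribute), c.contact_id)
        for c in contacts
        if c.contact_id in allowed
    ]


contacts = [
    Contact(1, "John Smith", "S M I T H"),
    Contact(2, "Jon Smyth", "S M Y T H"),
    Contact(3, "Jane Doe", "D O E"),
]

# Assume the first iteration (a spelled name) returned contacts 2 and 1.
first_n_best_ids = [2, 1]

# The second grammar covers full names for those two contacts only.
second_grammar = build_subset_grammar(contacts, first_n_best_ids, "full_name")
```

Here `second_grammar` holds only the full names of contacts 1 and 2, so the second utterance is matched against two candidates instead of the whole directory.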

8. A method for retrieving a contact from a contact list or directory comprising the steps of:
- receiving a first input utterance from an audio input device coupled to a computing device;
  
  recognizing the first input utterance using a processing unit of the computing device and outputting a first N-best list corresponding to the first input utterance using a first grammar or grammars comprising grammars associated with contacts in the contact list or directory during one recognition iteration;
  
  creating a second grammar or grammars for a second input utterance using the processing unit of the computing device and a subset of grammars limited to a subset of the contacts corresponding to one or more entries of the first N-Best list;
  
  receiving the second input utterance from the audio input device;
  
  recognizing and outputting a second N-Best list corresponding to the second input utterance using the processing unit of the computing device and the second grammar or grammars during a different recognition iteration; and
  
  selecting a recognition result for the different iteration using the processing unit of the computing device and the information for the one recognition iteration and the second N-Best list from the different recognition iteration.
- Dependent claims (9, 10, 11, 12, 13, 14, 15)
  - 9. The method of claim 8 wherein the step of selecting the recognition result for the different iteration using the information for the one iteration and the second N-Best list for the different iteration comprises the steps of:
    - receiving feedback that a recognition result or results identified during the one recognition iteration is incorrect;
      
      eliminating the incorrect recognition result or results identified during the one recognition iteration from the second N-Best list for the different recognition iteration; and
      
      selecting the recognition result from the second N-Best list for the different recognition iteration having the incorrect recognition result or results deleted.
  - 10. The method of claim 8 comprising the steps of:
    - rescoring one of the first or second N-Best lists corresponding to the one recognition iteration or the different recognition iteration based upon the other of the first or second N-Best lists; and
      
      selecting the recognition result based upon the rescored N-Best list.
  - 11. The method of claim 8 wherein the first input utterance for the one recognition iteration corresponds to a first contact attribute and the second input utterance for the different recognition iteration corresponds to a second different contact attribute.
  - 12. The method of claim 11 wherein the first and second contact attributes correspond to a name and a spelling of the name.
  - 13. The method of claim 8 wherein the first N-Best list for the one recognition iteration corresponds to a name spelling and the second N-Best list for the different recognition iteration is generated based upon the second grammar or grammars corresponding to the contact records for the N-Best list entries of the first N-Best list for the name spelling.
  - 14. The method of claim 8 wherein the different recognition iteration is a rerecognition of a recording of a previous full name voice input.
  - 15. The method of claim 8 and comprising:
    - recording the first input utterance; and
      
      rerecognizing the recorded first input utterance following recognition of the first and second input utterances during the one recognition iteration and the different recognition iteration.
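
The selection steps in claims 9 and 10 can be sketched as two small list operations: dropping results the user has already rejected, and rescoring one N-Best list using the other. This is a minimal sketch under stated assumptions. The function names, the sample scores, and the additive `weight` formula are all hypothetical; the claims say only that one list is rescored "based upon" the other and do not give a formula.

```python
def eliminate_rejected(n_best, rejected):
    """Claim 9 style: remove entries already rejected in an earlier
    iteration. `n_best` is a list of (entry, score) pairs, best first."""
    rejected = set(rejected)
    return [(entry, score) for entry, score in n_best if entry not in rejected]


def rescore(n_best, other_n_best, weight=0.5):
    """Claim 10 style: boost each entry by a weighted share of its score
    in the other list. The additive weighting is an assumption."""
    other = dict(other_n_best)
    rescored = [
        (entry, score + weight * other.get(entry, 0.0))
        for entry, score in n_best
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Made-up scores for illustration only.
first_n_best = [("John Smith", 0.6), ("Jon Smyth", 0.3)]
second_n_best = [("John Smith", 0.5), ("Jon Smyth", 0.45), ("Joan Smits", 0.05)]

# The user said "no" to "John Smith" during the first iteration.
remaining = eliminate_rejected(second_n_best, ["John Smith"])
best_after_elimination = remaining[0][0]

# Alternatively, combine evidence from both iterations.
rescored = rescore(second_n_best, first_n_best)
```

With elimination, the rejected top entry can no longer be offered again, so the next-best candidate is selected. With rescoring, entries supported by both iterations rise relative to entries that appear in only one list.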

16. A method comprising:
- receiving a first input utterance from an audio input device coupled to a computing device to retrieve a contact from a contact list or directory;
  
  processing the first input utterance using a processing unit of the computing device and using a first grammar comprising grammars associated with contacts of the contact list or directory for a first contact attribute to recognize the first input utterance during a first recognition iteration;
  
  generating a first N-best list for the first contact attribute;
  
  storing information from the first recognition iteration on a non-transitory data storage media;
  
  processing data for one or more entries of the first N-best list associated with the first input utterance using the processing unit of the computing device and generating a second grammar comprising a subset grammar or grammars limited to contact records associated with the one or more entries of the first N-best list;
  
  processing a second input utterance from the audio input device using the processing unit and the second grammar for a second contact attribute to recognize the second input utterance during a second recognition iteration wherein the first and second attributes correspond to different contact attributes of the contact records of the contact list or directory;
  
  generating a second N-Best list corresponding to the second input utterance for the second contact attribute; and
  
  selecting a recognition result for the second iteration using the second N-Best list to provide the contact from the contact list or directory corresponding to the second input utterances.
- Dependent claims (17, 18)
  - 17. The method of claim 16 wherein the first and second input utterances correspond to a contact name and a spelling of the contact name.
  - 18. The method of claim 16 wherein the first and second contact attributes are different attributes.
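
The two-iteration flow of claim 16 can be walked through end to end with a toy stand-in for the recognizer. In the sketch below, text similarity from Python's `difflib.SequenceMatcher` replaces acoustic recognition purely for illustration. The `recognize` and `grammar_for` helpers, the `directory` contents, and the typed "utterances" are all assumptions and are not taken from the patent.

```python
from difflib import SequenceMatcher

# Hypothetical directory: contact id -> attributes.
directory = {
    1: {"name": "John Smith", "spelling": "s m i t h"},
    2: {"name": "Jon Smyth", "spelling": "s m y t h"},
    3: {"name": "Jane Doe", "spelling": "d o e"},
    4: {"name": "Ann Lee", "spelling": "l e e"},
}


def grammar_for(attribute, contact_ids):
    """Build (phrase, contact_id) pairs for one attribute of the given contacts."""
    return [(directory[cid][attribute], cid) for cid in contact_ids]


def recognize(utterance, grammar, n=2):
    """Toy recognizer: rank grammar phrases by text similarity to the
    utterance and return the top n as (phrase, contact_id, score)."""
    scored = [
        (phrase, cid, SequenceMatcher(None, utterance.lower(), phrase.lower()).ratio())
        for phrase, cid in grammar
    ]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:n]


# First iteration: spelled name, grammar covers the whole directory.
first_n_best = recognize("s m i t h", grammar_for("spelling", directory))

# Second grammar: full names of the contacts in the first N-Best list only.
second_grammar = grammar_for("name", [cid for _, cid, _ in first_n_best])

# Second iteration: full name, recognized against the subset grammar.
second_n_best = recognize("jon smyth", second_grammar)

# Select the contact from the top entry of the second N-Best list.
selected_id = second_n_best[0][1]
```

The first pass keeps two similar spellings as candidates, and the second pass resolves between them using a different attribute of the same contact records.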


Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Ollason, David G.; Bhatia, Siddharth; Ju, Yun-Cheng
Primary Examiner(s): McFadden, Susan
Application Number: US10/897,817
Publication Number: US 20060020464A1
Time in Patent Office: 2,265 Days
Field of Search: 704/257
US Class Current: 704/257
CPC Class Codes:
G10L 15/08   Speech classification or se...
G10L 15/183   using context dependencies,...
G10L 2015/228   of application context
