System, method and computer program product for a distributed speech recognition tuning platform

US 7,383,187 B2
Filed: 01/24/2001
Issued: 06/03/2008
Est. Priority Date: 01/24/2001
Status: Expired due to Term

Abstract

A system, method and computer program product are provided for tuning a speech recognition process. Initially, a database of utterances is maintained. Thereafter, information associated with the utterances is collected utilizing a speech recognition process. Further, the utterances in the database are transmitted to a plurality of users utilizing a network. As such, transcriptions of the utterances in the database may be received from the users utilizing the network. In use, the speech recognition process may be tuned utilizing the information and the transcriptions.
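The flow in the abstract can be sketched roughly as follows. This is a minimal illustration only, assuming an in-memory SQLite store; the table layout and the function names (`create_store`, `add_utterance`, `pending_utterances`, `record_transcription`, `mismatches`) are hypothetical and do not come from the patent, which does not prescribe a storage engine or schema.

```python
import sqlite3


def create_store():
    """Create an in-memory utterance database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE utterances (
               id INTEGER PRIMARY KEY,
               audio_file TEXT NOT NULL,   -- recorded utterance
               hypothesis TEXT,            -- recognizer output
               confidence REAL,            -- recognizer confidence
               transcription TEXT,         -- human transcription
               transcriber TEXT            -- who transcribed it
           )"""
    )
    return conn


def add_utterance(conn, audio_file, hypothesis, confidence):
    """Store an utterance with the information the recognizer collected."""
    cur = conn.execute(
        "INSERT INTO utterances (audio_file, hypothesis, confidence) "
        "VALUES (?, ?, ?)",
        (audio_file, hypothesis, confidence),
    )
    return cur.lastrowid


def pending_utterances(conn):
    """Utterances that still need to be sent to a transcriber."""
    rows = conn.execute(
        "SELECT id, audio_file FROM utterances "
        "WHERE transcription IS NULL ORDER BY id"
    )
    return rows.fetchall()


def record_transcription(conn, utterance_id, text, transcriber):
    """Save a transcription received back from a user interface."""
    conn.execute(
        "UPDATE utterances SET transcription = ?, transcriber = ? WHERE id = ?",
        (text, transcriber, utterance_id),
    )


def mismatches(conn):
    """Transcribed utterances whose hypothesis differs from the transcription."""
    rows = conn.execute(
        "SELECT id, hypothesis, transcription FROM utterances "
        "WHERE transcription IS NOT NULL "
        "AND hypothesis IS NOT transcription ORDER BY id"
    )
    return rows.fetchall()
```

In a networked deployment, `pending_utterances` and `record_transcription` would presumably sit behind a web endpoint used by the transcribers; that layer is left out here to keep the sketch self-contained.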

19 Claims

1. A method for improving a speech recognition process, comprising:
- maintaining a database of utterances;
  
  collecting information associated with the utterances in the database utilizing a speech recognition process;
  
  transmitting the utterances in the database to at least one user interface utilizing a network;
  
  receiving transcriptions of the utterances in the database from the at least one user interface utilizing the network;
  
  wherein a human is capable of utilizing the information and the transcriptions to improve a speech recognition application;
  
  wherein the speech recognition process is improved by performing experiments based on the information;
  
  wherein the information is selected from the group consisting of a dialog state, a gender of a speaker, and a date the utterances are transcribed;
  
  wherein the at least one user interface includes a first icon for emitting a present utterance upon the selection thereof.
- - 2. The method as recited in claim 1, wherein the network includes the Internet.
  - 3. The method as recited in claim 2, wherein the transcriptions of the utterances are received from the at least one user interface using a network browser.
  - 4. The method as recited in claim 1, wherein the information includes a recognition result.
  - 5. The method as recited in claim 1, wherein the information is selected from the group consisting of a name of a grammar each utterance was recognized against, a name of an audio file on a disk, a directory path to the audio file, a size of the audio file, a session identifier, an index of each utterance, a recognition status, a recognition confidence associated with a recognition result, a recognition hypothesis, and an identification of a transcriber.
  - 6. The method as recited in claim 1, wherein the information includes a name of a grammar each utterance was recognized against, a name of an audio file on a disk, a directory path to the audio file, a size of the audio file, a session identifier, an index of each utterance, the dialog state, a recognition status, a recognition confidence associated with a recognition result, a recognition hypothesis, the gender of a speaker, an identification of a transcriber, and the date the utterances are transcribed.
  - 7. The method as recited in claim 1, wherein the utterances and the information are stored in the database, and the database is capable of being queried for results selected from the group consisting of a number of the utterances, a percentage of rejected utterances for a grammar, an average length of each utterance, a call volume in a predetermined range, a popularity of a grammar state, and a transcription management parameter.
  - 8. The method as recited in claim 1, wherein the utterances and the information are stored in the database, and the database queried for results includes a number of the utterances, a percentage of rejected utterances for a grammar, an average length of each utterance, a call volume in a predetermined range, a popularity of a grammar state, and a transcription management parameter.
  - 9. The method as recited in claim 1, wherein the speech recognition application is improved by performing experiments based on the information.
  - 10. The method as recited in claim 1, wherein the at least one user interface includes additional icons for emitting previous and next utterances upon the selection thereof.
  - 11. The method as recited in claim 1, wherein the at least one user interface includes a comment field for allowing a user to enter comments regarding a plurality of transcriptions.
  - 12. The method as recited in claim 1, wherein the at least one user interface includes a hint menu for allowing a user to choose from a plurality of strings identified by the speech recognition process.
  - 13. The method as recited in claim 12, wherein the hint menu allows the user to do a manual comparison between the utterances and results of the speech recognition process.
  - 14. The method as recited in claim 1, wherein the information includes the dialog state.
  - 15. The method as recited in claim 14, wherein the dialog state includes a context in a dialog flow.
  - 16. The method as recited in claim 1, wherein the information includes the gender of the speaker.
  - 17. The method as recited in claim 1, wherein the information includes the date the utterances are transcribed.
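The per-utterance information listed in claims 5 and 6, and some of the database queries listed in claims 7 and 8, might be modeled as in the sketch below. The field names, the `"accepted"`/`"rejected"` status values, and the `duration_s` field are assumptions made for illustration; the claims name the kinds of information but not their representation, and "length" of an utterance is interpreted here as audio duration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class UtteranceRecord:
    """One utterance plus the information named in claims 5 and 6."""
    grammar: str                 # grammar the utterance was recognized against
    audio_file: str              # name of the audio file on disk
    audio_dir: str               # directory path to the audio file
    audio_size_bytes: int        # size of the audio file
    session_id: str              # session identifier
    index: int                   # index of the utterance
    dialog_state: str            # context in the dialog flow
    status: str                  # assumed values: "accepted" or "rejected"
    confidence: float            # recognition confidence
    hypothesis: str              # recognition hypothesis
    speaker_gender: Optional[str] = None
    transcriber: Optional[str] = None
    transcribed_on: Optional[str] = None  # date transcribed, ISO format assumed
    duration_s: float = 0.0      # hypothetical field used for "average length"


def rejection_rate_by_grammar(records):
    """Percentage of rejected utterances for each grammar."""
    totals = defaultdict(int)
    rejected = defaultdict(int)
    for r in records:
        totals[r.grammar] += 1
        if r.status == "rejected":
            rejected[r.grammar] += 1
    return {g: 100.0 * rejected[g] / totals[g] for g in totals}


def average_length(records):
    """Average utterance duration in seconds (0.0 for an empty list)."""
    if not records:
        return 0.0
    return sum(r.duration_s for r in records) / len(records)
```

The number of utterances is simply `len(records)`; the remaining queries in claims 7 and 8 (call volume in a range, popularity of a grammar state) would follow the same grouping pattern.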

18. A computer program product embodied on a computer readable medium for improving a speech recognition process, comprising:
- (a) computer code for maintaining a database of utterances;
  
  (b) computer code for collecting information associated with the utterances in the database utilizing a speech recognition process;
  
  (c) computer code for transmitting the utterances in the database to at least one user interface utilizing a network; and
  
  (d) computer code for receiving transcriptions of the utterances in the database from the at least one user interface utilizing the network;
  
  (e) wherein a human is capable of utilizing the information and the transcriptions to improve a speech recognition application;
  
  wherein the speech recognition process is improved by performing experiments based on the information;
  
  wherein the information is selected from the group consisting of a dialog state, a gender of a speaker, and a date the utterances are transcribed;
  
  wherein the at least one user interface includes a string field for allowing a user to enter a string corresponding to each utterance.

19. A system including a tangible computer readable medium for improving a speech recognition process, comprising:
- (a) logic for maintaining a database of utterances;
  
  (b) logic for collecting information associated with the utterances in the database utilizing a speech recognition process;
  
  (c) logic for transmitting the utterances in the database to at least one user interface utilizing a network;
  
  (d) logic for receiving transcriptions of the utterances in the database from the at least one user interface utilizing the network;
  
  (e) wherein a human is capable of utilizing the information and the transcriptions to improve a speech recognition application;
  
  wherein the speech recognition process is improved by performing experiments based on the information;
  
  wherein the information is selected from the group consisting of a dialog state, a gender of a speaker, and a date the utterances are transcribed;
  
  wherein the at least one user interface includes a string field for allowing a user to enter a string corresponding to each utterance.
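The claims state that the process is improved "by performing experiments based on the information" but do not say what such an experiment looks like. One plausible example, offered purely as an assumption, is a sweep over the confidence threshold at which the recognizer rejects a result, scored against the human transcriptions. The function names and the exact-match scoring rule below are hypothetical.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


def sweep_thresholds(samples, thresholds):
    """Score each candidate rejection threshold against human transcriptions.

    samples: iterable of (confidence, hypothesis, transcription) tuples.
    A hypothesis is treated as correct when it matches the transcription
    after normalization (an assumed scoring rule, not one from the patent).
    """
    results = []
    for t in thresholds:
        correct_accepts = false_accepts = false_rejects = 0
        for confidence, hypothesis, transcription in samples:
            is_correct = normalize(hypothesis) == normalize(transcription)
            if confidence >= t:
                if is_correct:
                    correct_accepts += 1
                else:
                    false_accepts += 1   # wrong result let through
            elif is_correct:
                false_rejects += 1       # right result thrown away
        results.append(
            {
                "threshold": t,
                "correct_accepts": correct_accepts,
                "false_accepts": false_accepts,
                "false_rejects": false_rejects,
            }
        )
    return results
```

A human tuner could inspect the returned rows to pick a threshold that balances false accepts against false rejects for a given grammar or dialog state.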

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Bevocal Incorporated (Microsoft Corporation)
Inventors: Damiba, Bertrand A
Primary Examiner(s): MCFADDEN, SUSAN IRIS
Application Number: US09/769,880
Publication Number: US 20020138276A1
Time in Patent Office: 2,687 Days
Field of Search: 704/270, 704/260, 704/270.1, 704/275, 379/88.01, 707/6
US Class Current: 704/270
CPC Class Codes:
G10L 15/06 Creation of reference templ...
G10L 15/30 Distributed recognition, e....
