Training speaker-dependent, phrase-based speech grammars using an unsupervised automated technique
Abstract
The present invention can include a method for tuning grammar option weights of a phrase-based, automatic speech recognition (ASR) grammar, where the grammar option weights affect which entries within the grammar are matched to spoken utterances. The tuning can occur in an unsupervised fashion, meaning no special training session or manual transcription of data from an ASR session is needed. The method can include the step of selecting a phrase-based grammar to use in a communication session with a user wherein different phrase-based grammars can be selected for different users. Feedback of ASR phrase processing operations can be recorded during the communication session. Each ASR phrase processing operation can match a spoken utterance against at least one entry within the selected phrase-based grammar. At least one of the grammar option weights can be automatically adjusted based upon the feedback to improve accuracy of the phrase-based grammar.
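The unsupervised tuning step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify what form the recorded feedback takes or how a weight is changed, so everything concrete here is an assumption made for illustration: the `Feedback` record, the `accepted` flag (standing in for some implicit signal, such as the caller proceeding without repeating the request), the additive `step`, and the clamping bounds.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Feedback:
    """One recorded ASR phrase processing operation (hypothetical record layout)."""
    phrase: str     # grammar entry the recognizer matched
    context: str    # speech processing context in which the match happened
    accepted: bool  # assumed implicit signal: True if the caller went along with the match


# Weights are keyed by (phrase, context); a missing key means a neutral weight of 1.0.
Weights = Dict[Tuple[str, str], float]


def adjust_weights(weights: Weights, session_log: List[Feedback],
                   step: float = 0.1, lo: float = 0.1, hi: float = 10.0) -> Weights:
    """Return a new weight table nudged by the session's feedback.

    No transcription or training session is involved: the only input is
    the log recorded during the ordinary communication session.
    """
    updated = dict(weights)  # leave the caller's table untouched
    for fb in session_log:
        key = (fb.phrase, fb.context)
        current = updated.get(key, 1.0)
        delta = step if fb.accepted else -step
        # Clamp so a run of bad sessions cannot drive a weight to zero or below.
        updated[key] = min(hi, max(lo, current + delta))
    return updated


if __name__ == "__main__":
    log = [
        Feedback("call home", "main_menu", accepted=True),
        Feedback("call home", "main_menu", accepted=True),
        Feedback("call Holmes", "main_menu", accepted=False),
    ]
    print(adjust_weights({("call home", "main_menu"): 1.0}, log))
```

An additive step with clamping is only one possible update rule; it is used here because it is easy to follow, not because the source prescribes it.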
19 Claims
1. A computer-implemented method for performing speech recognition, the method comprising:
operating at least one processor programmed to perform:
initiating a communication session with a speaker, the communication session requiring automatic speech recognition (ASR);
determining a characteristic of the speaker, the characteristic selected from a group consisting of a speaker identity and at least one voice characteristic for the speaker;
identifying a speaker-dependent, phrase-based grammar to use in the communication session with the speaker, wherein different speaker-dependent, phrase-based grammars are used for different users based on at least one speaker-dependent feature independent of a gender of the users;
recording feedback of ASR phrase processing operations during the communication session, wherein each ASR phrase processing operation seeks to match a spoken utterance against at least one entry within the identified speaker-dependent, phrase-based grammar, each entry of the at least one entry within said identified speaker-dependent, phrase-based grammar having a plurality of grammar option weights, each of the plurality of grammar option weights corresponding to a respective speech processing context, wherein the grammar option weights affect which entries are matched to the spoken utterances;
automatically adjusting the grammar option weights based upon recorded feedback data for the communication session to improve accuracy of the identified speaker-dependent, phrase-based grammar.
Dependent Claims (2, 3, 4, 5, 6, 7, 8)
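The claim language above gives each grammar entry a plurality of weights, one per speech processing context, and states that the weights affect which entry is matched. A minimal sketch of one way such weights could influence matching follows. It is an illustration only: the claim does not say how a weight is combined with a recognizer score, so the multiplication, the context names, the `acoustic_scores` input, and the `"default"` fallback grammar are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GrammarEntry:
    phrase: str
    # One weight per speech processing context (context names are hypothetical).
    weights: Dict[str, float] = field(default_factory=dict)

    def weight_for(self, context: str) -> float:
        # A context with no tuned weight is treated as neutral (assumption).
        return self.weights.get(context, 1.0)


def select_grammar(grammars: Dict[str, List[GrammarEntry]], speaker_id: str,
                   default_id: str = "default") -> List[GrammarEntry]:
    """Pick the grammar for this speaker, keyed on speaker identity."""
    return grammars.get(speaker_id, grammars[default_id])


def match_utterance(entries: List[GrammarEntry], acoustic_scores: Dict[str, float],
                    context: str) -> GrammarEntry:
    """Return the entry whose weighted score is highest in the given context.

    acoustic_scores maps phrase -> raw recognizer score and is assumed to
    come from an ASR engine that is outside the scope of this sketch.
    """
    return max(entries,
               key=lambda e: acoustic_scores.get(e.phrase, 0.0) * e.weight_for(context))


if __name__ == "__main__":
    grammars = {
        "alice": [GrammarEntry("call home", {"main_menu": 1.4}),
                  GrammarEntry("call Holmes", {"main_menu": 0.8})],
        "default": [GrammarEntry("call home"), GrammarEntry("call Holmes")],
    }
    scores = {"call home": 0.50, "call Holmes": 0.55}
    entries = select_grammar(grammars, "alice")
    print(match_utterance(entries, scores, "main_menu").phrase)  # call home
    print(match_utterance(entries, scores, "directory").phrase)  # call Holmes
```

In the usage example the raw scores slightly favor "call Holmes", but this speaker's weights for the `main_menu` context tip the match to "call home"; in a context with no tuned weights the raw scores decide.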
9. A machine-readable recording medium having stored thereon, a computer program having a plurality of code sections, said code sections being executable by a machine for causing the machine to perform the steps of:
initiating a communication session with a speaker, the communication session requiring automatic speech recognition (ASR);
determining a characteristic of the speaker, the characteristic selected from a group consisting of a speaker identity and at least one voice characteristic for the speaker;
identifying a speaker-dependent, phrase-based grammar to use in the communication session with the speaker, wherein different speaker-dependent, phrase-based grammars are used for different users based on at least one speaker-dependent feature independent of a gender of the users;
recording feedback of ASR phrase processing operations during the communication session, wherein each ASR phrase processing operation seeks to match a spoken utterance against at least one entry within the identified speaker-dependent, phrase-based grammar, each entry of the at least one entry within said identified speaker-dependent, phrase-based grammar having a plurality of grammar option weights, each of the plurality of grammar option weights corresponding to a respective speech processing context, wherein the grammar option weights affect which entries are matched to the spoken utterances; and
automatically adjusting the grammar option weights based upon the recorded feedback data for the communication session to improve accuracy of the identified speaker-dependent, phrase-based grammar.
Dependent Claims (10, 11, 12, 13, 14)
15. A computer-implemented system for performing speech recognition, the system comprising:
at least one computer programmed to:
initiate a communication session with a speaker, the communication session requiring automatic speech recognition (ASR); and
determine a characteristic of the speaker, the characteristic selected from a group consisting of a speaker identity and at least one voice characteristic for the speaker; and
an identification unit configured to identify a speaker-dependent phrase-based ASR grammar to use in the communication session, wherein different phrase-based grammars are used for different users based on at least one speaker-dependent feature independent of a gender of the users;
an information collection unit configured to record feedback in real-time of ASR phrase processing operations during the communication session, wherein each ASR phrase processing operation seeks to match a spoken utterance against at least one entry within the identified speaker dependent, phrase-based grammar, each entry of the at least one entry within said identified speaker dependent, phrase-based grammar having a plurality of grammar option weights, each of the plurality of grammar option weights corresponding to a respective speech processing context, wherein the grammar option weights affect which entries are matched to the spoken utterances; and
a logic unit configured to utilize said recorded feedback to automatically adjust the grammar option weights of the ASR grammar to improve accuracy of the identified speaker dependent, phrase-based grammar.
Dependent Claims (16, 17, 18, 19)
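The three units recited in the system claim above (identification, information collection, logic) can be pictured as three small cooperating components. The sketch below is one hypothetical arrangement, not the patented implementation: the class and method names are invented for illustration, and the logic unit uses a swapped-in, count-based re-estimate with add-one smoothing because the claim does not state an adjustment formula.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (phrase, context, accepted) for one ASR phrase processing operation (assumed layout).
Record = Tuple[str, str, bool]


class IdentificationUnit:
    """Maps a determined speaker identity to that speaker's grammar."""

    def __init__(self, grammar_ids: Dict[str, str], default_grammar: str):
        self._grammar_ids = grammar_ids
        self._default = default_grammar

    def identify(self, speaker_id: str) -> str:
        return self._grammar_ids.get(speaker_id, self._default)


class InformationCollectionUnit:
    """Records feedback as each operation completes during the session."""

    def __init__(self) -> None:
        self.records: List[Record] = []

    def record(self, phrase: str, context: str, accepted: bool) -> None:
        self.records.append((phrase, context, accepted))


class LogicUnit:
    """Turns recorded feedback into new per-context weights."""

    def adjust(self, records: List[Record]) -> Dict[Tuple[str, str], float]:
        counts = defaultdict(lambda: [0, 0])  # (phrase, context) -> [accepted, total]
        for phrase, context, accepted in records:
            tally = counts[(phrase, context)]
            tally[0] += int(accepted)
            tally[1] += 1
        # Add-one smoothing keeps sparsely observed entries away from 0 and 1.
        return {key: (ok + 1) / (total + 2) for key, (ok, total) in counts.items()}


if __name__ == "__main__":
    ident = IdentificationUnit({"alice": "grammar_alice"}, "grammar_generic")
    collector = InformationCollectionUnit()
    print(ident.identify("alice"))
    for accepted in (True, True, True, False):
        collector.record("call home", "main_menu", accepted)
    print(LogicUnit().adjust(collector.records))
```

Keeping collection separate from adjustment mirrors the claim's split between the information collection unit and the logic unit, and lets the adjustment run after the session without touching the recording path.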
Specification