PRONUNCIATION LEARNING FROM USER CORRECTION
Abstract
Systems and methods are described for adding entries to a custom lexicon used by a speech recognition engine of a speech interface in response to user interaction with the speech interface. In one embodiment, a speech signal is obtained when the user speaks a name of a particular item to be selected from among a finite set of items. If a phonetic description of the speech signal is not recognized by the speech recognition engine, then the user is presented with a means for selecting the particular item from among the finite set of items by providing input in a manner that does not include speaking the name of the item. After the user has selected the particular item via the means for selecting, the phonetic description of the speech signal is stored in association with a text description of the particular item in the custom lexicon.
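The flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the phonetic description is modeled as a plain string of phoneme symbols, recognition is reduced to an exact dictionary lookup, and the names `CustomLexicon`, `select_item`, and `fallback_chooser` are hypothetical.

```python
from typing import Callable, Dict, List, Optional


class CustomLexicon:
    """Hypothetical store mapping phonetic descriptions to item text."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    def lookup(self, phonetic: str) -> Optional[str]:
        # Assumption: recognition is an exact match on the phoneme string.
        return self._entries.get(phonetic)

    def add(self, phonetic: str, text: str) -> None:
        self._entries[phonetic] = text


def select_item(
    phonetic: str,
    items: List[str],
    lexicon: CustomLexicon,
    fallback_chooser: Callable[[List[str]], str],
) -> str:
    """Return the item the user meant, learning the pronunciation if needed."""
    recognized = lexicon.lookup(phonetic)
    if recognized is not None and recognized in items:
        return recognized

    # Not recognized: offer a selection means that does not involve
    # speaking the name (e.g. a touch list), modeled here as a callback.
    chosen = fallback_chooser(items)
    if chosen not in items:
        raise ValueError("selection must come from the finite set of items")

    # Store the phonetic description in association with the item's text.
    lexicon.add(phonetic, chosen)
    return chosen
```

In this sketch, the first call with an unknown pronunciation falls back to `fallback_chooser` and records the pairing; a later call with the same phoneme string is resolved from the lexicon without invoking the fallback.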
20 Claims
1. A method for updating a custom lexicon used by a speech recognition engine that comprises part of a speech interface, comprising:
- obtaining a speech signal by the speech interface when a user speaks a name of a particular item for the purpose of selecting the particular item from among a finite set of items;
- presenting the user with a means for selecting the particular item from among the finite set of items by providing input in a manner that does not include speaking the name of the item in response to determining that a phonetic description of the speech signal is not recognized by the speech recognition engine; and
- after the user has selected the particular item via the means for selecting, storing the phonetic description of the speech signal in association with a text description of the particular item in the custom lexicon.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system, comprising:
- a speech recognition engine that is configured to generate a phonetic description of a speech signal obtained when a user speaks a name of a particular item into a speech interface for the purpose of selecting the particular item from among a finite set of items and to match the phonetic description of the speech signal to one of a plurality of phonetic descriptions included in a system lexicon or a custom lexicon;
- a dialog manager that is configured to present the user with a means for selecting the particular item from among the finite set of items by providing input in a manner that does not include speaking the name of the item in response to determining that the speech recognition engine has failed to match the phonetic description of the speech signal to any of the phonetic descriptions included in the system lexicon or the custom lexicon; and
- a learning engine that is configured to store the phonetic description of the speech signal in association with a text description of the particular item selected by the user via the means for selecting in the custom lexicon.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18, 19
20. A computer program product comprising a non-transitory computer-readable medium having computer program logic recorded thereon for enabling a processing unit to update a custom lexicon dictionary used by a speech recognition engine that comprises part of a speech interface to an application, the computer program logic comprising:
- first means for enabling the processing unit to obtain a speech signal when a user speaks a name of a particular item into the speech interface for the purpose of selecting the particular item from among a finite set of items;
- second means for enabling the processing unit to obtain a text description of the particular item from the speech recognition engine based upon recognition of a phonetic description of the speech signal by the speech recognition engine; and
- third means for enabling the processing unit to store the phonetic description of the speech signal in association with the text description of the particular item in a custom lexicon in response to determining that a measure of confidence with which the phonetic description of the speech signal has been recognized by the speech recognition engine is below a predefined threshold.
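Claim 20 differs from claim 1 in that the item is recognized, but with low confidence, and that alone triggers the lexicon update. A minimal sketch of that condition follows. The `RecognitionResult` type, the `learn_if_low_confidence` function, the use of a plain dictionary as the custom lexicon, and the threshold value of 0.6 are all illustrative assumptions; the claim specifies only that the threshold is predefined.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RecognitionResult:
    """Hypothetical output of the speech recognition engine."""
    phonetic: str      # phonetic description of the speech signal
    text: str          # text description of the recognized item
    confidence: float  # assumed to lie in [0.0, 1.0]


def learn_if_low_confidence(
    result: RecognitionResult,
    custom_lexicon: Dict[str, str],
    threshold: float = 0.6,  # assumed value, not taken from the claim
) -> bool:
    """Store the pronunciation when recognition confidence is below threshold.

    Returns True if an entry was written to the custom lexicon.
    """
    if result.confidence < threshold:
        custom_lexicon[result.phonetic] = result.text
        return True
    return False
```

Under these assumptions, a result recognized with confidence 0.4 would be written to the lexicon, while one recognized with confidence 0.9 would leave it unchanged.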
Specification