System and method of performing user-specific automatic speech recognition
Abstract
Speech recognition models are dynamically re-configurable based on user information, application information, background information such as background noise, and transducer information such as transducer response characteristics, to provide users with alternate input modes to keyboard text entry. Word recognition lattices are generated for each data field of an application and dynamically concatenated into a single word recognition lattice. A language model is applied to the concatenated word recognition lattice to determine the relationships between the word recognition lattices, and the process is repeated until the generated word recognition lattices are acceptable or differ from a predetermined value only by a threshold amount. These techniques of dynamically re-configurable speech recognition provide for deployment of speech recognition on small devices such as mobile phones and personal digital assistants, as well as in environments such as an office, home or vehicle, while maintaining the accuracy of the speech recognition.
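The lattice concatenation and language-model step described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the flat per-field lattices, the pairwise score table, the penalty value, and every function name below are assumptions made for illustration, and the repeat-until-acceptable loop is left out.

```python
from itertools import product

# Hypothetical per-field lattices: candidate word -> acoustic log-score.
# Real word lattices are graphs; flat dictionaries keep the sketch short.
FIELD_LATTICES = {
    "city": {"austin": -1.0, "boston": -1.2},
    "state": {"texas": -1.5, "massachusetts": -1.1},
}

# Hypothetical language model relating adjacent fields: log-score per pair.
PAIR_LOG_SCORES = {
    ("austin", "texas"): -0.1,
    ("boston", "massachusetts"): -0.2,
}
UNSEEN_PAIR_PENALTY = -5.0  # assumed score for pairs missing from the table


def concatenate(lattices):
    """Join the per-field lattices in field order.

    Each path through the joined lattice picks one word per field and
    carries the sum of the acoustic scores of the chosen words.
    """
    fields = list(lattices)
    for words in product(*(lattices[f] for f in fields)):
        acoustic = sum(lattices[f][w] for f, w in zip(fields, words))
        yield words, acoustic


def language_model_score(words):
    """Score the relationships between adjacent field values."""
    pairs = zip(words, words[1:])
    return sum(PAIR_LOG_SCORES.get(p, UNSEEN_PAIR_PENALTY) for p in pairs)


def best_path(lattices):
    """Return the path with the highest combined acoustic and language score."""
    scored = (
        (words, acoustic + language_model_score(words))
        for words, acoustic in concatenate(lattices)
    )
    return max(scored, key=lambda item: item[1])
```

In this toy data the best acoustic candidate for the city field on its own is "austin", but the cross-field score makes `best_path(FIELD_LATTICES)` choose "boston" with "massachusetts", which is the kind of relationship between field lattices the abstract refers to.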
20 Claims
1. A method comprising:
receiving a voice request from a speaker;
sampling an entirety of the voice request at periodic intervals, to yield samples;
identifying a background based on the samples;
modifying a transducer model based on the background generating a modified transducer model and adapting a plurality of speaker independent speech recognition models with the modified transducer model to yield a plurality of modified language models;
receiving a data field selection by the speaker, the data field selection comprising a first data field selection and a second data field selection; and
applying, via a processor, the plurality of modified language models to the voice request for speech recognition based on the data field selection and the background, wherein the speech recognition uses a first language model from the plurality of language models for the first data field selection and a second language model from the plurality of language models for the second data field selection, and wherein the first language model is distinct from the second language model.
Dependent claims: 2, 3, 4, 5, 6, 7
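The sequence of steps recited in claim 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: the single-parameter transducer model, the threshold-based background classifier, the gain rule, and all class and function names are hypothetical, and the actual decoding of speech is omitted.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class TransducerModel:
    gain: float  # hypothetical one-parameter microphone response


@dataclass(frozen=True)
class FieldModel:
    field: str
    vocabulary: tuple
    noise_floor: float = 0.0  # filled in when the model is adapted


def sample(signal, interval):
    """Take every `interval`-th value across the whole voice request."""
    return signal[::interval]


def identify_background(samples, quiet_threshold=0.1):
    """Assumed classifier: the mean absolute level decides the label."""
    level = mean(abs(s) for s in samples)
    label = "quiet" if level < quiet_threshold else "noisy"
    return label, level


def modify_transducer(model, background_label):
    """Assumed rule: halve the gain when the background is noisy."""
    factor = 0.5 if background_label == "noisy" else 1.0
    return TransducerModel(gain=model.gain * factor)


def adapt(models, transducer, level):
    """Adapt each speaker-independent model with the modified transducer."""
    return {
        m.field: FieldModel(m.field, m.vocabulary, level * transducer.gain)
        for m in models
    }


def select_models(signal, field_selections, base_models, transducer, interval=2):
    """Run the steps in order and return one adapted model per selected field."""
    samples = sample(signal, interval)
    label, level = identify_background(samples)
    adapted = adapt(base_models, modify_transducer(transducer, label), level)
    return label, [adapted[f] for f in field_selections]
```

A call such as `select_models(signal, ["city", "date"], base_models, TransducerModel(gain=1.0))` returns the background label together with a separate adapted model for each selected field, mirroring the requirement that the first and second data field selections use distinct models.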
8. A system comprising:
a processor; and
a computer-readable storage medium having instructions stored which, when executed on the processor, perform operations comprising:
receiving a voice request from a speaker;
sampling an entirety of the voice request at periodic intervals, to yield samples;
identifying a background based on the samples;
modifying a transducer model based on the background generating a modified transducer model and adapting a plurality of speaker independent speech recognition models with the modified transducer model to yield a plurality of modified language models;
receiving a data field selection by the speaker, the data field selection comprising a first data field selection and a second data field selection; and
applying, via a processor, the plurality of modified language models to the voice request for speech recognition based on the data field selection and the background, wherein the speech recognition uses a first language model from the plurality of language models for the first data field selection and a second language model from the plurality of language models for the second data field selection, and wherein the first language model is distinct from the second language model.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
receiving a voice request from a speaker;
sampling an entirety of the voice request at periodic intervals, to yield samples;
identifying a background based on the samples;
modifying a transducer model based on the background generating a modified transducer model and adapting a plurality of speaker independent speech recognition models with the modified transducer model to yield a plurality of modified language models;
receiving a data field selection by the speaker, the data field selection comprising a first data field selection and a second data field selection; and
applying, via a processor, the plurality of modified language models to the voice request for speech recognition based on the data field selection and the background, wherein the speech recognition uses a first language model from the plurality of language models for the first data field selection and a second language model from the plurality of language models for the second data field selection, and wherein the first language model is distinct from the second language model.
Dependent claims: 16, 17, 18, 19, 20
Specification