Generating topic-specific language models
Abstract
Speech recognition may be improved by generating and using a topic-specific language model. A topic-specific language model may be created by performing an initial pass on an audio signal using a generic or basis language model. A speech recognition device may then determine topics relating to the audio signal based on the words identified in the initial pass and retrieve a corpus of text relating to those topics. Using the retrieved corpus of text, the speech recognition device may create a topic-specific language model. In one example, the speech recognition device may adapt or otherwise modify the generic language model based on the retrieved corpus of text.
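
The topic-determination step described in the abstract can be illustrated with a minimal sketch. It assumes the initial pass has already produced a list of recognized words, and it uses a hypothetical keyword lexicon (`TOPIC_TERMS`) to map words to topics; the source does not specify how topics are actually determined, so both the lexicon and the function name `detect_topics` are illustrative assumptions.

```python
from collections import Counter

# Hypothetical topic lexicon (an assumption for illustration only):
# topic name -> terms taken to indicate that topic.
TOPIC_TERMS = {
    "baseball": {"pitcher", "inning", "homer", "bullpen"},
    "finance": {"stock", "dividend", "earnings", "bond"},
}

def detect_topics(first_pass_words, topic_terms=TOPIC_TERMS):
    """Count first-pass words matching each topic; keep topics with at least one hit."""
    counts = Counter()
    for word in first_pass_words:
        lowered = word.lower()
        for topic, terms in topic_terms.items():
            if lowered in terms:
                counts[topic] += 1
    return dict(counts)

# Words as they might come out of the initial recognition pass.
first_pass = "the pitcher left in the sixth inning as the stock fell".split()
topic_counts = detect_topics(first_pass)
print(topic_counts)
```

The per-topic counts returned here are the kind of quantity the claims later use as a measure of each topic's significance.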
20 Claims
1. A method comprising:
performing, by a computing device and using a first language model, a first speech recognition process on an audio signal;
determining, by the computing device and based on the first speech recognition process, a plurality of topics associated with the audio signal;
determining, by the computing device and based on the first speech recognition process, a respective significance, for each of the plurality of topics, based on a respective quantity of terms, in the audio signal, associated with each of the plurality of topics;
determining, by the computing device and based on the respective significance for each of the plurality of topics, a respective term threshold;
causing, for each of the plurality of topics, a respective set of one or more searches such that a respective quantity of terms identified by the respective set of one or more searches satisfies the respective term threshold for the topic;
determining, by the computing device and based on the terms identified by the searches, a second language model; and
performing, by the computing device and using the second language model, a second speech recognition process on the audio signal.
Dependent claims: 2, 3, 4, 10, 11, 12, 13, 14
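
The significance, term-threshold, and search steps of claim 1 might be sketched as follows. This is one possible reading, not the patented implementation: it assumes significance is a topic's share of all topic-associated terms, that the threshold is that share of a fixed overall budget (`total_budget`, a hypothetical parameter), and that `search` is a caller-supplied function returning a list of terms per call.

```python
def term_thresholds(topic_counts, total_budget=1000):
    """Map per-topic term counts to per-topic term thresholds.

    Assumption: significance is the topic's share of all matched terms,
    and the threshold is that share of a fixed overall term budget.
    """
    total = sum(topic_counts.values())
    significance = {t: c / total for t, c in topic_counts.items()}
    return {t: max(1, round(total_budget * s)) for t, s in significance.items()}

def collect_terms(topic, threshold, search, max_searches=20):
    """Run searches for a topic until the collected terms satisfy the threshold."""
    terms = []
    for page in range(max_searches):
        if len(terms) >= threshold:
            break
        results = search(topic, page)
        if not results:  # the search backend has nothing more to offer
            break
        terms.extend(results)
    return terms

def fake_search(topic, page):
    """Stand-in search backend (hypothetical): 100 synthetic terms per call."""
    return [f"{topic}_term_{page}_{i}" for i in range(100)]

thresholds = term_thresholds({"baseball": 3, "finance": 1})
collected = {t: collect_terms(t, n, fake_search) for t, n in thresholds.items()}
print(thresholds, {t: len(v) for t, v in collected.items()})
```

Under these assumptions the more significant topic receives a larger threshold and therefore triggers more searches, which is the relationship the claim describes.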
5. A method comprising:
determining, by a computing device and via a first speech recognition process, a first topic and a second topic associated with an audio signal, wherein the first speech recognition process uses an initial language model;
determining, by the computing device, a significance of the first topic based on a first quantity of terms, in the audio signal, identified as being associated with the first topic via the first speech recognition process;
determining, by the computing device, a significance of the second topic based on a second quantity of terms, in the audio signal, identified as being associated with the second topic via the first speech recognition process;
receiving, by the computing device and in response to a first search associated with the first topic, a first plurality of terms related to at least the first topic, wherein a quantity of the first plurality of terms satisfies a first threshold number of terms that are based on the significance of the first topic;
causing, by the computing device and based on the first plurality of terms, modification of the initial language model; and
performing, by the computing device and using the modified initial language model, a second speech recognition process on the audio signal.
Dependent claims: 6, 7, 8, 9, 15
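
Claim 5 modifies the initial language model using the terms received from a search. The claim does not say how the modification is done; a minimal sketch, assuming unigram models and linear interpolation (a common adaptation technique swapped in here for illustration), could look like this. The names `unigram_probs`, `adapt_model`, and `weight` are hypothetical.

```python
from collections import Counter

def unigram_probs(words):
    """Estimate unigram probabilities from a list of terms."""
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def adapt_model(initial, topic_terms, weight=0.3):
    """Linearly interpolate an initial unigram model with a topic model.

    Assumption: both models are plain word -> probability mappings, and
    `weight` is the share of probability mass given to the topic terms.
    """
    topic = unigram_probs(topic_terms)
    vocab = set(initial) | set(topic)
    return {
        w: (1 - weight) * initial.get(w, 0.0) + weight * topic.get(w, 0.0)
        for w in vocab
    }

initial_model = {"the": 0.5, "ball": 0.25, "bond": 0.25}
retrieved_terms = ["ball", "ball", "inning", "pitcher"]
adapted = adapt_model(initial_model, retrieved_terms, weight=0.5)
print(sorted(adapted.items()))
```

Because both inputs are probability distributions, the interpolated model still sums to one, while words from the retrieved terms gain probability relative to the initial model.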
16. A method comprising:
performing, by a computing device and using a first language model, a first speech recognition process on an input signal;
determining, based on the first speech recognition process, a plurality of topics associated with the input signal;
determining, for each topic of the plurality of topics:
a respective significance based on a respective quantity of terms, in the input signal, associated with each of the plurality of topics; and
a respective term threshold based on the respective significance;
causing, for each of the plurality of topics and using words recognized by the first speech recognition process, one or more searches such that a quantity of terms identified by the one or more searches satisfies the respective term threshold for the topic;
determining a corpus of terms by combining the terms returned by the one or more searches conducted for each of the plurality of topics;
determining, based on the corpus of terms, a second language model; and
performing, by the computing device and using the second language model, a second speech recognition process on the input signal.
Dependent claims: 17, 18, 19, 20
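
Two steps particular to claim 16 are forming searches from words recognized in the first pass and combining the per-topic results into one corpus. The sketch below is an illustration under stated assumptions: queries are built from the topic name plus its most frequent recognized words, and the corpus is a simple term-frequency table. The function names and the `max_words` parameter are hypothetical.

```python
from collections import Counter

def build_query(topic, recognized_words, max_words=3):
    """Form a search query from the topic name plus its most frequent recognized words."""
    top = [word for word, _ in Counter(recognized_words).most_common(max_words)]
    return " ".join([topic] + top)

def combine_corpus(results_by_topic):
    """Merge the terms returned for every topic into one term-frequency corpus."""
    corpus = Counter()
    for terms in results_by_topic.values():
        corpus.update(terms)
    return corpus

# Words from the first pass that were associated with the "baseball" topic.
query = build_query("baseball", ["inning", "pitcher", "inning", "bullpen"], max_words=2)

# Terms as they might be returned by the searches for each topic (made-up data).
corpus = combine_corpus({
    "baseball": ["inning", "mound", "inning"],
    "finance": ["yield", "bond"],
})
print(query, dict(corpus))
```

A second language model could then be estimated from the combined term frequencies, for example by normalizing the counts into probabilities.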
Specification