Adjusting language models based on topics identified using context
First Claim
1. A method of performing speech recognition, the method being performed by an automated speech recognition system, the method comprising:
receiving, by the automated speech recognition system, audio data and context data associated with the audio data, wherein the received context data indicates a current context of a device when the device captured the audio data;
identifying, by the automated speech recognition system, a topic based on a comparison of the received context data with second context data indicating a previous context of a device, wherein identifying the topic includes:
selecting, by the automated speech recognition system, a term based on the comparison between the received context data and the second context data; and
in response to selecting the term, selecting, by the automated speech recognition system and as the identified topic, a topic that is associated with the selected term;
based on identifying the topic, adjusting, by the automated speech recognition system, a language model based on the identified topic to adjust a likelihood that the language model indicates for one or more terms associated with the topic;
determining, by the automated speech recognition system, a transcription of the audio data using the adjusted language model; and
outputting, by the automated speech recognition system, the transcription determined using the adjusted language model, the transcription being output as a speech recognition output of the automated speech recognition system.
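The steps recited in the claim can be illustrated with a minimal Python sketch. Nothing below comes from the patent itself: the dictionary-based context representation, the Jaccard comparison, the unigram language model, the boost factor, and every function and variable name are assumptions chosen only to make the flow concrete. The candidate word sequences stand in for hypotheses that a real recognizer would derive from the audio data.

```python
from math import prod

def context_similarity(current, previous):
    """Jaccard overlap of (key, value) pairs; an assumed stand-in for the comparison."""
    a, b = set(current.items()), set(previous.items())
    return len(a & b) / len(a | b) if a | b else 0.0

def identify_topic(current, history, term_topics):
    """Select the term whose stored context best matches, then the topic tied to it."""
    term, _ = max(history, key=lambda entry: context_similarity(current, entry[1]))
    return term, term_topics[term]

def adjust_language_model(lm, topic_terms, boost=3.0):
    """Scale the probability of terms associated with the topic, then renormalize."""
    scaled = {w: p * (boost if w in topic_terms else 1.0) for w, p in lm.items()}
    total = sum(scaled.values())
    return {w: p / total for w, p in scaled.items()}

def transcribe(candidates, lm, floor=1e-6):
    """Pick the candidate word sequence that the unigram model scores highest."""
    return max(candidates, key=lambda words: prod(lm.get(w, floor) for w in words))

# Hypothetical data: a toy unigram model and two previously observed contexts.
lm = {"play": 0.3, "pay": 0.3, "the": 0.2, "movie": 0.1, "bill": 0.1}
history = [
    ("movie", {"location": "home", "app": "video"}),
    ("bill", {"location": "office", "app": "bank"}),
]
term_topics = {"movie": "entertainment", "bill": "finance"}
topic_terms = {"entertainment": {"play", "movie"}, "finance": {"pay", "bill"}}

current_context = {"location": "home", "app": "video"}
term, topic = identify_topic(current_context, history, term_topics)
adjusted = adjust_language_model(lm, topic_terms[topic])

# Two acoustically confusable hypotheses, assumed to come from the recognizer.
candidates = [["pay", "the", "movie"], ["play", "the", "movie"]]
transcription = transcribe(candidates, adjusted)
```

In this toy setup the two hypotheses tie under the unadjusted model, and the boost applied to the "entertainment" terms is what separates them. A production system would apply the adjustment inside an n-gram or neural decoder rather than rescoring whole word sequences.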
Abstract
Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for adjusting language models. In one aspect, a method includes accessing audio data. Information that indicates a first context is accessed, the first context being associated with the audio data. At least one term is accessed. Information that indicates a second context is accessed, the second context being associated with the term. A similarity score is determined that indicates a degree of similarity between the second context and the first context. A language model is adjusted based on the accessed term and the determined similarity score to generate an adjusted language model. Speech recognition is performed on the audio data using the adjusted language model to select one or more candidate transcriptions for a portion of the audio data.
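One way to read the similarity-score variant described in the abstract is sketched below in Python. This is an illustration under stated assumptions, not the patented method: the contexts are assumed to be sparse numeric feature vectors, cosine similarity is an assumed choice of score, and the `strength` parameter, the toy unigram model, and all names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two sparse feature vectors stored as dicts."""
    dot = sum(value * v.get(key, 0.0) for key, value in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def adjust_for_term(lm, term, similarity, strength=4.0):
    """Boost one term in proportion to the similarity score, then renormalize."""
    weights = {w: (1.0 + strength * similarity if w == term else 1.0) for w in lm}
    total = sum(lm[w] * weights[w] for w in lm)
    return {w: lm[w] * weights[w] / total for w in lm}

# Hypothetical contexts: the first accompanies the audio, the second the stored term.
first_context = {"driving": 1.0, "evening": 1.0}
second_context = {"driving": 1.0, "morning": 1.0}
score = cosine_similarity(second_context, first_context)

lm = {"navigate": 0.2, "negotiate": 0.8}
adjusted = adjust_for_term(lm, "navigate", score)
```

Scaling the boost by the score means that a term seen in a very different context barely changes the model, while a term seen in a nearly identical context is favored strongly.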
155 Citations
15 Claims
1. A method of performing speech recognition, the method being performed by an automated speech recognition system, the method comprising:
receiving, by the automated speech recognition system, audio data and context data associated with the audio data, wherein the received context data indicates a current context of a device when the device captured the audio data;
identifying, by the automated speech recognition system, a topic based on a comparison of the received context data with second context data indicating a previous context of a device, wherein identifying the topic includes:
selecting, by the automated speech recognition system, a term based on the comparison between the received context data and the second context data; and
in response to selecting the term, selecting, by the automated speech recognition system and as the identified topic, a topic that is associated with the selected term;
based on identifying the topic, adjusting, by the automated speech recognition system, a language model based on the identified topic to adjust a likelihood that the language model indicates for one or more terms associated with the topic;
determining, by the automated speech recognition system, a transcription of the audio data using the adjusted language model; and
outputting, by the automated speech recognition system, the transcription determined using the adjusted language model, the transcription being output as a speech recognition output of the automated speech recognition system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A non-transitory computer-readable storage device having instructions stored thereon that, when executed by a computing device of an automated speech recognition system, cause the computing device to perform speech recognition operations comprising:
receiving, by the computing device of the automated speech recognition system, audio data and context data associated with the audio data, wherein the received context data indicates a current context of a device when the device captured the audio data;
identifying, by the computing device of the automated speech recognition system, a topic based on a comparison of the received context data with second context data indicating a previous context of a device, wherein identifying the topic includes:
selecting, by the computing device of the automated speech recognition system, a term based on the comparison between the received context data and the second context data; and
in response to selecting the term, selecting, by the computing device of the automated speech recognition system and as the identified topic, a topic that is associated with the selected term;
based on identifying the topic, adjusting, by the computing device of the automated speech recognition system, a language model based on the identified topic to adjust a likelihood that the language model indicates for one or more terms associated with the topic;
determining, by the computing device of the automated speech recognition system, a transcription of the audio data using the adjusted language model; and
outputting, by the computing device of the automated speech recognition system, the transcription determined using the adjusted language model, the transcription being output as a speech recognition output of the automated speech recognition system.
Dependent claims: 11, 12, 13
14. An automated speech recognition system comprising:
one or more data processing apparatus; and
a computer-readable storage device having stored thereon instructions that, when executed by the one or more data processing apparatus, cause the one or more data processing apparatus to perform speech recognition operations comprising:
receiving, by the one or more data processing apparatus of the automated speech recognition system, audio data and context data associated with the audio data, wherein the received context data indicates a current context of a device when the device captured the audio data;
identifying, by the one or more data processing apparatus of the automated speech recognition system, a topic based on a comparison of the received context data with second context data indicating a previous context of a device, wherein identifying the topic includes:
selecting, by the one or more data processing apparatus of the automated speech recognition system, a term based on the comparison between the received context data and the second context data; and
in response to selecting the term, selecting, by the one or more data processing apparatus of the automated speech recognition system and as the identified topic, a topic that is associated with the selected term;
based on identifying the topic, adjusting, by the one or more data processing apparatus of the automated speech recognition system, a language model based on the identified topic to adjust a likelihood that the language model indicates for one or more terms associated with the topic;
determining, by the one or more data processing apparatus of the automated speech recognition system, a transcription of the audio data using the adjusted language model; and
outputting, by the one or more data processing apparatus of the automated speech recognition system, the transcription determined using the adjusted language model, the transcription being output as a speech recognition output of the automated speech recognition system.
Dependent claims: 15
Specification