Adjusting language models using context information
Abstract
Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for adjusting language models. In one aspect, a method includes accessing audio data. Information that indicates a first context is accessed, the first context being associated with the audio data. At least one term is accessed. Information that indicates a second context is accessed, the second context being associated with the term. A similarity score is determined that indicates a degree of similarity between the second context and the first context. A language model is adjusted based on the accessed term and the determined similarity score to generate an adjusted language model. Speech recognition is performed on the audio data using the adjusted language model to select one or more candidate transcriptions for a portion of the audio data.
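
The adjustment step in the abstract can be illustrated with a minimal sketch. It assumes a toy unigram model stored as a dictionary of word probabilities; the function names, the `max_boost` parameter, and the linear boost rule are hypothetical choices made for illustration and are not taken from the patent.

```python
def adjust_unigram_model(base_probs, term, similarity, max_boost=4.0):
    """Return a renormalized copy of base_probs with `term` boosted.

    `similarity` is expected in [0, 1]; a value of 0 leaves the model unchanged.
    The linear boost rule and `max_boost` are illustrative assumptions.
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be in [0, 1]")
    adjusted = dict(base_probs)
    # Interpolate the boost factor between 1 (no change) and max_boost.
    factor = 1.0 + similarity * (max_boost - 1.0)
    adjusted[term] = adjusted.get(term, 0.0) * factor
    # Renormalize so the probabilities sum to 1 again.
    total = sum(adjusted.values())
    return {word: p / total for word, p in adjusted.items()}


def pick_transcription(candidates, probs):
    """Pick the candidate word with the highest acoustic score times LM probability."""
    return max(candidates, key=lambda c: c[1] * probs.get(c[0], 0.0))[0]


# Toy data: two acoustically tied candidates, one of them a previously typed term.
base = {"resume": 0.6, "risotto": 0.1, "other": 0.3}
candidates = [("resume", 0.5), ("risotto", 0.5)]  # (word, acoustic score)
adjusted = adjust_unigram_model(base, "risotto", similarity=1.0, max_boost=10.0)
```

With the unadjusted model the tie is broken in favor of the more frequent word; after boosting the previously used term, the same acoustic evidence selects that term instead.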
21 Claims
1. A method comprising:
obtaining audio data;
accessing first context information associated with the audio data, wherein the first context information indicates (i) a first geographical location, and (ii) a first time;
accessing second context information associated with one or more previously typed or previously transcribed terms, wherein the second context information indicates (i) a second geographical location and (ii) a second time;
determining a similarity score for the first context information and the second context information based on (i) a degree of a similarity of the second geographical location to the first geographical location and (ii) a degree of a similarity of the second time to the first time;
adjusting a language model based on the similarity score to adjust a likelihood that the language model indicates the one or more previously typed or previously transcribed terms as a candidate transcription of the audio data;
determining a transcription of the audio data using the adjusted language model; and
outputting the transcription that was determined using the adjusted language model.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 21
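
The similarity score recited in claim 1 combines a location term and a time term. The sketch below shows one way such a score could be computed. The `Context` type, the exponential decay, the `km_scale` and `hour_scale` parameters, and the choice to compare times by time of day are all assumptions made for illustration; the claim does not specify any of them.

```python
import math
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Context:
    """Hypothetical container for a geographical location and a time."""
    lat: float
    lon: float
    time: datetime


def haversine_km(a, b):
    """Great-circle distance between two contexts, in kilometers."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(min(1.0, h)))


def time_of_day_gap_hours(a, b):
    """Gap between the two times of day in hours, wrapping around midnight."""
    minutes_a = a.time.hour * 60 + a.time.minute
    minutes_b = b.time.hour * 60 + b.time.minute
    gap = abs(minutes_a - minutes_b)
    return min(gap, 1440 - gap) / 60.0


def similarity_score(first, second, km_scale=1.0, hour_scale=2.0):
    """Score in (0, 1]: product of a location similarity and a time similarity.

    Both factors decay exponentially; the scales are illustrative assumptions.
    """
    loc_sim = math.exp(-haversine_km(first, second) / km_scale)
    time_sim = math.exp(-time_of_day_gap_hours(first, second) / hour_scale)
    return loc_sim * time_sim


# First context: where and when the audio was captured.
audio_ctx = Context(37.42, -122.08, datetime(2024, 5, 1, 12, 0))
# Second contexts: where and when a term was previously typed.
same_place_ctx = Context(37.42, -122.08, datetime(2024, 4, 1, 12, 0))
far_ctx = Context(40.71, -74.00, datetime(2024, 4, 1, 23, 0))

near_score = similarity_score(audio_ctx, same_place_ctx)
far_score = similarity_score(audio_ctx, far_ctx)
```

Under these assumptions, a term typed at the same place and time of day as the current utterance receives the maximum score, while a term typed far away at a different hour receives a score close to zero.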
13. A system comprising:
one or more processors; and
a non-transitory computer-readable medium coupled to the one or more processors having instructions stored thereon which, when executed by the one or more processors, cause the system to perform operations comprising:
obtaining audio data;
accessing first context information associated with the audio data, wherein the first context information indicates (i) a first geographical location, and (ii) a first time;
accessing second context information associated with one or more previously typed or previously transcribed terms, wherein the second context information indicates (i) a second geographical location and (ii) a second time;
determining a similarity score for the first context information and the second context information based on (i) a degree of a similarity of the second geographical location to the first geographical location and (ii) a degree of a similarity of the second time to the first time;
adjusting a language model based on the similarity score to adjust a likelihood that the language model indicates the one or more previously typed or previously transcribed terms as a candidate transcription of the audio data;
determining a transcription of the audio data using the adjusted language model; and
outputting the transcription that was determined using the adjusted language model.
Dependent claims: 14, 15, 16
17. A non-transitory computer storage medium storing a computer program, the program comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
obtaining audio data;
accessing first context information associated with the audio data, wherein the first context information indicates (i) a first geographical location, and (ii) a first time;
accessing second context information associated with one or more previously typed or previously transcribed terms, wherein the second context information indicates (i) a second geographical location and (ii) a second time;
determining a similarity score for the first context information and the second context information based on (i) a degree of a similarity of the second geographical location to the first geographical location and (ii) a degree of a similarity of the second time to the first time;
adjusting a language model based on the similarity score to adjust a likelihood that the language model indicates the one or more previously typed or previously transcribed terms as a candidate transcription of the audio data;
determining a transcription of the audio data using the adjusted language model; and
outputting the transcription that was determined using the adjusted language model.
Dependent claims: 18, 19, 20
Specification