Adjusting language models using context information

US 9,076,445 B1
Filed: 12/05/2012
Issued: 07/07/2015
Est. Priority Date: 12/30/2010
Status: Active Grant

Abstract

Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for adjusting language models. In one aspect, a method includes accessing audio data. Information that indicates a first context is accessed, the first context being associated with the audio data. At least one term is accessed. Information that indicates a second context is accessed, the second context being associated with the term. A similarity score is determined that indicates a degree of similarity between the second context and the first context. A language model is adjusted based on the accessed term and the determined similarity score to generate an adjusted language model. Speech recognition is performed on the audio data using the adjusted language model to select one or more candidate transcriptions for a portion of the audio data.
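
The adjustment step described in the abstract can be illustrated with a minimal sketch. It assumes a toy unigram language model stored as a dictionary of term weights and a similarity score already computed in the range [0, 1]. The function names, the `max_boost` parameter, and the linear boost formula are all hypothetical choices made for illustration; the abstract does not specify how the adjustment is computed.

```python
import math


def adjust_language_model(base_weights, context_terms, max_boost=4.0):
    """Return a renormalized copy of base_weights with context terms boosted.

    base_weights: dict mapping term -> positive weight.
    context_terms: dict mapping term -> similarity score in [0, 1].
    max_boost: hypothetical multiplier applied at similarity 1.0 (assumption).
    """
    adjusted = dict(base_weights)
    for term, similarity in context_terms.items():
        # The size of the change grows with the similarity score:
        # similarity 0 leaves the weight unchanged, similarity 1 applies max_boost.
        factor = 1.0 + (max_boost - 1.0) * similarity
        adjusted[term] = adjusted.get(term, 1e-6) * factor
    total = sum(adjusted.values())
    return {term: weight / total for term, weight in adjusted.items()}


def pick_transcription(candidates, lm):
    """Pick the candidate with the best combined acoustic and language score.

    candidates: list of (text, acoustic_log_score) pairs.
    lm: dict mapping term -> probability (unigram model, an assumption).
    """
    def score(candidate):
        text, acoustic_log_score = candidate
        lm_log_score = sum(math.log(lm.get(word, 1e-9)) for word in text.split())
        return acoustic_log_score + lm_log_score

    return max(candidates, key=score)[0]


base = {"call": 0.4, "paul": 0.2, "pole": 0.4}
candidates = [("call paul", -1.0), ("call pole", -1.0)]

# "paul" was previously entered in a context very similar to the current one.
adjusted = adjust_language_model(base, {"paul": 0.9})

print(pick_transcription(candidates, base))
print(pick_transcription(candidates, adjusted))
```

With the unadjusted model the acoustically tied candidates are decided by the base weights alone. After the boost, the term associated with a similar context becomes the more likely candidate.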

21 Claims

1. A method comprising:
- obtaining audio data;
  
  accessing first context information associated with the audio data, wherein the first context information indicates (i) a first geographical location, and (ii) a first time;
  
  accessing second context information associated with one or more previously typed or previously transcribed terms, wherein the second context information indicates (i) a second geographical location and (ii) a second time;
  
  determining a similarity score for the first context information and the second context information based on (i) a degree of a similarity of the second geographical location to the first geographical location and (ii) a degree of a similarity of the second time to the first time;
  
  adjusting a language model based on the similarity score to adjust a likelihood that the language model indicates the one or more previously typed or previously transcribed terms as a candidate transcription of the audio data;
  
  determining a transcription of the audio data using the adjusted language model; and
  
  outputting the transcription that was determined using the adjusted language model.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 21)
- - 2. The method of claim 1, wherein obtaining the audio data comprises receiving the audio data over a network from client device;
    - and wherein outputting the transcription determined using the adjusted language model comprises providing the transcription to the client device over the network.
  - 3. The method of claim 1, wherein accessing the first context information comprises accessing information that indicates a geographical location where the audio data was recorded and a time when the audio data was recorded.
  - 4. The method of claim 1, wherein accessing the second context information comprises accessing second context information that is associated with one or more terms previously transcribed for other audio, the second context information indicating (i) a particular geographical location where the other audio was input, and (ii) a time when the other audio was input at the particular geographical location.
  - 5. The method of claim 1, wherein obtaining the audio data comprises obtaining audio data for an utterance of a user;
    - wherein accessing the first context information comprises accessing information that indicates a geographical location of a device when the audio data was recorded by the device and a time when the audio data was recorded by the device; and
      
      wherein accessing the second context information comprises accessing second context information associated with one or more previously transcribed terms that were previously transcribed from previously received audio data for a previous utterance of the user, the second context information indicating a geographical location of the device when the previous utterance of the user was input to the device and a time when the previous utterance of the user was input to the device.
  - 6. The method of claim 1, wherein the first time indicates a first day of week when the audio data was recorded and the second time indicates a second day of week when the one or more previously typed or previously transcribed terms were input;
    - and wherein determining the similarity score comprises determining the similarity score based on a similarity of the second day of week to the first day of week.
  - 7. The method of claim 1, wherein the first time indicates a first time of day when the audio data was recorded and the second time indicates a second time of day when the one or more previously typed or previously transcribed terms were input;
    - and wherein determining the similarity score comprises determining the similarity score based on a similarity of the second time of day to the first time of day.
  - 8. The method of claim 1, wherein determining the similarity score comprises determining the similarity score based on a distance between the second geographical location and the first geographical location.
  - 9. The method of claim 1, wherein accessing the first context information comprises accessing information that indicates a geographical location indicated by a Global Positioning System (GPS) receiver of a device that receives the audio data.
  - 10. The method of claim 1, wherein adjusting the language model based on the similarity score comprises changing one or more weighting values in the language model that correspond to the one or more previously typed or previously transcribed terms.
  - 11. The method of claim 10, wherein changing the one or more weighting values comprises changing the one or more weighting values such that a magnitude of the change in the one or more weighting values is based on the similarity score.
  - 12. The method of claim 1, wherein adjusting the language model based on the similarity score comprises increasing the likelihood by an amount that is based on the similarity score.
  - 21. The method of claim 1, wherein obtaining the audio data comprises obtaining audio data for an utterance of a user;
    - and wherein accessing the second context information comprises accessing second context information associated with one or more terms that were previously typed by the user, the second context information indicating (i) a particular geographical location where the user typed the one or more terms, and (ii) a time when the user typed the one or more terms while at the particular geographical location.
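
Claims 6 through 8 name three inputs to the similarity score: day of week, time of day, and distance between the two locations. The sketch below shows one way such components could be computed and combined. The exponential distance decay, the `scale_km` value, the circular time differences, and the weighted average are all assumptions made for illustration; the claims do not prescribe any particular formula.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def location_similarity(loc1, loc2, scale_km=10.0):
    """Map distance to (0, 1]; scale_km is a hypothetical decay constant."""
    return math.exp(-haversine_km(*loc1, *loc2) / scale_km)


def time_of_day_similarity(t1, t2):
    """1.0 for the same minute of day, 0.0 for times twelve hours apart."""
    diff = abs((t1.hour * 60 + t1.minute) - (t2.hour * 60 + t2.minute))
    diff = min(diff, 1440 - diff)  # wrap around midnight
    return 1.0 - diff / 720.0


def day_of_week_similarity(t1, t2):
    """1.0 for the same weekday, 0.0 for weekdays three days apart."""
    diff = abs(t1.weekday() - t2.weekday())
    diff = min(diff, 7 - diff)  # wrap around the week, so diff is 0..3
    return 1.0 - diff / 3.0


def context_similarity(loc1, t1, loc2, t2, weights=(0.5, 0.25, 0.25)):
    """Weighted average of the three components (weights are an assumption)."""
    w_loc, w_tod, w_dow = weights
    return (w_loc * location_similarity(loc1, loc2)
            + w_tod * time_of_day_similarity(t1, t2)
            + w_dow * day_of_week_similarity(t1, t2))


here = (37.42, -122.08)
now = datetime(2012, 12, 5, 9, 0)
print(context_similarity(here, now, here, datetime(2012, 12, 5, 9, 30)))
```

Treating time of day and day of week as circular quantities means 23:30 and 00:30 count as close, as do Sunday and Monday. A linear difference would score them as far apart.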

13. A system comprising:
- one or more processors; and
  
  a non-transitory computer-readable medium coupled to the one or more processors having instructions stored thereon which, when executed by the one or more processors, cause the system to perform operations comprising;
  
  obtaining audio data;
  
  accessing first context information associated with the audio data, wherein the first context information indicates (i) a first geographical location, and (ii) a first time;
  
  accessing second context information associated with one or more previously typed or previously transcribed terms, wherein the second context information indicates (i) a second geographical location and (ii) a second time;
  
  determining a similarity score for the first context information and the second context information based on (i) a degree of a similarity of the second geographical location to the first geographical location and (ii) a degree of a similarity of the second time to the first time;
  
  adjusting a language model based on the similarity score to adjust a likelihood that the language model indicates the one or more previously typed or previously transcribed terms as a candidate transcription of the audio data;
  
  determining a transcription of the audio data using the adjusted language model; and
  
  outputting the transcription that was determined using the adjusted language model.
- View Dependent Claims (14, 15, 16)
- - 14. The system of claim 13, wherein the first time indicates a first day of week when the audio data was recorded and the second time indicates a second day of week when the one or more previously typed or previously transcribed terms were input;
    - and wherein determining the similarity score comprises determining the similarity score based on a similarity of the second day of week to the first day of week.
  - 15. The system of claim 13, wherein the first time indicates a first time of day when the audio data was recorded and the second time indicates a second time of day when the one or more previously typed or previously transcribed terms were input;
    - and wherein determining the similarity score comprises determining the similarity score based on a similarity of the second time of day to the first time of day.
  - 16. The system of claim 13, wherein determining the similarity score comprises determining the similarity score based on a distance between the second geographical location and the first geographical location.

17. A non-transitory computer storage medium storing a computer program, the program comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
- obtaining audio data;
  
  accessing first context information associated with the audio data, wherein the first context information that indicates (i) a first geographical location, and a first time;
  
  accessing second context information associated with one or more previously typed or previously transcribed terms, wherein the second context information indicates (i) a second geographical location and (ii) a second time;
  
  determining a similarity score for the first context information and the second context information based on (i) a degree of a similarity of the second geographical location to the first geographical location and (ii) a degree of a similarity of the second time to the first time;
  
  adjusting a language model based on the similarity score to adjust a likelihood that the language model indicates the one or more previously typed or previously transcribed terms as a candidate transcription of the audio data;
  
  determining a transcription of the audio data using the adjusted language model; and
  
  outputting the transcription that was determined using the adjusted language model.
- View Dependent Claims (18, 19, 20)
- - 18. The non-transitory computer storage medium of claim 17, wherein the first time indicates a first day of week when the audio data was recorded and the second time indicates a second day of week when the one or more previously typed or previously transcribed terms were input;
    - and wherein determining the similarity score comprises determining the similarity score based on a similarity of the second day of week to the first day of week.
  - 19. The non-transitory computer storage medium of claim 17, wherein the first time indicates a first time of day when the audio data was recorded and the second time indicates a second time of day when the one or more previously typed or previously transcribed terms were input;
    - and wherein determining the similarity score comprises determining the similarity score based on a similarity of the second time of day to the first time of day.
  - 20. The non-transitory computer storage medium of claim 17, wherein determining the similarity score comprises determining the similarity score based on a distance between the second geographical location and the first geographical location.


Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Lloyd, Matthew I.
Primary Examiner(s): Opsasnick, Michael N
Application Number: US13/705,228
Time in Patent Office: 944 Days
Field of Search: 704/9
US Class Current: 1/1
CPC Class Codes:
G10L 15/18   using natural language mode...
G10L 15/183   using context dependencies,...
G10L 15/22   Procedures used during a sp...
G10L 15/26   Speech to text systems G10L...
G10L 2015/223   Execution procedure of a sp...
G10L 25/12   the extracted parameters be...
G10L 25/51   for comparison or discrimin...
