Adjusting language models based on topics identified using context

US 9,542,945 B2
Filed: 06/10/2015
Issued: 01/10/2017
Est. Priority Date: 12/30/2010
Status: Active Grant

2 Assignments

0 Petitions

Abstract

Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for adjusting language models. In one aspect, a method includes accessing audio data. Information that indicates a first context is accessed, the first context being associated with the audio data. At least one term is accessed. Information that indicates a second context is accessed, the second context being associated with the term. A similarity score is determined that indicates a degree of similarity between the second context and the first context. A language model is adjusted based on the accessed term and the determined similarity score to generate an adjusted language model. Speech recognition is performed on the audio data using the adjusted language model to select one or more candidate transcriptions for a portion of the audio data.
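
The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the cosine similarity, the unigram model, the `max_boost` parameter, the toy data, and every function name are hypothetical choices made for the example.

```python
import math

def context_similarity(first, second):
    """Cosine similarity between two context feature dicts (an assumed metric)."""
    keys = set(first) | set(second)
    dot = sum(first.get(k, 0.0) * second.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in first.values()))
    norm_b = math.sqrt(sum(v * v for v in second.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def adjust_language_model(unigram_probs, term, similarity, max_boost=4.0):
    """Return a renormalized copy of the model with `term` boosted.

    The boost factor grows linearly from 1.0 (similarity 0) to
    `max_boost` (similarity 1). Both the rule and the default are assumptions.
    """
    adjusted = dict(unigram_probs)
    factor = 1.0 + (max_boost - 1.0) * similarity
    adjusted[term] = adjusted.get(term, 1e-6) * factor
    total = sum(adjusted.values())
    return {word: p / total for word, p in adjusted.items()}

def pick_transcription(candidates, lm):
    """Pick the (words, acoustic_log_prob) pair with the best combined score."""
    def score(candidate):
        words, acoustic_log_prob = candidate
        return acoustic_log_prob + sum(math.log(lm.get(w, 1e-6)) for w in words)
    return max(candidates, key=score)

# Toy data: the term "weather" was previously seen in a context that
# matches the context of the new audio.
base_lm = {"check": 0.40, "whether": 0.35, "weather": 0.25}
audio_context = {"morning": 1.0, "at_home": 1.0}   # first context
term_context = {"morning": 1.0, "at_home": 1.0}    # second context
score = context_similarity(audio_context, term_context)
adjusted_lm = adjust_language_model(base_lm, "weather", score)

# Two acoustically identical candidates; only the language model differs.
candidates = [(["check", "whether"], -1.0), (["check", "weather"], -1.0)]
best_words, _ = pick_transcription(candidates, adjusted_lm)
```

With the unadjusted model the homophone "whether" scores higher; after the context-weighted boost, the candidate containing "weather" is selected instead.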

155 Citations

15 Claims

1. A method of performing speech recognition, the method being performed by an automated speech recognition system, the method comprising:
- receiving, by the automated speech recognition system, audio data and context data associated with the audio data, wherein the received context data indicates a current context of a device when the device captured the audio data;
  
  identifying, by the automated speech recognition system, a topic based on a comparison of the received context data with second context data indicating a previous context of a device, wherein identifying the topic includes:
  
  selecting, by the automated speech recognition system, a term based on the comparison between the received context data and the second context data; and
  
  in response to selecting the term, selecting, by the automated speech recognition system and as the identified topic, a topic that is associated with the selected term;
  
  based on identifying the topic, adjusting, by the automated speech recognition system, a language model based on the identified topic to adjust a likelihood that the language model indicates for one or more terms associated with the topic;
  
  determining, by the automated speech recognition system, a transcription of the audio data using the adjusted language model; and
  
  outputting, by the automated speech recognition system, the transcription determined using the adjusted language model, the transcription being output as a speech recognition output of the automated speech recognition system.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
- - 2. The method of claim 1, wherein identifying the topic based on the comparison of the received context data with second context data includes:
    - identifying the topic based on a comparison of the received context data with second context data that indicates a context at a time at which the term was provided by a user, the term being designated as one of a plurality of words related to the topic.
  - 3. The method of claim 1, wherein adjusting the language model based on the identified topic to adjust the likelihood that the language model indicates for one or more terms associated with the topic includes:
    - adjusting the language model based on the identified topic to adjust a likelihood that the language model indicates for one or more terms associated with the topic that are different from the selected term.
  - 4. The method of claim 1, comprising:
    - selecting, from among multiple sets of terms that are each associated with a respective topic, a particular set of terms that are associated with the identified topic; and
      
      wherein adjusting the language model based on the identified topic to adjust the likelihood that the language model indicates for one or more terms associated with the topic, includes:
      
      adjusting the language model based on the identified topic to adjust a likelihood that the language model indicates for each of the terms in the particular set of terms associated with the topic.
  - 5. The method of claim 1, comprising:
    - determining, for each of multiple terms, a similarity score that reflects a degree of similarity between (i) context data that is associated with the respective term and (ii) the received context data; and
      
      wherein selecting the term based on the comparison between the received context data and the second context data includes:
      
      selecting the term from among the multiple terms based at least on the similarity score for the selected term.
  - 6. The method of claim 5, wherein determining, for each of multiple terms, a similarity score includes:
    - determining, for each of multiple terms, a similarity score that is indicative of a degree of similarity between:
      
      (i) context data indicating a time associated with the term or a geographic location associated with the term, and (ii) the received context data indicating a time associated with the received audio data or a geographic location associated with the received audio data.
  - 7. The method of claim 5, wherein adjusting the language model based on the identified topic to adjust the likelihood that the language model indicates for one or more terms associated with the topic includes:
    - adjusting the language model based on the identified topic and one or more of the determined similarity scores to adjust the likelihood that the language model indicates for one or more terms associated with the topic.
  - 8. The method of claim 1, wherein receiving audio data includes:
    - receiving, at a first point in time, audio data that represents speech of a user as obtained by one of a plurality of client devices on which an account associated with the user is accessed; and
      
      wherein selecting the term includes:
      
      selecting the term from among terms that are each associated with data obtained by one of the plurality of client devices that reflects (i) input received through a user interface of the client device before the first point in time, or (ii) content which, before the first point in time, was associated with the user's account and accessible to one or more of the plurality of client devices.
  - 9. The method of claim 1, comprising:
    - accessing multiple terms that are each associated with previously-received input data;
      
      wherein selecting the term includes selecting a particular one of the multiple terms; and
      
      wherein adjusting the language model based on the identified topic to adjust the likelihood that the language model indicates for one or more terms associated with the topic includes:
      
      adjusting a language model based on the identified topic to adjust a likelihood that the language model indicates for terms associated with the topic which include (i) one or more of the multiple terms associated with previously-received input data, and (ii) one or more terms not associated with previously-received input data.
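
The topic-identification path described in claims 1 and 3 through 6 can be sketched as below. It is only an illustration under assumptions: the `Context` fields, the similarity formula and its scaling constants, the `max_boost` parameter, the toy term and topic tables, and all function names are hypothetical and are not taken from the patent.

```python
import math
from dataclasses import dataclass

@dataclass
class Context:
    hour: float  # hour of day, 0 to 24
    lat: float   # latitude in degrees
    lon: float   # longitude in degrees

def similarity(a: Context, b: Context) -> float:
    """Score in (0, 1]; 1.0 means same hour and same place (assumed formula)."""
    hour_gap = abs(a.hour - b.hour)
    hour_gap = min(hour_gap, 24.0 - hour_gap)  # wrap around midnight
    # Crude distance in degrees, adequate for a toy example only.
    dist = math.hypot(a.lat - b.lat, a.lon - b.lon)
    return 1.0 / (1.0 + hour_gap / 6.0 + dist / 0.1)

def identify_topic(current, term_contexts, term_to_topic):
    """Select the term whose stored context best matches, then its topic."""
    scores = {term: similarity(current, ctx) for term, ctx in term_contexts.items()}
    best_term = max(scores, key=scores.get)
    return term_to_topic[best_term], best_term, scores[best_term]

def boost_topic(lm, topic_terms, score, max_boost=3.0):
    """Boost every term in the topic's set, then renormalize."""
    factor = 1.0 + (max_boost - 1.0) * score
    adjusted = {w: p * (factor if w in topic_terms else 1.0) for w, p in lm.items()}
    for w in topic_terms:
        # Topic terms missing from the model get a small boosted floor.
        adjusted.setdefault(w, 1e-6 * factor)
    total = sum(adjusted.values())
    return {w: p / total for w, p in adjusted.items()}

# Previously entered terms and the contexts in which they were provided.
term_contexts = {
    "pitcher": Context(hour=19.0, lat=37.77, lon=-122.39),
    "espresso": Context(hour=8.0, lat=37.80, lon=-122.27),
}
term_to_topic = {"pitcher": "baseball", "espresso": "coffee"}
topic_terms = {
    "baseball": {"pitcher", "inning", "shortstop"},
    "coffee": {"espresso", "latte"},
}

current = Context(hour=20.0, lat=37.77, lon=-122.39)  # context of the new audio
topic, term, score = identify_topic(current, term_contexts, term_to_topic)

lm = {"pitcher": 0.2, "picture": 0.4, "inning": 0.1, "latte": 0.3}
adjusted = boost_topic(lm, topic_terms[topic], score)
```

Here the evening context near the stored "pitcher" location selects that term, which maps to the "baseball" topic. The boost then raises the likelihood of other baseball terms such as "inning", including one ("shortstop") that was absent from the original model, while unrelated words lose probability mass through renormalization.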

10. A non-transitory computer-readable storage device having instructions stored thereon that, when executed by a computing device of an automated speech recognition system, cause the computing device to perform speech recognition operations comprising:
- receiving, by the computing device of the automated speech recognition system, audio data and context data associated with the audio data, wherein the received context data indicates a current context of a device when the device captured the audio data;
  
  identifying, by the computing device of the automated speech recognition system, a topic based on a comparison of the received context data with second context data indicating a previous context of a device, wherein identifying the topic includes:
  
  selecting, by the computing device of the automated speech recognition system, a term based on the comparison between the received context data and the second context data; and
  
  in response to selecting the term, selecting, by the computing device of the automated speech recognition system and as the identified topic, a topic that is associated with the selected term;
  
  based on identifying the topic, adjusting, by the computing device of the automated speech recognition system, a language model based on the identified topic to adjust a likelihood that the language model indicates for one or more terms associated with the topic;
  
  determining, by the computing device of the automated speech recognition system, a transcription of the audio data using the adjusted language model; and
  
  outputting, by the computing device of the automated speech recognition system, the transcription determined using the adjusted language model, the transcription being output as a speech recognition output of the automated speech recognition system.
- View Dependent Claims (11, 12, 13)
- - 11. The non-transitory computer-readable storage device of claim 10, wherein identifying the topic based on the comparison of the received context data with second context data includes:
    - identifying the topic based on a comparison of the received context data with second context data that indicates a context at a time at which the term was provided by a user, the term being designated as one of a plurality of words related to the topic.
  - 12. The non-transitory computer-readable storage device of claim 10, wherein adjusting the language model based on the identified topic to adjust the likelihood that the language model indicates for one or more terms associated with the topic includes:
    - adjusting the language model based on the identified topic to adjust a likelihood that the language model indicates for one or more terms associated with the topic that are different from the selected term.
  - 13. The non-transitory computer-readable storage device of claim 10, wherein the operations comprise:
    - selecting, from among multiple sets of terms that are each associated with a respective topic, a particular set of terms that are associated with the identified topic; and
      
      wherein adjusting the language model based on the identified topic to adjust the likelihood that the language model indicates for one or more terms associated with the topic, includes:
      
      adjusting the language model based on the identified topic to adjust a likelihood that the language model indicates for each of the terms in the particular set of terms associated with the topic.

14. An automated speech recognition system comprising:
- one or more data processing apparatus; and
  
  a computer-readable storage device having stored thereon instructions that, when executed by the one or more data processing apparatus, cause the one or more data processing apparatus to perform speech recognition operations comprising:
  
  receiving, by the one or more data processing apparatus of the automated speech recognition system, audio data and context data associated with the audio data, wherein the received context data indicates a current context of a device when the device captured the audio data;
  
  identifying, by the one or more data processing apparatus of the automated speech recognition system, a topic based on a comparison of the received context data with second context data indicating a previous context of a device, wherein identifying the topic includes:
  
  selecting, by the one or more data processing apparatus of the automated speech recognition system, a term based on the comparison between the received context data and the second context data; and
  
  in response to selecting the term, selecting, by the one or more data processing apparatus of the automated speech recognition system and as the identified topic, a topic that is associated with the selected term;
  
  based on identifying the topic, adjusting, by the one or more data processing apparatus of the automated speech recognition system, a language model based on the identified topic to adjust a likelihood that the language model indicates for one or more terms associated with the topic;
  
  determining, by the one or more data processing apparatus of the automated speech recognition system, a transcription of the audio data using the adjusted language model; and
  
  outputting, by the one or more data processing apparatus of the automated speech recognition system, the transcription determined using the adjusted language model, the transcription being output as a speech recognition output of the automated speech recognition system.
- View Dependent Claims (15)
- - 15. The system of claim 14, wherein identifying the topic based on the comparison of the received context data with second context data includes:
    - identifying the topic based on a comparison of the received context data with second context data that indicates a context at a time at which the term was provided by a user, the term being designated as one of a plurality of words related to the topic.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Lloyd, Matthew I.
Primary Examiner(s): Opsasnick, Michael N
Application Number: US14/735,416
Publication Number: US 20150269938A1
Time in Patent Office: 580 Days
Field of Search: 704/9, 704/239
US Class Current: 1/1
CPC Class Codes:
G10L 15/18   using natural language mode...
G10L 15/183   using context dependencies,...
G10L 15/22   Procedures used during a sp...
G10L 15/26   Speech to text systems G10L...
G10L 2015/223   Execution procedure of a sp...
G10L 25/12   the extracted parameters be...
G10L 25/51   for comparison or discrimin...
