Language models using non-linguistic context

US 9,842,592 B2
Filed: 02/12/2014
Issued: 12/12/2017
Est. Priority Date: 02/12/2014
Status: Active Grant

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language models using non-linguistic context. In some implementations, context data indicating non-linguistic context for an utterance is received. Based on the context data, feature scores for one or more non-linguistic features are generated. The feature scores for the non-linguistic features are provided to a language model trained to process scores for non-linguistic features. The output from the language model is received, and a transcription for the utterance is determined using the output of the language model.
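
The feature-generation step summarized above can be illustrated with a minimal sketch. It assumes a small, fixed set of known locations and a context record keyed by "location"; the names KNOWN_LOCATIONS and location_feature_scores are hypothetical and do not come from the patent. Each location gets its own score, set to 1.0 when the context places the user there and 0.0 otherwise.

```python
from typing import Dict, List

# Hypothetical set of locations the model was trained with (an assumption).
KNOWN_LOCATIONS: List[str] = ["home", "work", "restaurant", "airport"]


def location_feature_scores(context: Dict[str, str]) -> Dict[str, float]:
    """Return one score per known location: 1.0 if the user is there, else 0.0."""
    current = context.get("location")
    return {
        f"loc={name}": 1.0 if name == current else 0.0
        for name in KNOWN_LOCATIONS
    }


# Example context data accompanying an utterance (illustrative values only).
context = {"location": "restaurant", "application": "maps"}
scores = location_feature_scores(context)
print(scores)
```

Binary indicators are only one possible encoding; the text does not rule out graded scores, for example when the location estimate is uncertain.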

17 Claims

1. A method performed by data processing apparatus, the method comprising:
- receiving, by the data processing apparatus, context data indicating non-linguistic context for an utterance;
  
  generating, by the data processing apparatus and based on the context data, feature scores for one or more non-linguistic features, the generating comprising generating multiple location feature scores, each location feature score indicating whether a user is currently located at a location corresponding to the location feature score, wherein each of the multiple location feature scores corresponds to a different location;
  
  providing, by the data processing apparatus, the feature scores for the one or more non-linguistic features as input to a log-linear language model that has been trained to generate probability scores using feature scores for non-linguistic features, the providing comprising providing the multiple location feature scores to a log-linear language model that has been trained to generate probability scores in response to receiving multiple location feature scores corresponding to different locations;
  
  receiving, by the data processing apparatus, probability scores generated by the log-linear language model using the one or more feature scores for the non-linguistic features;
  
  determining, by the data processing apparatus, a transcription for the utterance using the probability scores generated by the log-linear language model; and
  
  providing, by the data processing apparatus, the transcription determined using the probability scores generated by the log-linear language model.
  - 2. The method of claim 1, wherein receiving the context data indicating the non-linguistic context for the utterance comprises receiving context data indicating an application through which the utterance is entered;
    - wherein generating the feature scores for the one or more non-linguistic features comprises generating one or more feature scores indicating whether the utterance is entered using a particular application or an application in a particular class of applications;
      
      wherein providing the feature scores comprises providing, to the log-linear language model, the one or more feature scores indicating whether the utterance is entered using a particular application or an application in a particular class of applications; and
      
      wherein receiving the probability scores comprises receiving probability scores generated by the log-linear language model using the one or more feature scores indicating whether the utterance is entered using a particular application or an application in a particular class of applications.
  - 3. The method of claim 2, wherein generating the one or more feature scores indicating whether the utterance is entered using a particular application or an application in a particular class of applications comprises determining a binary value for each of a plurality of different applications, each binary value indicating whether the utterance was entered using the corresponding application.
  - 4. The method of claim 1, wherein receiving the context data indicating the non-linguistic context for the utterance comprises receiving context data indicating a gender of a speaker of the utterance;
    - and wherein generating the feature scores for the one or more non-linguistic features comprises generating one or more feature scores indicating the gender of the speaker of the utterance.
  - 5. The method of claim 1, further comprising receiving a set of candidate transcriptions for the utterance;
    - wherein determining the transcription for the utterance using the probability scores generated by the log-linear language model comprises:
      
      generating one or more scores for each of the candidate transcriptions in the set of candidate transcriptions based on the probability scores generated by the log-linear language model; and
      
      selecting one of the candidate transcriptions based on the one or more scores for the candidate transcriptions.
  - 6. The method of claim 1, further comprising:
    - receiving data indicating a linguistic context for the utterance;
      
      determining feature scores for one or more linguistic features based on the data indicating the linguistic context; and
      
      providing the feature scores for the one or more linguistic features as input to the log-linear language model;
      
      wherein receiving the probability scores generated by the log-linear language model comprises receiving probability scores generated by the log-linear language model based on (i) the feature scores for the non-linguistic features and (ii) the feature scores for the linguistic features.
  - 7. The method of claim 6, wherein determining the feature scores for the one or more linguistic features comprises determining n-gram scores based on one or more words spoken prior to the utterance.
  - 14. The method of claim 1, wherein providing the feature scores for the one or more non-linguistic features as input to the log-linear language model comprises:
    - providing the feature scores for the one or more non-linguistic features to a log-linear model that includes parameter values corresponding to the respective one or more non-linguistic features; and
      
      receiving the probability scores generated by the log-linear language model using the one or more feature scores for the non-linguistic features comprises:
      
      receiving probability scores generated based on the feature scores for the non-linguistic features and the corresponding parameter values.
  - 15. The method of claim 1, wherein providing the feature scores as input to the log-linear language model comprises:
    - providing the feature scores to a log-linear language model trained to generate word likelihood scores based on input scores indicating locations and user characteristics, the log-linear language model having been trained using training data indicating words spoken by different users in different contexts, the different contexts comprising different combinations of locations and user characteristics.
  - 16. The method of claim 1, wherein generating the feature scores for the one or more non-linguistic features comprises generating feature scores that are independent of previous words entered by the user;
    - wherein providing the feature scores for the one or more non-linguistic features as input to a log-linear language model comprises providing the feature scores generated independent of previous words entered by the user to a log-linear language model that has been trained to generate probability scores using feature scores generated independent of previous words entered by the user; and
      
      wherein receiving the probability scores generated by the log-linear language model comprises receiving probability scores generated by the log-linear language model using the one or more feature scores generated independent of previous words entered by the user.
  - 17. The method of claim 1, wherein the log-linear language model includes weights corresponding to the one or more non-linguistic features, the weights having values determined during training of the log-linear language model based on training examples indicating non-linguistic feature values representing different non-linguistic contexts associated with different words;
    - and wherein receiving the probability scores generated by the log-linear language model comprises receiving probability scores calculated, at least in part, by multiplying the one or more feature values for the non-linguistic features by the weights corresponding to the one or more non-linguistic features.
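
Claims 1, 14, and 17 describe a log-linear language model that multiplies feature scores by trained weights to produce probability scores. The following is a minimal sketch of that general form, not the patented implementation: each candidate word gets a weighted sum of the feature scores, and the sums are exponentiated and normalized over the vocabulary. The vocabulary, the weight values, and the function name probability_scores are all hypothetical; real weights would come from training.

```python
import math
from typing import Dict, List

# Hypothetical vocabulary and weights (illustrative values, not trained).
VOCAB: List[str] = ["menu", "meeting", "flight"]
WEIGHTS: Dict[str, Dict[str, float]] = {
    "menu": {"loc=restaurant": 2.0, "loc=work": -0.5},
    "meeting": {"loc=restaurant": -0.5, "loc=work": 2.0},
    "flight": {"loc=airport": 2.5},
}


def probability_scores(features: Dict[str, float]) -> Dict[str, float]:
    """Log-linear scoring: P(word) is proportional to exp(sum of weight * feature score)."""
    # Weighted sum of the feature scores for each candidate word.
    logits = {
        word: sum(WEIGHTS[word].get(name, 0.0) * value
                  for name, value in features.items())
        for word in VOCAB
    }
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits.values())
    unnormalized = {word: math.exp(z - peak) for word, z in logits.items()}
    total = sum(unnormalized.values())
    return {word: u / total for word, u in unnormalized.items()}


# Location feature scores for a user who is currently at a restaurant.
features = {"loc=restaurant": 1.0, "loc=work": 0.0, "loc=airport": 0.0}
probs = probability_scores(features)
print(probs)
```

With these made-up weights, "menu" receives the highest probability when the restaurant feature is active. Linguistic features such as the n-gram scores of claim 7 could be added to the same feature dictionary without changing the scoring function.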

8. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  receiving, by the one or more computers, context data indicating non-linguistic context for an utterance;
  
  generating, by the one or more computers and based on the context data, feature scores for one or more non-linguistic features, the generating comprising generating multiple location feature scores, each location feature score indicating whether a user is currently located at a location corresponding to the location feature score, wherein each of the multiple location feature scores corresponds to a different location;
  
  providing, by the one or more computers, the feature scores for the one or more non-linguistic features as input to a log-linear language model that has been trained to generate probability scores using feature scores for non-linguistic features, the providing comprising providing the multiple location feature scores to a log-linear language model that has been trained to generate probability scores in response to receiving multiple location feature scores corresponding to different locations;
  
  receiving, by the one or more computers, probability scores generated by the log-linear language model using the one or more feature scores for the non-linguistic features;
  
  determining, by the one or more computers, a transcription for the utterance using the probability scores generated by the log-linear language model; and
  
  providing, by the one or more computers, the transcription determined using the probability scores generated by the log-linear language model.
  - 9. The system of claim 8, wherein receiving the context data indicating the non-linguistic context for the utterance comprises receiving context data indicating an application through which the utterance is entered;
    - and wherein generating the feature scores for the one or more non-linguistic features comprises generating one or more feature scores indicating whether the utterance is entered using a particular application or an application in a particular class of applications.
  - 10. The system of claim 8, wherein receiving the context data indicating the non-linguistic context for the utterance comprises receiving context data indicating a gender of a speaker of the utterance;
    - and wherein generating the feature scores for the one or more non-linguistic features comprises generating one or more feature scores indicating the gender of the speaker of the utterance.
  - 11. The system of claim 8, wherein the operations further comprise receiving a set of candidate transcriptions for the utterance;
    - wherein determining the transcription for the utterance using the probability scores generated by the log-linear language model comprises:
      
      generating one or more scores for each of the candidate transcriptions in the set of candidate transcriptions based on the probability scores generated by the log-linear language model; and
      
      selecting one of the candidate transcriptions based on the one or more scores for the candidate transcriptions.
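
Claims 5 and 11 describe scoring each candidate transcription from the model's probability scores and selecting one. A minimal sketch follows, assuming each word already has a probability score and that a candidate's score is the sum of the log probabilities of its words. The table WORD_PROBS, the FLOOR value, and the function names are hypothetical, and the sketch ignores word history for brevity.

```python
import math
from typing import Dict, List

# Hypothetical per-word probability scores standing in for model output.
WORD_PROBS: Dict[str, float] = {
    "show": 0.5, "me": 0.5, "the": 0.5, "menu": 0.6, "venue": 0.1,
}
FLOOR = 1e-6  # assumed floor for words that have no score


def candidate_score(words: List[str]) -> float:
    """Sum of log probabilities of the words in one candidate transcription."""
    return sum(math.log(WORD_PROBS.get(word, FLOOR)) for word in words)


def select_transcription(candidates: List[List[str]]) -> List[str]:
    """Return the candidate with the highest score."""
    return max(candidates, key=candidate_score)


candidates = [
    ["show", "me", "the", "venue"],
    ["show", "me", "the", "menu"],
]
best = select_transcription(candidates)
print(" ".join(best))
```

Summing log probabilities rather than multiplying raw probabilities avoids numerical underflow on long candidates; the claims do not prescribe either choice.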

12. A non-transitory computer storage device encoded with a computer program, the program comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
- receiving, by the one or more computers, context data indicating non-linguistic context for an utterance;
  
  generating, by the one or more computers and based on the context data, feature scores for one or more non-linguistic features, the generating comprising generating multiple location feature scores, each location feature score indicating whether a user is currently located at a location corresponding to the location feature score, wherein each of the multiple location feature scores corresponds to a different location;
  
  providing, by the one or more computers, the feature scores for the one or more non-linguistic features as input to a log-linear language model that has been trained to generate probability scores using feature scores for non-linguistic features, the providing comprising providing the multiple location feature scores to a log-linear language model that has been trained to generate probability scores in response to receiving multiple location feature scores corresponding to different locations;
  
  receiving, by the one or more computers, probability scores generated by the log-linear language model using the one or more feature scores for the non-linguistic features;
  
  determining, by the one or more computers, a transcription for the utterance using the probability scores generated by the log-linear language model; and
  
  providing, by the one or more computers, the transcription determined using the probability scores generated by the log-linear language model.
  - 13. The computer storage device of claim 12, wherein receiving the context data indicating the non-linguistic context for the utterance comprises receiving context data indicating an application through which the utterance is entered;
    - and wherein generating the feature scores for the one or more non-linguistic features comprises generating one or more feature scores indicating whether the utterance is entered using a particular application or an application in a particular class of applications.
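
The application-based feature scores of claims 2, 3, 9, and 13 can be sketched in the same style. The example below assumes a hypothetical mapping from application names to application classes (APP_CLASSES) and emits one binary score per application plus one per class; none of these names or values come from the patent.

```python
from typing import Dict

# Hypothetical mapping from application to application class (an assumption).
APP_CLASSES: Dict[str, str] = {
    "maps": "navigation",
    "mail": "messaging",
    "sms": "messaging",
}


def application_feature_scores(app: str) -> Dict[str, float]:
    """One binary score per known application and per application class."""
    # 1.0 for the application the utterance was entered through, else 0.0.
    scores = {
        f"app={name}": 1.0 if name == app else 0.0 for name in APP_CLASSES
    }
    # 1.0 for the class that application belongs to, else 0.0.
    app_class = APP_CLASSES.get(app)
    for cls in sorted(set(APP_CLASSES.values())):
        scores[f"app_class={cls}"] = 1.0 if cls == app_class else 0.0
    return scores


app_scores = application_feature_scores("sms")
print(app_scores)
```

Class-level scores let a model share evidence across related applications, so an application seen rarely in training can still benefit from its class; whether a given system uses both levels is a design choice.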

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Biadsy, Fadi; Moreno Mengibar, Pedro J.
Primary Examiner(s): Mishra, Richa
Application Number: US14/179,257
Publication Number: US 20150228279A1
Time in Patent Office: 1,399 Days
Field of Search: None

CPC Class Codes:
G10L 15/08   Speech classification or se...
G10L 15/197   Probabilistic grammars, e.g...
G10L 15/26   Speech to text systems G10L...
G10L 2015/226   using non-speech characteri...
G10L 2015/228   of application context
