LANGUAGE MODELING IN SPEECH RECOGNITION

US 20150279360A1
Filed: 04/01/2014
Published: 10/01/2015
Est. Priority Date: 04/01/2014
Status: Active Grant

Abstract

Some implementations include a computer-implemented method. The method can include providing a training set of text samples to a semantic parser that associates text samples with actions. The method can include obtaining, for each of one or more of the text samples of the training set, data that indicates one or more domains that the semantic parser has associated with the text sample. For each of one or more domains, a subset of the text samples of the training set can be generated that the semantic parser has associated with the domain. Using the subset of text samples associated with the domain, a language model can be generated for one or more of the domains. Speech recognition can be performed on an utterance using the one or more language models that are generated for the one or more of the domains.
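
The flow described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: `toy_semantic_parser` is a hypothetical keyword lookup standing in for a real semantic parser, and `BigramModel` is an add-one-smoothed bigram model chosen only because it is small enough to read in full.

```python
import math
from collections import Counter, defaultdict

def toy_semantic_parser(text):
    """Hypothetical stand-in for a semantic parser: maps keywords to domains."""
    keywords = {"call": "phone", "dial": "phone",
                "navigate": "maps", "directions": "maps"}
    words = text.split()
    return {domain for word, domain in keywords.items() if word in words}

def partition_by_domain(samples, parser):
    """Group text samples into one subset per domain the parser assigns."""
    subsets = defaultdict(list)
    for text in samples:
        for domain in parser(text):
            subsets[domain].append(text)
    return dict(subsets)

class BigramModel:
    """Add-one-smoothed bigram language model (an illustrative choice)."""

    def __init__(self, samples):
        self.contexts = Counter()   # how often each token appears as a history
        self.bigrams = Counter()    # counts of (previous, current) token pairs
        self.vocab = {"</s>"}
        for text in samples:
            tokens = ["<s>"] + text.split() + ["</s>"]
            self.vocab.update(tokens[1:])
            for prev, cur in zip(tokens, tokens[1:]):
                self.contexts[prev] += 1
                self.bigrams[(prev, cur)] += 1

    def log_prob(self, text):
        """Log-probability of a sentence under the smoothed model."""
        tokens = ["<s>"] + text.split() + ["</s>"]
        size = len(self.vocab) + 1  # one extra bucket for unseen words
        return sum(
            math.log((self.bigrams[(prev, cur)] + 1) / (self.contexts[prev] + size))
            for prev, cur in zip(tokens, tokens[1:])
        )

training_set = ["call mom", "call the office", "dial home",
                "navigate home", "directions to the office"]
subsets = partition_by_domain(training_set, toy_semantic_parser)
models = {domain: BigramModel(texts) for domain, texts in subsets.items()}
```

In this toy setup, `models["phone"]` assigns a higher log-probability to "call home" than `models["maps"]` does, which is the kind of domain-specific preference a recognizer could exploit when scoring candidate transcriptions.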

20 Claims

1. A computer-implemented method, comprising:
- providing a training set of text samples to a semantic parser that associates text samples with domains;
  
  obtaining data that indicates associations determined by the semantic parser between at least some of the text samples of the training set and one or more domains;
  
  generating a first subset of text samples that the semantic parser has associated with a first of the one or more domains;
  
  generating a first language model for the first of the one or more domains using the first subset of text samples that the semantic parser has associated with the first of the one or more domains; and
  
  performing speech recognition on an utterance using the first language model for the first of the one or more domains.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14):
  - 2. The computer-implemented method of claim 1, wherein the text samples in the training set are identified from at least one of records of past search queries, web pages, books, periodicals, and other electronic documents.
  - 3. The computer-implemented method of claim 1, wherein at least some of the text samples in the training set are identified from records of past utterances spoken by a population of users.
  - 4. The computer-implemented method of claim 1, wherein performing speech recognition on the utterance further comprises using, along with the first language model for the first of the one or more domains, a general language model that is not associated with particular domains.
  - 5. The computer-implemented method of claim 1, further comprising:
    - generating a second subset of text samples that the semantic parser has associated with a second of the one or more domains; and
      
      generating a second language model for the second of the one or more domains using the second subset of text samples that the semantic parser has associated with the second of the one or more domains.
  - 6. The computer-implemented method of claim 5, wherein performing speech recognition on the utterance further comprises using the second language model for the second of the one or more domains.
  - 7. The computer-implemented method of claim 6, wherein performing speech recognition on the utterance comprises:
    - obtaining a first transcription of the utterance from the first language model and a second transcription of the utterance from the second language model;
      
      obtaining respective scores for the first transcription and the second transcription that indicate respective likelihoods that the first transcription or the second transcription accurately reflects the utterance; and
      
      selecting the first transcription or the second transcription to provide to a user based at least on the respective scores for the first transcription and the second transcription.
  - 8. The computer-implemented method of claim 7, further comprising identifying context information associated with the utterance, and using the context information to bias the respective scores for the transcriptions.
  - 9. The computer-implemented method of claim 8, wherein using the context information to bias the respective scores for the transcriptions comprises determining whether the context information is consistent with the first of the one or more domains or the second of the one or more domains.
  - 10. The computer-implemented method of claim 1, further comprising obtaining, for particular ones of the text samples of the training set, a confidence score that indicates a confidence of the association between the text sample and the one or more domains that the semantic parser has associated with the text sample.
  - 11. The computer-implemented method of claim 10, further comprising identifying data that indicates user confirmation of the one or more domains that the semantic parser has associated with a particular one of the text samples, and in response, biasing the confidence score for the particular one of the text samples to indicate a greater confidence in the association between the particular one of the text samples and the one or more domains.
  - 12. The computer-implemented method of claim 10, wherein generating the first subset of text samples that the semantic parser has associated with the first of the one or more domains comprises excluding text samples from the first subset of the text samples that have confidence scores below a predetermined threshold.
  - 13. The computer-implemented method of claim 1, wherein generating the first language model for the first of the one or more domains comprises identifying terms in the text samples that are associated with a class, and wherein performing speech recognition on the utterance using the first language model comprises accessing lists of terms associated with the class.
  - 14. The computer-implemented method of claim 1, wherein the one or more domains are one or more actions that a user may request or command to be executed.
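
Claims 7 to 12 describe two mechanisms that a short sketch may make concrete: excluding low-confidence samples from a domain subset (with user confirmation raising confidence), and choosing between per-domain transcriptions using scores biased by context. The data layout, the threshold of 0.5, the additive boost of 0.2, and the additive context bonus are all assumptions made for illustration; the claims do not specify how scores are represented or combined.

```python
from dataclasses import dataclass

@dataclass
class LabeledSample:
    text: str
    domain: str
    confidence: float            # parser's confidence in the association, 0..1
    user_confirmed: bool = False

def filter_subset(samples, domain, threshold=0.5, confirmation_boost=0.2):
    """Keep texts for `domain` whose (possibly boosted) confidence meets the threshold."""
    kept = []
    for sample in samples:
        if sample.domain != domain:
            continue
        score = sample.confidence
        if sample.user_confirmed:
            # Assumed form of "biasing" toward greater confidence: additive, capped at 1.
            score = min(1.0, score + confirmation_boost)
        if score >= threshold:
            kept.append(sample.text)
    return kept

def select_transcription(candidates, context_domain=None, context_bonus=1.0):
    """Pick the best transcription from (domain, text, score) candidates.

    A candidate whose domain is consistent with the context receives an
    assumed additive bonus before the scores are compared.
    """
    def biased_score(candidate):
        domain, _, score = candidate
        return score + (context_bonus if domain == context_domain else 0.0)
    return max(candidates, key=biased_score)[1]

samples = [
    LabeledSample("call mom", "phone", 0.9),
    LabeledSample("call it a day", "phone", 0.3),
    LabeledSample("phone home", "phone", 0.4, user_confirmed=True),
]
candidates = [
    ("phone", "call ann arbor dental", -6.0),
    ("maps", "all ann arbor dental", -5.5),
]
```

Here `filter_subset(samples, "phone")` drops "call it a day" but keeps the user-confirmed "phone home". Without context, `select_transcription(candidates)` returns the higher-scoring maps hypothesis; passing `context_domain="phone"` tips the choice to the phone hypothesis.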

15. One or more computer-readable storage devices having instructions stored thereon that, when executed by one or more computers, cause the one or more computers to perform operations comprising:
- providing a training set of text samples to a semantic parser that associates text samples with domains;
  
  obtaining data that indicates associations determined by the semantic parser between at least some of the text samples of the training set and one or more domains;
  
  generating a first subset of text samples that the semantic parser has associated with a first of the one or more domains;
  
  generating a first language model for the first of the one or more domains using the first subset of text samples that the semantic parser has associated with the first of the one or more domains; and
  
  performing speech recognition on an utterance using the first language model for the first of the one or more domains.
- Dependent claims (16, 17, 18, 19):
  - 16. The one or more computer-readable storage devices of claim 15, wherein the text samples in the training set are identified from at least one of records of past search queries, web pages, books, periodicals, and other electronic documents.
  - 17. The one or more computer-readable storage devices of claim 15, wherein at least some of the text samples in the training set are identified from records of past utterances spoken by a population of users.
  - 18. The one or more computer-readable storage devices of claim 15, wherein performing speech recognition on the utterance further comprises using, along with the first language model for the first of the one or more domains, a general language model that is not associated with particular domains.
  - 19. The one or more computer-readable storage devices of claim 15, wherein the operations further comprise:
    - generating a second subset of text samples that the semantic parser has associated with a second of the one or more domains; and
      
      generating a second language model for the second of the one or more domains using the second subset of text samples that the semantic parser has associated with the second of the one or more domains.
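
Claims 4 and 18 recite using a general language model along with the domain-specific one, without saying how the two are combined. One common way to do this, shown here purely as an assumption, is linear interpolation of the two models' conditional probabilities. The probability tables are hand-written toy values, and the weight of 0.7 and the floor of 1e-6 for unseen bigrams are arbitrary.

```python
import math

def interpolate(domain_prob, general_prob, weight=0.7):
    """Linear interpolation: weight on the domain model, the rest on the general model."""
    return weight * domain_prob + (1.0 - weight) * general_prob

def sentence_log_prob(tokens, domain_lm, general_lm, weight=0.7, floor=1e-6):
    """Score a token list using interpolated bigram probabilities.

    Each model is a dict mapping (previous, current) to a conditional
    probability; unseen bigrams fall back to a small floor value.
    """
    total = 0.0
    for prev, cur in zip(["<s>"] + tokens, tokens + ["</s>"]):
        p_domain = domain_lm.get((prev, cur), floor)
        p_general = general_lm.get((prev, cur), floor)
        total += math.log(interpolate(p_domain, p_general, weight))
    return total

# Toy conditional probabilities, invented for illustration only.
domain_lm = {("<s>", "call"): 0.6, ("call", "mom"): 0.3, ("mom", "</s>"): 0.9}
general_lm = {("<s>", "call"): 0.05, ("call", "mom"): 0.02, ("mom", "</s>"): 0.4}

combined = sentence_log_prob(["call", "mom"], domain_lm, general_lm)
general_only = sentence_log_prob(["call", "mom"], {}, general_lm)
```

With these values `combined` exceeds `general_only`, reflecting the extra probability mass the domain model contributes for an in-domain phrase, while the general model still provides coverage for word sequences the domain model has never seen.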

20. A system comprising:
- one or more computers configured to provide:
  
  a repository of training data that includes a plurality of text samples in a natural language;
  
  a semantic parser configured to process a set of text samples from the plurality of text samples to determine, for each text sample in the set of text samples, a domain associated with the text sample;
  
  a training set manager configured to generate subsets of text samples that correspond to respective domains, wherein each subset of text samples includes text samples that the semantic parser has associated with the domain that corresponds to the subset of text samples;
  
  a language modeling engine configured to generate a respective language model for each of the subsets of text samples; and
  
  a speech recognizer configured to receive an utterance and to recognize the utterance using one or more of the language models that are generated for each of the subsets of text samples.

Current Assignee
Google LLC (Alphabet Inc.)
Original Assignee
Google Inc. (Alphabet Inc.)
Inventors
Mengibar, Pedro J. Moreno; Epstein, Mark Edward

Granted Patent

US 9,286,892 B2
US Class Current

1/1
CPC Class Codes

G06F 40/205   Parsing

G06F 40/289   Phrasal analysis, e.g. fini...

G06F 40/30   Semantic analysis

G10L 15/063   Training

G10L 15/18   using natural language mode...

G10L 15/1815   Semantic context, e.g. disa...
