Robust speech recognition

US 8,370,146 B1
Filed: 09/30/2011
Issued: 02/05/2013
Est. Priority Date: 08/31/2010
Status: Active Grant

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech input. In one aspect, a method includes receiving a user input and a grammar including annotations, the user input comprising audio data and the annotations providing syntax and semantics to the grammar, retrieving third-party statistical speech recognition information, the statistical speech recognition information being transmitted over a network, generating a statistical language model (SLM) based on the grammar and the statistical speech recognition information, the SLM preserving semantics of the grammar, processing the user input using the SLM to generate one or more results, comparing the one or more results to candidates provided in the grammar, identifying a particular candidate of the grammar based on the comparing, and providing the particular candidate for input to an application executed on a computing device.
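
The comparing and identifying steps in the abstract can be illustrated with a minimal sketch. It assumes the grammar is a flat list of candidate phrases, each carrying a semantic annotation, and that the recognizer returns text results ordered best first. The names `Candidate`, `normalize`, and `identify_candidate` are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    phrase: str      # surface form allowed by the grammar
    semantics: dict  # annotation: the meaning handed to the application


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so comparison ignores formatting."""
    return " ".join(text.lower().split())


def identify_candidate(results, grammar) -> Optional[Candidate]:
    """Return the grammar candidate matched by the earliest matching result."""
    by_phrase = {normalize(c.phrase): c for c in grammar}
    for result in results:  # assumption: results are ordered best first
        match = by_phrase.get(normalize(result))
        if match is not None:
            return match
    return None


# Hypothetical grammar with two annotated candidates.
GRAMMAR = [
    Candidate("call home", {"action": "call", "contact": "home"}),
    Candidate("play music", {"action": "play", "media": "music"}),
]

# Hypothetical recognizer output for one utterance.
results = ["Cole home", "call  Home"]
chosen = identify_candidate(results, GRAMMAR)
print(chosen.semantics if chosen else "no match")
```

Because the matched candidate keeps its annotation, the application receives the meaning of the utterance rather than a raw transcript.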

13 Claims

1. A system comprising:
- one or more processors;
  
  a computer-readable medium coupled to the one or more processors having instructions stored thereon which, when executed by the one or more processors, cause the one or more processors to perform operations comprising:
  
  receiving a user input and a grammar including annotations, the user input comprising audio data and the annotations providing syntax and semantics to the grammar;
  
  retrieving third-party statistical speech recognition information, the statistical speech recognition information being transmitted over a network;
  
  generating a statistical language model (SLM) based on the grammar and the statistical speech recognition information, the SLM preserving semantics of the grammar;
  
  processing the user input using the SLM to generate one or more results;
  
  comparing the one or more results to candidates provided in the grammar;
  
  identifying a particular candidate of the grammar based on the comparing;
  
  providing the particular candidate for input to an application executed on a computing device;
  
  translating the user input to a second language, different than a first language of the user input;
  
  generating a plurality of translation hypotheses based on the translating;
  
  translating each translation hypothesis of the plurality of translation hypotheses to the first language to provide a plurality of translated hypotheses; and
  
  appending the plurality of translated hypotheses as results to the one or more results.
- Dependent claims (2-11):
  - 2. The system of claim 1, wherein generating the SLM comprises:
    - retrieving a baseline SLM from computer memory; and
      
      modifying the baseline SLM based on the grammar and the statistical speech recognition information to generate the SLM.
  - 3. The system of claim 1, wherein the operations further comprise determining a weight associated with each result of the one or more results based on the statistical speech recognition information, wherein identifying a particular candidate is further based on the weight associated with each result.
  - 4. The system of claim 1, wherein processing the user input using the SLM to generate one or more results comprises applying a paraphrase function to the user input to generate the one or more results as one or more fragments.
  - 5. The system of claim 4, wherein the operations further comprise assigning a weight to each fragment of the one or more fragments, the weight corresponding to a degree of similarity between the user input and a respective fragment.
  - 6. The system of claim 1, wherein comparing the one or more results to candidates provided in the grammar comprises:
    - applying a paraphrase function to each of the one or more results to generate one or more paraphrased results; and
      
      comparing the one or more paraphrased results to the candidates.
  - 7. The system of claim 1, wherein the operations further comprise:
    - determining that no candidate of the grammar corresponds to the one or more results based on the comparing; and
      
      generating an error indication in response to determining that no candidate of the grammar corresponds to the one or more results.
  - 8. The system of claim 7, wherein the operations further comprise transmitting a request for additional user input.
  - 9. The system of claim 1, wherein the one or more processors are provided in a server, and the user input and the grammar are transmitted to the server from a client computing device over a network.
  - 10. The system of claim 1, wherein the user input is received through a microphone of a computing device comprising the one or more processors.
  - 11. The system of claim 1, wherein retrieving third-party statistical speech recognition information comprises using data obtained over the network from one or more knowledge bases, the one or more knowledge bases including the World Wide Web, query streams input to web-based query web sites, or both.
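
Claims 2 and 3 describe modifying a baseline SLM and weighting each result. The claims do not specify how the baseline is modified, so the sketch below uses linear interpolation of unigram models as one plausible technique. The tiny baseline corpus stands in for the third-party statistical information, and all function names and the interpolation weight `lam` are assumptions made for illustration.

```python
from collections import Counter


def unigram_model(sentences):
    """Maximum-likelihood unigram probabilities from a list of sentences."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def adapt(baseline, grammar_model, lam=0.5):
    """Linear interpolation: lam * grammar + (1 - lam) * baseline."""
    vocab = set(baseline) | set(grammar_model)
    return {
        w: lam * grammar_model.get(w, 0.0) + (1 - lam) * baseline.get(w, 0.0)
        for w in vocab
    }


def weight(result, model, floor=1e-6):
    """Product of unigram probabilities; unseen words get a small floor."""
    p = 1.0
    for word in result.lower().split():
        p *= model.get(word, floor)
    return p


# Assumption: a toy corpus stands in for the baseline statistics.
baseline = unigram_model(["call me later", "coal is cheap", "go home now"])
# The grammar's candidate phrases supply the domain-specific statistics.
grammar_model = unigram_model(["call home", "play music"])
slm = adapt(baseline, grammar_model, lam=0.5)

# Weight two competing results and keep the more probable one.
results = ["coal home", "call home"]
weights = {r: weight(r, slm) for r in results}
best = max(weights, key=weights.get)
print(best)
```

Interpolation keeps words the grammar never mentions, such as "coal", recognizable while shifting probability toward the grammar's vocabulary, so "call home" outweighs the acoustically similar "coal home".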

12. A computer-readable medium coupled to one or more processors having instructions stored thereon which, when executed by the one or more processors, cause the one or more processors to perform operations comprising:
- receiving a user input and a grammar including annotations, the user input comprising audio data and the annotations providing syntax and semantics to the grammar;
  
  retrieving third-party statistical speech recognition information, the statistical speech recognition information being transmitted over a network;
  
  generating a statistical language model (SLM) based on the grammar and the statistical speech recognition information, the SLM preserving semantics of the grammar;
  
  processing the user input using the SLM to generate one or more results;
  
  comparing the one or more results to candidates provided in the grammar;
  
  identifying a particular candidate of the grammar based on the comparing;
  
  providing the particular candidate for input to an application executed on a computing device;
  
  translating the user input to a second language, different than a first language of the user input;
  
  generating a plurality of translation hypotheses based on the translating;
  
  translating each translation hypothesis of the plurality of translation hypotheses to the first language to provide a plurality of translated hypotheses; and
  
  appending the plurality of translated hypotheses as results to the one or more results.
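
The last four operations of the claim describe a round trip through a second language. The sketch below assumes a text transcript stands in for the audio input and uses hard-coded lookup tables in place of a real translation service. The tables and the names `round_trip`, `to_french`, and `to_english` are hypothetical.

```python
def round_trip(user_text, to_second, to_first):
    """Translate to a second language and back; return the back-translations."""
    hypotheses = to_second(user_text)  # plurality of translation hypotheses
    translated = []
    for hyp in hypotheses:
        back = to_first(hyp)           # back into the first language
        if back not in translated:     # skip exact repeats
            translated.append(back)
    return translated


# Assumption: toy lookup tables stand in for a machine translation system.
EN_TO_FR = {"call the house": ["appeler la maison", "appeler chez moi"]}
FR_TO_EN = {
    "appeler la maison": "call home",
    "appeler chez moi": "call my place",
}


def to_french(text):
    return EN_TO_FR.get(text, [])


def to_english(text):
    return FR_TO_EN.get(text, text)


# Append the translated hypotheses to the existing recognition results.
results = ["call the house"]
results.extend(round_trip("call the house", to_french, to_english))
print(results)
```

In this toy case the round trip yields "call home", a phrasing the original result list lacked, which gives the comparing step an extra chance to match a grammar candidate.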

13. A computer-implemented method, comprising:
- receiving, at a computing device, a user input and a grammar including annotations, the user input comprising audio data and the annotations providing syntax and semantics to the grammar;
  
  retrieving third-party statistical speech recognition information from a computer-readable storage device, the statistical speech recognition information being transmitted to the computing device over a network;
  
  generating a statistical language model (SLM) based on the grammar and the statistical speech recognition information, the SLM preserving semantics of the grammar;
  
  processing the user input using the SLM to generate one or more results;
  
  comparing the one or more results to candidates provided in the grammar;
  
  identifying a particular candidate of the grammar based on the comparing;
  
  providing the particular candidate for input to an application;
  
  translating the user input to a second language, different than a first language of the user input;
  
  generating a plurality of translation hypotheses based on the translating;
  
  translating each translation hypothesis of the plurality of translation hypotheses to the first language to provide a plurality of translated hypotheses; and
  
  appending the plurality of translated hypotheses as results to the one or more results.

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Schalkwyk, Johan; Bringert, Bjorn; Singleton, David P.
Primary Examiner(s): Harper, Vincent P
Application Number: US13/249,628
Time in Patent Office: 494 Days
Field of Search: 704/235, 704/240, 704/251, 704/255, 704/257
US Class Current: 704/255
CPC Class Codes:
G10L 15/1815   Semantic context, e.g. disa...
G10L 15/197   Probabilistic grammars, e.g...
G10L 2015/223   Execution procedure of a sp...
