Robust speech recognition
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech input. In one aspect, a method includes receiving a user input and a grammar including annotations, the user input comprising audio data and the annotations providing syntax and semantics to the grammar, retrieving third-party statistical speech recognition information, the statistical speech recognition information being transmitted over a network, generating a statistical language model (SLM) based on the grammar and the statistical speech recognition information, the SLM preserving semantics of the grammar, processing the user input using the SLM to generate one or more results, comparing the one or more results to candidates provided in the grammar, identifying a particular candidate of the grammar based on the comparing, and providing the particular candidate for input to an application executed on a computing device.
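The step of generating an SLM from a grammar and externally supplied statistics can be illustrated with a minimal sketch. It assumes a bigram model with add-one smoothing, which the abstract does not specify. The grammar contents, the semantic tags, the background counts standing in for the third-party information, and the names build_slm and log_prob are all hypothetical.

```python
from collections import Counter
import math

# Hypothetical annotated grammar: each phrase maps to a semantic tag.
GRAMMAR = {
    "call home": "CALL(home)",
    "call office": "CALL(office)",
    "play music": "PLAY(music)",
}

# Hypothetical stand-in for third-party statistics retrieved over a network:
# background bigram counts from general usage.
BACKGROUND_BIGRAMS = Counter({
    ("<s>", "please"): 5,
    ("please", "call"): 4,
    ("call", "home"): 3,
    ("call", "the"): 2,
    ("the", "office"): 2,
    ("office", "</s>"): 2,
    ("home", "</s>"): 3,
})


def bigrams(phrase):
    """Return the bigrams of a phrase, padded with start/end markers."""
    tokens = ["<s>"] + phrase.split() + ["</s>"]
    return list(zip(tokens, tokens[1:]))


def build_slm(grammar, background, grammar_weight=5):
    """Merge grammar-derived bigram counts with background counts."""
    counts = Counter(background)
    for phrase in grammar:
        for bg in bigrams(phrase):
            counts[bg] += grammar_weight
    context_totals = Counter()
    for (prev, _), n in counts.items():
        context_totals[prev] += n
    vocab = {word for bg in counts for word in bg}
    return counts, context_totals, len(vocab)


def log_prob(phrase, slm):
    """Score a phrase under the merged model with add-one smoothing."""
    counts, totals, vocab_size = slm
    return sum(
        math.log((counts[bg] + 1) / (totals[bg[0]] + vocab_size))
        for bg in bigrams(phrase)
    )


slm = build_slm(GRAMMAR, BACKGROUND_BIGRAMS)
```

In this sketch the grammar dictionary is left untouched, so the semantic tag of each candidate stays available after the model is built. The merged counts let a phrasing outside the grammar, such as "please call home", score well above a scrambled word order.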
13 Claims
1. A system comprising:
one or more processors;
a computer-readable medium coupled to the one or more processors having instructions stored thereon which, when executed by the one or more processors, cause the one or more processors to perform operations comprising:
receiving a user input and a grammar including annotations, the user input comprising audio data and the annotations providing syntax and semantics to the grammar;
retrieving third-party statistical speech recognition information, the statistical speech recognition information being transmitted over a network;
generating a statistical language model (SLM) based on the grammar and the statistical speech recognition information, the SLM preserving semantics of the grammar;
processing the user input using the SLM to generate one or more results;
comparing the one or more results to candidates provided in the grammar;
identifying a particular candidate of the grammar based on the comparing;
providing the particular candidate for input to an application executed on a computing device;
translating the user input to a second language, different than a first language of the user input;
generating a plurality of translation hypotheses based on the translating;
translating each translation hypothesis of the plurality of translation hypotheses to the first language to provide a plurality of translated hypotheses; and
appending the plurality of translated hypotheses as results to the one or more results.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computer-readable medium coupled to one or more processors having instructions stored thereon which, when executed by the one or more processors, cause the one or more processors to perform operations comprising:
receiving a user input and a grammar including annotations, the user input comprising audio data and the annotations providing syntax and semantics to the grammar;
retrieving third-party statistical speech recognition information, the statistical speech recognition information being transmitted over a network;
generating a statistical language model (SLM) based on the grammar and the statistical speech recognition information, the SLM preserving semantics of the grammar;
processing the user input using the SLM to generate one or more results;
comparing the one or more results to candidates provided in the grammar;
identifying a particular candidate of the grammar based on the comparing;
providing the particular candidate for input to an application executed on a computing device;
translating the user input to a second language, different than a first language of the user input;
generating a plurality of translation hypotheses based on the translating;
translating each translation hypothesis of the plurality of translation hypotheses to the first language to provide a plurality of translated hypotheses; and
appending the plurality of translated hypotheses as results to the one or more results.
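The comparing and identifying steps recited in the claims might look like the sketch below. The claims do not say how results are compared to grammar candidates, so plain string similarity from Python's difflib is assumed here purely for illustration. The candidate table, the cutoff value, and the name identify_candidate are hypothetical.

```python
import difflib

# Hypothetical grammar candidates with their semantic annotations.
GRAMMAR_CANDIDATES = {
    "call home": "CALL(home)",
    "call office": "CALL(office)",
    "play music": "PLAY(music)",
}


def identify_candidate(results, candidates, cutoff=0.6):
    """Return (candidate, semantics) for the best match, or None.

    Every recognition result is compared with every grammar candidate;
    the pair with the highest similarity ratio at or above the cutoff wins.
    """
    best = None
    best_score = cutoff
    for result in results:
        for phrase, semantics in candidates.items():
            score = difflib.SequenceMatcher(None, result, phrase).ratio()
            if score >= best_score:
                best, best_score = (phrase, semantics), score
    return best


match = identify_candidate(["please call home", "paul comb"], GRAMMAR_CANDIDATES)
```

The returned pair carries the semantic annotation along with the phrase, which is one way the identified candidate could be handed to an application.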
13. A computer-implemented method, comprising:
receiving, at a computing device, a user input and a grammar including annotations, the user input comprising audio data and the annotations providing syntax and semantics to the grammar;
retrieving third-party statistical speech recognition information from a computer-readable storage device, the statistical speech recognition information being transmitted to the computing device over a network;
generating a statistical language model (SLM) based on the grammar and the statistical speech recognition information, the SLM preserving semantics of the grammar;
processing the user input using the SLM to generate one or more results;
comparing the one or more results to candidates provided in the grammar;
identifying a particular candidate of the grammar based on the comparing;
providing the particular candidate for input to an application;
translating the user input to a second language, different than a first language of the user input;
generating a plurality of translation hypotheses based on the translating;
translating each translation hypothesis of the plurality of translation hypotheses to the first language to provide a plurality of translated hypotheses; and
appending the plurality of translated hypotheses as results to the one or more results.
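The translation steps that close each independent claim amount to a round trip through a second language. The sketch below illustrates the idea under two assumptions: that recognized text stands in for the user input, and that translation can be mocked with small lookup tables. Both tables and the name round_trip_expand are hypothetical; a real system would call a machine translation component.

```python
# Toy lookup tables standing in for real translators (hypothetical data).
EN_TO_ES = {
    "call home": ["llamar a casa", "telefonear a casa"],
}
ES_TO_EN = {
    "llamar a casa": "call home",
    "telefonear a casa": "phone home",
}


def round_trip_expand(results, forward, backward):
    """Append round-trip translations of each result to the result list.

    Each result is translated into several second-language hypotheses,
    each hypothesis is translated back to the first language, and the
    back-translations are appended after the original results.
    """
    expanded = list(results)
    for text in results:
        for hypothesis in forward.get(text, []):
            back = backward.get(hypothesis)
            if back is not None:
                expanded.append(back)
    return expanded


expanded = round_trip_expand(["call home"], EN_TO_ES, ES_TO_EN)
```

Back-translation can surface paraphrases such as "phone home" that the recognizer did not produce directly, widening the set of results available to the comparison against grammar candidates.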
Specification