DYNAMICALLY BIASING LANGUAGE MODELS
First Claim
1. A method performed by one or more computers, the method comprising:
receiving audio data encoding one or more utterances;
generating a recognition lattice of the one or more utterances by performing speech recognition on the audio data using a first pass speech recognizer;
determining a specific context for the one or more utterances based on the recognition lattice;
in response to determining that the recognition lattice defines the specific context, generating a transcription of the one or more utterances by performing speech recognition on the audio data using a second pass speech recognizer biased towards the specific context defined by the recognition lattice; and
providing an output of the transcription of the one or more utterances.
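The control flow of this claim can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the lattice is simplified to a list of scored hypotheses, the keyword table `CONTEXT_KEYWORDS`, the `threshold` parameter, and the fallback to the best first-pass hypothesis are all hypothetical, and both recognizers are passed in as plain callables.

```python
from typing import Callable, Optional

# Hypothetical stand-in for a recognition lattice: a list of
# (hypothesis text, posterior probability) pairs. A real lattice is a graph.
Lattice = list[tuple[str, float]]

# Assumed keyword sets per context; these are not taken from the claim.
CONTEXT_KEYWORDS = {
    "navigation": {"directions", "navigate", "route"},
    "music": {"play", "song", "album"},
}


def determine_context(lattice: Lattice, threshold: float = 0.5) -> Optional[str]:
    """Return the context whose keywords carry the most posterior mass.

    Returns None when no context reaches the (assumed) threshold, i.e. the
    lattice does not define a specific context.
    """
    mass = {name: 0.0 for name in CONTEXT_KEYWORDS}
    for text, prob in lattice:
        words = set(text.lower().split())
        for name, keywords in CONTEXT_KEYWORDS.items():
            if words & keywords:
                mass[name] += prob
    best = max(mass, key=mass.get)
    return best if mass[best] >= threshold else None


def transcribe(
    audio: bytes,
    first_pass: Callable[[bytes], Lattice],
    second_pass: Callable[[bytes, str], str],
) -> str:
    """Run the first pass, then a biased second pass if a context is found."""
    lattice = first_pass(audio)
    context = determine_context(lattice)
    if context is None:
        # Assumed fallback: the claim only covers the case where a context is found.
        return max(lattice, key=lambda path: path[1])[0]
    return second_pass(audio, context)
```

The second pass runs on the same audio rather than on the first-pass text, which mirrors the claim's wording that both passes perform speech recognition "on the audio data".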
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition. In one aspect, a method comprises receiving audio data encoding one or more utterances; performing a first speech recognition on the audio data; identifying a context based on the first speech recognition; performing a second speech recognition on the audio data that is biased towards the context; and providing an output of the second speech recognition.
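The abstract does not say how the second recognition is biased. One common approach, shown here purely as an assumption, is to interpolate a general language model with a context-specific one and rescore candidate transcriptions. The unigram dictionaries, the `weight` parameter, and the `floor` probability for unseen words are all hypothetical.

```python
import math


def biased_score(words, base_lm, context_lm, weight=0.5, floor=1e-6):
    """Log-probability of `words` under a linear interpolation of two unigram models.

    `weight` is the share given to the context model; 0.0 means no biasing.
    """
    total = 0.0
    for word in words:
        p = (1 - weight) * base_lm.get(word, floor) + weight * context_lm.get(word, floor)
        total += math.log(p)
    return total


def rescore(hypotheses, base_lm, context_lm, weight=0.5):
    """Pick the hypothesis with the highest interpolated score."""
    return max(
        hypotheses,
        key=lambda h: biased_score(h.split(), base_lm, context_lm, weight),
    )


# Toy models: the general model slightly prefers "pray", the music context prefers "play".
base = {"play": 0.1, "pray": 0.12, "music": 0.1}
music = {"play": 0.3, "music": 0.3}
candidates = ["pray music", "play music"]

unbiased = rescore(candidates, base, music, weight=0.0)
biased = rescore(candidates, base, music, weight=0.5)
```

With these toy numbers the unbiased choice follows the general model, while the biased choice shifts to the hypothesis favored by the context model.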
21 Claims
1. A method performed by one or more computers, the method comprising:
receiving audio data encoding one or more utterances;
generating a recognition lattice of the one or more utterances by performing speech recognition on the audio data using a first pass speech recognizer;
determining a specific context for the one or more utterances based on the recognition lattice;
in response to determining that the recognition lattice defines the specific context, generating a transcription of the one or more utterances by performing speech recognition on the audio data using a second pass speech recognizer biased towards the specific context defined by the recognition lattice; and
providing an output of the transcription of the one or more utterances.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A system comprising:
one or more computers; and
one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving audio data encoding one or more utterances;
generating a recognition lattice of the one or more utterances by performing speech recognition on the audio data using a first pass speech recognizer;
determining a specific context for the one or more utterances based on the recognition lattice;
in response to determining that the recognition lattice defines the specific context, generating a transcription of the one or more utterances by performing speech recognition on the audio data using a second pass speech recognizer biased towards the specific context defined by the recognition lattice; and
providing an output of the transcription of the one or more utterances.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
21. A system of one or more computers configured to perform operations comprising:
receiving audio data encoding one or more utterances;
performing a first speech recognition on the audio data;
identifying a context based on the first speech recognition;
performing a second speech recognition on the audio data that is biased towards the context; and
providing an output of the second speech recognition.
Specification