Cross-lingual initialization of language models
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for initializing language models for automatic speech recognition. In one aspect, a method includes receiving logged speech recognition results from an existing corpus that is specific to a given language and a target context, generating a target corpus by machine-translating the logged speech recognition results from the given language to a different, target language, and estimating a language model that is specific to the different, target language and the same, target context, using the target corpus.
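The flow summarized in the abstract (translate logged recognition results, then estimate a model from the translated text) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `TOY_LEXICON` and `toy_translate` stand in for a real machine translation engine, and the unsmoothed bigram estimator is only one simple choice of language model, since the abstract does not specify a model type.

```python
from collections import Counter

# Hypothetical word-for-word lexicon standing in for a real
# machine translation engine (given language -> target language).
TOY_LEXICON = {"weather": "wetter", "today": "heute", "in": "in", "paris": "paris"}


def toy_translate(sentence):
    """Translate word by word; unknown words pass through unchanged."""
    return " ".join(TOY_LEXICON.get(w, w) for w in sentence.lower().split())


def estimate_bigram_model(corpus):
    """Maximum-likelihood bigram estimates: P(w2 | w1) = count(w1 w2) / count(w1)."""
    history_counts = Counter()
    bigram_counts = Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for w1, w2 in zip(tokens, tokens[1:]):
            history_counts[w1] += 1
            bigram_counts[(w1, w2)] += 1
    return {pair: n / history_counts[pair[0]] for pair, n in bigram_counts.items()}


# Logged recognition results from an existing corpus (given language, one context).
logged_results = ["weather today", "weather in paris", "weather today"]

# Generate the target corpus by translating each logged result.
target_corpus = [toy_translate(s) for s in logged_results]

# Estimate a model for the target language and the same context.
model = estimate_bigram_model(target_corpus)
```

A production system would typically add smoothing so that unseen word pairs do not receive zero probability; that step is omitted here to keep the sketch short.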
18 Claims
1. A computer-implemented method performed by at least one processor, the method comprising:
- receiving audio input in a target language, and a target context of the audio input;
- determining that a target corpus that corresponds to the target language and the target context is unavailable;
- receiving logged speech recognition results that correspond to an existing corpus that is specific to a given language that differs from the target language, and to the same target context that corresponds to the received target language audio input and the logged speech recognition results;
- generating the target corpus that corresponds to the target language and the target context by machine-translating the logged speech recognition results corresponding to the given language to the target language; and
- estimating a context-specific language model that is specific to both the target language and the target context using the generated target corpus.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system comprising:
- one or more non-transitory computer-readable storage media storing data that represents an existing corpus;
- an automated speech recognition engine, executable on one or more processors having access to the computer-readable storage media, and operable to receive audio input in a target language, and a target context of the audio, and further operable to determine that a target corpus that corresponds to the target language and the target context is unavailable;
- a machine translation engine, executable on one or more processors having access to the computer-readable storage media, and operable to receive logged speech recognition results that correspond to an existing corpus that is specific for a given language that differs from the target language, and to the same target context that corresponds to the received target language audio input and the logged speech recognition results, and further operable to generate the target corpus that corresponds to the target language and the target context by machine-translating the logged speech recognition results corresponding to the given language to the target language, wherein results of the translation are stored in the computer-readable storage medium as the target corpus; and
- a language model generator, executable on one or more processors having access to the computer-readable storage media, and operable to estimate a context-specific language model that is specific to both the target language and the target context using the generated target corpus.
Dependent claims: 11, 12, 13
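The division of labor in the system claim (shared storage, an availability check, a translation engine that writes its output back to storage, and a model generator) can be sketched as follows. This is a rough illustration under stated assumptions, not the patented implementation: the names `CorpusStore`, `Pipeline`, and `model_for` are hypothetical, the translation function is a stub that only tags text with the target language code, and the "model" is a plain word-count table.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

CorpusKey = Tuple[str, str]  # (language, context)


@dataclass
class CorpusStore:
    """Hypothetical stand-in for the storage media holding corpora."""
    corpora: Dict[CorpusKey, List[str]] = field(default_factory=dict)

    def is_available(self, language: str, context: str) -> bool:
        # A corpus counts as available only if it exists and is non-empty.
        return bool(self.corpora.get((language, context)))


@dataclass
class Pipeline:
    store: CorpusStore
    # (text, source_language, target_language) -> translated text
    translate: Callable[[str, str, str], str]
    # corpus -> model
    estimate: Callable[[List[str]], Counter]

    def model_for(self, target_lang: str, context: str, source_lang: str) -> Counter:
        if not self.store.is_available(target_lang, context):
            # Raises KeyError if the existing corpus is missing as well.
            logs = self.store.corpora[(source_lang, context)]
            # Store the translated results as the new target corpus.
            self.store.corpora[(target_lang, context)] = [
                self.translate(text, source_lang, target_lang) for text in logs
            ]
        return self.estimate(self.store.corpora[(target_lang, context)])


# Stub components, assumed for illustration only.
def stub_translate(text: str, source_lang: str, target_lang: str) -> str:
    return f"[{target_lang}] {text}"


def stub_estimate(corpus: List[str]) -> Counter:
    return Counter(word for sentence in corpus for word in sentence.split())


store = CorpusStore({("en", "sms"): ["see you soon", "on my way"]})
pipeline = Pipeline(store, stub_translate, stub_estimate)
was_available = store.is_available("de", "sms")
model = pipeline.model_for("de", "sms", "en")
```

Passing the translation and estimation steps in as callables keeps the availability check and storage logic independent of any particular engine.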
14. A non-transitory computer storage medium encoded with a computer program, the program comprising instructions that when executed by data processing apparatus cause the data processing apparatus to perform operations comprising:
- receiving audio input in a target language, and a target context of the audio input;
- determining that a target corpus that corresponds to the target language and the target context is unavailable;
- receiving logged speech recognition results that correspond to an existing corpus that is specific to a given language that differs from the target language, and to the same target context that corresponds to the received target language audio input and the logged speech recognition results;
- generating the target corpus that corresponds to the target language and the target context by machine-translating the logged speech recognition results corresponding to the given language to the target language; and
- estimating a context-specific language model that is specific to both the target language and the target context using the generated target corpus.
Dependent claims: 15, 16, 17, 18
Specification