Custom language models for audio content
First Claim
1. A computer-implemented method comprising:
receiving a collection of source texts;
identifying a type from a collection of types for each source text, each source text being associated with a particular type;
generating, by data processing apparatus, a type-specific language model for each identified type using the source texts associated with the respective type;
storing the language models in a computer-readable medium;
receiving an audio source file to be processed, the audio source file having a particular type;
selecting a particular weighted combination of the language models, wherein weighting the particular weighted combination of the language models depends on the type of the audio source file to be processed; and
generating, by data processing apparatus, a text file from the audio source file based on the selected weighted combination of language models.
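The selecting step can be illustrated with a minimal sketch. It assumes, for illustration only, that each type-specific language model is a unigram table and that the weighted combination is a linear interpolation; the claim prescribes neither. The names `WEIGHTS_BY_AUDIO_TYPE`, `select_weights`, and `combined_probability` and all numeric values are hypothetical.

```python
from typing import Dict

# Hypothetical representation: a language model maps a word to a probability.
LanguageModel = Dict[str, float]

# Hypothetical weight table: audio type -> weight given to each model type.
WEIGHTS_BY_AUDIO_TYPE: Dict[str, Dict[str, float]] = {
    "news": {"news": 0.8, "sports": 0.2},
    "sports": {"news": 0.3, "sports": 0.7},
}


def select_weights(audio_type: str,
                   weight_table: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Pick the weights for this audio type and normalize them to sum to 1."""
    weights = weight_table[audio_type]
    total = sum(weights.values())
    return {model_type: w / total for model_type, w in weights.items()}


def combined_probability(word: str,
                         models: Dict[str, LanguageModel],
                         weights: Dict[str, float]) -> float:
    """Linear interpolation: weighted sum of each model's probability."""
    return sum(w * models[t].get(word, 0.0) for t, w in weights.items())


# Toy type-specific models (illustrative values only).
models = {
    "news": {"election": 0.6, "goal": 0.1, "the": 0.3},
    "sports": {"election": 0.05, "goal": 0.65, "the": 0.3},
}

sports_weights = select_weights("sports", WEIGHTS_BY_AUDIO_TYPE)
news_weights = select_weights("news", WEIGHTS_BY_AUDIO_TYPE)
p_goal_sports = combined_probability("goal", models, sports_weights)
p_goal_news = combined_probability("goal", models, news_weights)
```

In this sketch the same word receives a higher probability when the audio source file is of type "sports" than when it is of type "news", which is the effect the type-dependent weighting is meant to produce.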
Abstract
This specification describes technologies relating to generating custom language models for audio content. In some implementations, a computer-implemented method is provided that includes the actions of receiving a collection of source texts; identifying a type from a collection of types for each source text, each source text being associated with a particular type; generating, for each identified type, a type-specific language model using the source texts associated with the respective type; and storing the language models.
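The four actions in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: type identification is reduced to a hypothetical keyword count, each type-specific model is a unigram relative-frequency table, and "storing" is represented by JSON serialization. The names `identify_type`, `build_type_models`, and `keywords_by_type` are invented for the example.

```python
import json
from collections import Counter, defaultdict
from typing import Dict, List


def identify_type(text: str, keywords_by_type: Dict[str, List[str]]) -> str:
    """Hypothetical classifier: choose the type whose keywords occur most often.

    Ties (including the all-zero case) resolve to the first type listed.
    """
    words = text.lower().split()
    scores = {t: sum(words.count(k) for k in kws)
              for t, kws in keywords_by_type.items()}
    return max(scores, key=scores.get)


def build_type_models(source_texts: List[str],
                      keywords_by_type: Dict[str, List[str]]
                      ) -> Dict[str, Dict[str, float]]:
    """Group texts by identified type, then build one unigram model per type."""
    counts = defaultdict(Counter)
    for text in source_texts:
        text_type = identify_type(text, keywords_by_type)
        counts[text_type].update(text.lower().split())
    models = {}
    for text_type, counter in counts.items():
        total = sum(counter.values())
        models[text_type] = {w: n / total for w, n in counter.items()}
    return models


keywords_by_type = {"news": ["election", "senate"], "sports": ["goal", "match"]}
source_texts = [
    "the election result",
    "a late goal won the match",
    "senate vote on the election",
]

models = build_type_models(source_texts, keywords_by_type)
serialized = json.dumps(models)  # stand-in for storing the language models
```

A production system would likely use higher-order n-grams with smoothing; unigram counts are used here only to keep the example short.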
40 Claims
1. A computer-implemented method comprising:
receiving a collection of source texts;
identifying a type from a collection of types for each source text, each source text being associated with a particular type;
generating, by data processing apparatus, a type-specific language model for each identified type using the source texts associated with the respective type;
storing the language models in a computer-readable medium;
receiving an audio source file to be processed, the audio source file having a particular type;
selecting a particular weighted combination of the language models, wherein weighting the particular weighted combination of the language models depends on the type of the audio source file to be processed; and
generating, by data processing apparatus, a text file from the audio source file based on the selected weighted combination of language models.
Dependent claims: 2-13
14. A computer program product, encoded on a non-transitory computer-readable medium, operable to cause data processing apparatus to perform operations comprising:
receiving a collection of source texts;
identifying a type from a collection of types for each source text, each source text being associated with a particular type;
generating, for each identified type, a type-specific language model using the source texts associated with the respective type;
storing the language models;
receiving an audio source file to be processed, the audio source file having a particular type;
selecting a particular weighted combination of the language models, wherein weighting the particular weighted combination of the language models depends on the type of the audio source file to be processed; and
generating a text file from the audio source file based on the selected weighted combination of language models.
Dependent claims: 15-26
27. A system comprising:
a processor and a memory operable to perform operations including:
receiving a collection of source texts;
identifying a type from a collection of types for each source text, each source text being associated with a particular type;
generating, for each identified type, a type-specific language model using the source texts associated with the respective type;
storing the language models;
receiving an audio source file to be processed, the audio source file having a particular type;
selecting a particular weighted combination of the language models, wherein weighting the particular weighted combination of the language models depends on the type of the audio source file to be processed; and
generating a text file from the audio source file based on the selected weighted combination of language models.
Dependent claims: 28-39
40. A method performed by a computer programmed to provide one or more language models, the method comprising:
receiving a collection of source texts;
identifying a type from a collection of types for each source text, each source text being associated with a particular type;
generating, by the computer, a type-specific language model for each identified type using the source texts associated with the respective type;
storing the language models at the computer;
receiving an audio source file to be processed, the audio source file having a particular type;
selecting a weighted combination of the language models, wherein the weights are determined based on the type of the audio source file to be processed; and
generating, by the computer, a text file from the audio source file based on the selected weighted combination of language models.
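The final generating step can be sketched as a rescoring pass. The claims do not say how the weighted combination enters recognition, so this example assumes one common arrangement: a hypothetical recognizer has already produced candidate transcripts with acoustic log scores, and the interpolated language model picks among them. The functions `lm_log_score` and `transcribe`, the probability floor, and all numbers are assumptions made for illustration; writing the chosen text to a file is omitted.

```python
import math
from typing import Dict, List, Tuple


def lm_log_score(words: List[str],
                 models: Dict[str, Dict[str, float]],
                 weights: Dict[str, float],
                 floor: float = 1e-6) -> float:
    """Sum of log probabilities under the weighted combination of models.

    The floor avoids log(0) for words that no model has seen.
    """
    total = 0.0
    for word in words:
        p = sum(weights[t] * models[t].get(word, 0.0) for t in weights)
        total += math.log(max(p, floor))
    return total


def transcribe(candidates: List[Tuple[str, float]],
               models: Dict[str, Dict[str, float]],
               weights: Dict[str, float]) -> str:
    """Return the candidate text with the best acoustic + language score."""
    best = max(
        candidates,
        key=lambda c: c[1] + lm_log_score(c[0].split(), models, weights),
    )
    return best[0]


# Toy type-specific unigram models (illustrative values only).
models = {
    "news": {"the": 0.4, "poll": 0.5, "pole": 0.1},
    "sports": {"the": 0.4, "poll": 0.1, "pole": 0.5},
}

# Two acoustically indistinguishable candidates from a hypothetical recognizer.
candidates = [("the poll", -1.0), ("the pole", -1.0)]

news_text = transcribe(candidates, models, {"news": 0.9, "sports": 0.1})
sports_text = transcribe(candidates, models, {"news": 0.1, "sports": 0.9})
```

With identical acoustic scores, the type-dependent weights alone decide the output: the news weighting favors "the poll" and the sports weighting favors "the pole".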
Specification