TRAINING ACOUSTIC MODELS
First Claim
1. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving speech data and data identifying a transcription for the speech data;
accessing a phonetic representation for the transcription;
extracting training sequences from the phonetic representation for a particular phone in the phonetic representation, each of the training sequences including a different set of contextual phones surrounding the particular phone;
identifying a partitioning key based on a sequence of phones that occurs in each of the training sequences;
selecting, from among a plurality of processing modules, a processing module to which the identified partitioning key is assigned, the processing module being designated to train a portion of an acoustic model that corresponds to the identified partitioning key; and
transmitting, to the selected processing module, (i) data identifying the training sequences and (ii) a portion of the speech data that corresponds to the training sequence that includes the most contextual phones.
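The extraction and partitioning-key steps recited above can be illustrated with a minimal sketch. It is not the claimed implementation: the function names, the symmetric context window, the `max_context` limit, and the choice of the central triphone as the key are all assumptions made for illustration. The claim requires only that the key be based on a sequence of phones that occurs in each training sequence.

```python
from typing import List, Tuple

Sequence = Tuple[str, ...]


def extract_training_sequences(phones: List[str], index: int,
                               max_context: int = 3) -> List[Sequence]:
    """Return one sequence per context size around phones[index].

    Assumption: contexts are symmetric (k phones on each side), so each
    sequence contains a different set of contextual phones.
    """
    sequences = []
    for k in range(1, max_context + 1):
        left, right = index - k, index + k + 1
        if left < 0 or right > len(phones):
            break  # the context would run past the utterance boundary
        sequences.append(tuple(phones[left:right]))
    return sequences


def partitioning_key(sequences: List[Sequence]) -> Sequence:
    """Pick a key that occurs in every training sequence.

    Assumption: the key is the central triphone, i.e. the shortest sequence.
    """
    if not sequences:
        raise ValueError("no training sequences were extracted")
    key = sequences[0]
    for seq in sequences:
        mid = len(seq) // 2
        if seq[mid - 1:mid + 2] != key:
            raise ValueError("key does not occur in every training sequence")
    return key


# Hypothetical phonetic representation of one transcribed word.
phones = ["sil", "ae", "k", "sh", "ih", "n", "sil"]
seqs = extract_training_sequences(phones, index=3)
key = partitioning_key(seqs)
```

Here `seqs` holds three nested sequences centered on "sh", and `key` is the triphone shared by all of them.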
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training acoustic models. Speech data and data identifying a transcription for the speech data are received. A phonetic representation for the transcription is accessed. Training sequences are identified for a particular phone in the phonetic representation. Each of the training sequences includes a different set of contextual phones surrounding the particular phone. A partitioning key is identified based on a sequence of phones that occurs in each of the training sequences. A processing module to which the identified partitioning key is assigned is selected. Data identifying the training sequences and a portion of the speech data are transmitted to the selected processing module.
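One way to picture the selection step is as a deterministic mapping from partitioning key to processing module. The sketch below assumes a hash-based assignment. The abstract says only that a key is assigned to a module, so the CRC32 hash, the modulo scheme, and the names `select_module` and `route` are hypothetical.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

Sequence = Tuple[str, ...]


def select_module(key: Sequence, num_modules: int) -> int:
    """Map a partitioning key to a module index (assumed scheme: CRC32 mod N)."""
    digest = zlib.crc32(" ".join(key).encode("utf-8"))
    return digest % num_modules


def route(items: List[Tuple[Sequence, List[Sequence]]],
          num_modules: int) -> Dict[int, List[Tuple[Sequence, List[Sequence]]]]:
    """Group (key, training sequences) pairs by the module that owns the key."""
    batches = defaultdict(list)
    for key, sequences in items:
        batches[select_module(key, num_modules)].append((key, sequences))
    return dict(batches)


items = [
    (("k", "sh", "ih"), [("k", "sh", "ih"), ("ae", "k", "sh", "ih", "n")]),
    (("k", "sh", "ih"), [("k", "sh", "ih")]),
    (("ae", "k", "sh"), [("ae", "k", "sh")]),
]
batches = route(items, num_modules=4)
```

Because the mapping depends only on the key, every training sequence that shares a key reaches the same module, which can then train the corresponding portion of the acoustic model without consulting the others.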
20 Claims
1. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving speech data and data identifying a transcription for the speech data;
accessing a phonetic representation for the transcription;
extracting training sequences from the phonetic representation for a particular phone in the phonetic representation, each of the training sequences including a different set of contextual phones surrounding the particular phone;
identifying a partitioning key based on a sequence of phones that occurs in each of the training sequences;
selecting, from among a plurality of processing modules, a processing module to which the identified partitioning key is assigned, the processing module being designated to train a portion of an acoustic model that corresponds to the identified partitioning key; and
transmitting, to the selected processing module, (i) data identifying the training sequences and (ii) a portion of the speech data that corresponds to the training sequence that includes the most contextual phones.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17
18. A computer-implemented method, comprising:
receiving speech data and data identifying a transcription for the speech data;
accessing a phonetic representation for the transcription;
extracting training sequences from the phonetic representation for a particular phone in the phonetic representation, each of the training sequences including a different set of contextual phones surrounding the particular phone;
identifying a partitioning key based on a sequence of phones that occurs in each of the training sequences;
selecting, from among a plurality of processing modules, a processing module to which the identified partitioning key is assigned, the processing module being designated to train a portion of an acoustic model that corresponds to the identified partitioning key; and
transmitting, to the selected processing module, (i) data identifying the training sequences and (ii) a portion of the speech data that corresponds to the training sequence that includes the most contextual phones.
Dependent claims: 19
20. A computer storage medium encoded with a computer program, the program comprising instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
receiving speech data and data identifying a transcription for the speech data;
accessing a phonetic representation for the transcription;
extracting training sequences from the phonetic representation for a particular phone in the phonetic representation, each of the training sequences including a different set of contextual phones surrounding the particular phone;
identifying a partitioning key based on a sequence of phones that occurs in each of the training sequences;
selecting, from among a plurality of processing modules, a processing module to which the identified partitioning key is assigned, the processing module being designated to train a portion of an acoustic model that corresponds to the identified partitioning key; and
transmitting, to the selected processing module, (i) data identifying the training sequences and (ii) a portion of the speech data that corresponds to the training sequence that includes the most contextual phones.
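The transmitting step common to the independent claims sends only one portion of speech data: the portion for the training sequence with the most contextual phones. A rough sketch of building such a message follows. The `Payload` type, the per-phone frame alignment `phone_frames`, and the symmetric-context assumption are all hypothetical, and the claims do not specify how speech data is aligned to phones. Under that assumption, the shorter sequences are nested inside the longest one, so its speech portion also covers them.

```python
from dataclasses import dataclass
from typing import List, Tuple

Sequence = Tuple[str, ...]


@dataclass
class Payload:
    sequences: List[Sequence]   # (i) data identifying the training sequences
    speech_frames: List[float]  # (ii) speech data for the longest sequence


def build_payload(sequences: List[Sequence],
                  phone_frames: List[List[float]],
                  center_index: int) -> Payload:
    """Attach only the speech portion aligned with the longest sequence.

    Assumptions: phone_frames[i] holds the frames aligned to phone i, and
    every sequence is centered on the phone at center_index.
    """
    longest = max(sequences, key=len)
    half = len(longest) // 2
    start, end = center_index - half, center_index + half + 1
    frames = [f for phone in phone_frames[start:end] for f in phone]
    return Payload(sequences=list(sequences), speech_frames=frames)


# Hypothetical alignment: one list of frame values per phone in the utterance.
phone_frames = [[0.0], [1.0, 2.0], [3.0], [4.0, 5.0], [6.0], [7.0], [8.0]]
sequences = [("k", "sh", "ih"), ("ae", "k", "sh", "ih", "n")]
payload = build_payload(sequences, phone_frames, center_index=3)
```

Sending one speech portion rather than one per sequence avoids transmitting overlapping audio several times for the same phone instance.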
Specification