Automatic language independent triphone training using a phonetic table
Abstract
A method is provided for training acoustic models for a new target language. Its inputs are: a phonetic table that characterizes the phones used in one or more reference languages with respect to their articulatory properties; a phonetic table that characterizes the phones used in the target language with respect to their articulatory properties; a set of trained monophones for the reference language(s); and a database of sentences in the target language together with its phonetic transcription. Given these inputs, the method fully automates monophone seeding, triphone clustering, and the machine-intensive training steps involved in triphone acoustic training.
10 Claims
1. A method for training acoustic models for a target language comprising the steps of:
a) providing a phonetic table of a reference, which characterizes the phones used in one or more reference languages with respect to their articulatory properties;
b) providing a phonetic table of a target language, which characterizes the phones used in the target language with respect to their articulatory properties;
c) providing a set of trained monophones for each reference language;
d) providing a database of utterances in the target language and phonetic transcription of the utterances in the database; and
e) processing using table correspondence processing methods and phonetic model seeding.
Dependent claims: 2, 3, 4, 5
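The table correspondence and seeding of step e) can be illustrated with a minimal sketch. It assumes each phonetic table maps a phone to a set of articulatory properties and that correspondence is scored by set overlap (Jaccard similarity). The tables, the scoring rule, the model placeholders, and the names `similarity` and `seed_monophones` are all hypothetical and are not taken from the claims.

```python
import copy

# Hypothetical phonetic tables: phone -> set of articulatory properties.
REFERENCE_TABLE = {
    "p": {"consonant", "stop", "bilabial", "unvoiced"},
    "b": {"consonant", "stop", "bilabial", "voiced"},
    "s": {"consonant", "fricative", "alveolar", "unvoiced"},
    "i": {"vowel", "high", "front", "unrounded"},
    "u": {"vowel", "high", "back", "rounded"},
}
TARGET_TABLE = {
    "b": {"consonant", "stop", "bilabial", "voiced"},
    "z": {"consonant", "fricative", "alveolar", "voiced"},
    "e": {"vowel", "mid", "front", "unrounded"},
}
# Placeholder for trained reference monophones (real models would be HMMs).
REFERENCE_MODELS = {phone: {"source": phone} for phone in REFERENCE_TABLE}


def similarity(props_a, props_b):
    """Jaccard overlap between two sets of articulatory properties."""
    return len(props_a & props_b) / len(props_a | props_b)


def seed_monophones(target_table, reference_table, reference_models):
    """Seed each target phone from the closest reference phone.

    Returns a dict: target phone -> (chosen reference phone, copied model).
    """
    seeds = {}
    for phone, props in target_table.items():
        # Sorting makes the choice deterministic when scores tie.
        best = max(
            sorted(reference_table),
            key=lambda ref: similarity(props, reference_table[ref]),
        )
        seeds[phone] = (best, copy.deepcopy(reference_models[best]))
    return seeds


seeds = seed_monophones(TARGET_TABLE, REFERENCE_TABLE, REFERENCE_MODELS)
```

In this toy data the target phone `z`, which has no exact counterpart in the reference table, is seeded from the reference phone that shares the most properties with it. The seeded models would then be retrained on the target-language database of step d).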
6. A method of creating seed context-dependent phonetic models comprising the steps of:
a) obtaining a first set of phonetic models representing a first set of phones;
b) obtaining a transcription of a database in terms of the first set of phones;
c) performing forced alignment of each utterance in a database using the first set of phonetic models and the transcription, resulting in the location of information in the database corresponding to each of the first phones in the database;
d) generating the locations of each context-dependent phone in the database based on the locations of the first phones in the database and the context of the first phones specified by the transcription; and
e) using the information in the database at all locations corresponding to a context-dependent phone constructing a context-dependent phonetic model for that context-dependent phone.
Dependent claims: 7, 8
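Steps d) and e) of claim 6 can be sketched as follows, assuming the forced alignment of step c) has already produced, for each utterance, a time-ordered list of `(phone, start_frame, end_frame)` segments. The triphone label format `left-center+right`, the `sil` boundary context, and the use of a single diagonal Gaussian per context-dependent phone are simplifying assumptions for illustration; the function names are hypothetical.

```python
from collections import defaultdict


def triphone_locations(alignment, boundary="sil"):
    """Turn monophone segments into context-dependent phone segments.

    alignment: time-ordered list of (phone, start_frame, end_frame).
    Returns a list of (triphone_label, start_frame, end_frame).
    """
    locations = []
    for i, (phone, start, end) in enumerate(alignment):
        left = alignment[i - 1][0] if i > 0 else boundary
        right = alignment[i + 1][0] if i + 1 < len(alignment) else boundary
        locations.append((f"{left}-{phone}+{right}", start, end))
    return locations


def seed_context_dependent_models(utterances):
    """Pool frames per triphone and estimate a mean and variance for each.

    utterances: list of (frames, alignment), where frames is a list of
    equal-length feature vectors and alignment is as described above.
    """
    pooled = defaultdict(list)
    for frames, alignment in utterances:
        for label, start, end in triphone_locations(alignment):
            pooled[label].extend(frames[start:end])

    models = {}
    for label, vectors in pooled.items():
        if not vectors:  # skip zero-length segments
            continue
        n, dim = len(vectors), len(vectors[0])
        mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
        var = [sum((v[d] - mean[d]) ** 2 for v in vectors) / n for d in range(dim)]
        models[label] = {"mean": mean, "var": var, "count": n}
    return models


# Toy utterance: four one-dimensional frames, two aligned phones.
frames = [[0.0], [2.0], [4.0], [6.0]]
alignment = [("a", 0, 2), ("b", 2, 4)]
models = seed_context_dependent_models([(frames, alignment)])
```

Because every triphone model is estimated only from frames at its own aligned locations, each seed reflects its context from the start rather than being a copy of the corresponding monophone.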
9. A method for training acoustic models for a target language comprising the steps of:
deriving and encoding providing phonetic tables of one or more reference languages and a phonetic table for a new target language;
providing a speech database collected in the new language and a phonetic transcription of the database;
processing using table correspondence to generate seed monophone phonetic models specific to the new target language;
training the monophone phonetic models automatically using existing known training techniques;
automatically generating accurate seed triphone models specific to the language subsequent to monophone model training;
determining optimal clustering of the triphone phonetic model parameters utilizing the phonetic table information; and
automatically training the triphone phonetic models.
Dependent claims: 10
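One way the phonetic table information could feed the clustering step is by generating candidate groupings of contexts, one per articulatory property, in the spirit of phonetic decision-tree questions. The sketch below only builds those groupings and applies one of them to the left context of a triphone list. How the optimal clustering is selected is not stated in this section and is omitted; the table contents, the `left-center+right` label format, and the function names are assumptions.

```python
# Hypothetical target-language phonetic table.
TABLE = {
    "p": {"consonant", "stop", "unvoiced"},
    "b": {"consonant", "stop", "voiced"},
    "s": {"consonant", "fricative", "unvoiced"},
    "a": {"vowel", "low"},
}


def build_questions(table):
    """Map each articulatory property to the set of phones that have it."""
    questions = {}
    for phone, props in table.items():
        for prop in props:
            questions.setdefault(prop, set()).add(phone)
    return questions


def split_by_left_context(triphones, phone_set):
    """Partition triphone labels by whether the left context is in phone_set."""
    yes, no = [], []
    for label in triphones:
        left = label.split("-", 1)[0]
        (yes if left in phone_set else no).append(label)
    return yes, no


questions = build_questions(TABLE)
stop_left, other_left = split_by_left_context(
    ["p-a+s", "s-a+p", "b-a+b"], questions["stop"]
)
```

Triphones that land in the same partition are candidates for sharing parameters, which is what lets rarely seen contexts borrow statistics from articulatorily similar ones.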
Specification