Method and apparatus for training a multilingual speech model set
Abstract
The invention relates to a method and apparatus for training a multilingual speech model set. The multilingual speech model set generated is suitable for use by a speech recognition system for recognizing spoken utterances in at least two different languages. The invention allows a single speech recognition unit with a single speech model set to perform speech recognition on utterances from two or more languages. The method and apparatus make use of a group of acoustic sub-word units comprising a first subgroup of acoustic sub-word units associated to a first language and a second subgroup of acoustic sub-word units associated to a second language, where the first subgroup and the second subgroup share at least one common acoustic sub-word unit. The method and apparatus also make use of a plurality of letter to acoustic sub-word unit rules sets, each letter to acoustic sub-word unit rules set being associated to a different language. A set of untrained speech models is trained on the basis of a training set comprising speech tokens and their associated labels, in combination with the group of acoustic sub-word units and the plurality of letter to acoustic sub-word unit rules sets. The invention also provides a computer readable storage medium comprising a program element for implementing the method for training a multilingual speech model set.
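As a rough illustration of the data structures the abstract describes, the following minimal Python sketch builds two language subgroups that share some acoustic sub-word units and applies one letter to acoustic sub-word unit rules set per language to an orthographic label. The unit symbols, the single-letter rules, and the names `FIRST_LANG_UNITS`, `SECOND_LANG_UNITS`, `RULES`, and `transcribe` are all hypothetical and chosen only for the example; the abstract does not specify any of them.

```python
# Toy unit inventories (assumed for illustration; not taken from the patent).
FIRST_LANG_UNITS = {"a", "b", "i", "k", "s", "t", "u"}
SECOND_LANG_UNITS = {"a", "b", "i", "k", "s", "y", "R"}

# The overall group of acoustic sub-word units and the units common to both subgroups.
GROUP = FIRST_LANG_UNITS | SECOND_LANG_UNITS
SHARED = FIRST_LANG_UNITS & SECOND_LANG_UNITS

# One letter-to-unit rules set per language. Real rule sets would be
# context dependent; these one-letter mappings are a simplification.
RULES = {
    "first": {"a": "a", "b": "b", "c": "k", "i": "i", "s": "s", "t": "t", "u": "u"},
    "second": {"a": "a", "b": "b", "c": "s", "i": "i", "s": "s", "u": "y", "r": "R"},
}


def transcribe(label, language):
    """Map an orthographic label to a unit sequence under one language's rules."""
    rules = RULES[language]
    return [rules[ch] for ch in label.lower() if ch in rules]


def transcriptions_for_entry(label):
    """Produce one candidate transcription per language for a training label."""
    return {language: transcribe(label, language) for language in RULES}
```

Under these assumptions, a label such as "bus" yields a different unit sequence for each language, while every unit in either sequence still belongs to the single combined group.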
24 Claims
1. A method for generating a multilingual speech model set, said multilingual speech model set being suitable for use in a multilingual speech recognition system, said method comprising:
a) providing a group of acoustic sub-word units comprising;
a first subgroup of acoustic sub-word units associated to a first language with each acoustic sub-word unit associated in said first subgroup having an associated speech model;
a second subgroup of acoustic sub-word units associated to a second language;
said first subgroup and said second subgroup sharing at least one common acoustic sub-word unit;
b) providing a training set comprising a plurality of entries, each entry having a speech token representative of a word and a label being an orthographic representation of the word;
c) providing a set of untrained speech models, said set of untrained speech models having at least a first untrained speech model, further comprising, (i) providing said first untrained speech model by initializing at least one acoustic sub-word unit of said second subgroup with said associated speech model of at least one acoustic sub-word unit of said first subgroup that is acoustically similar to said at least one acoustic sub-word unit of said second subgroup;
d) training the set of untrained speech models by utilizing said training set, a plurality of letter to acoustic sub-word unit rules sets and said group of acoustic sub-word units to derive the multilingual speech model set, each letter to acoustic sub-word unit rules set being associated to a different language. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
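Step c)(i) of claim 1 can be pictured with a minimal Python sketch, given here only as an illustration under stated assumptions. Each speech model is reduced to a dictionary holding a mean vector, and acoustic similarity is approximated by Euclidean distance between hand-picked feature vectors. Both simplifications, the reuse of a first-subgroup model for a shared unit, and the names `trained_models`, `features`, `nearest_first_unit`, and `initialize_untrained` are assumptions; the claim does not state how similarity is measured or how models are represented.

```python
import copy
import math

# Speech models already associated with first-subgroup units (toy stand-ins).
trained_models = {
    "a": {"mean": [1.0, 0.0]},
    "i": {"mean": [0.0, 1.0]},
    "u": {"mean": [-1.0, 0.0]},
}

# Hypothetical feature vectors used only to judge acoustic similarity.
features = {
    "a": (1.0, 0.0),
    "i": (0.0, 1.0),
    "u": (-1.0, 0.2),
    "y": (-0.8, 0.6),
}


def nearest_first_unit(unit, first_units):
    """Return the first-subgroup unit whose features lie closest to `unit`."""
    return min(first_units, key=lambda u: math.dist(features[unit], features[u]))


def initialize_untrained(second_units, trained):
    """Seed a model for each second-subgroup unit from a first-subgroup model."""
    untrained = {}
    for unit in second_units:
        # Shared units reuse their own model; other units borrow the nearest one.
        source = unit if unit in trained else nearest_first_unit(unit, sorted(trained))
        # Deep copy so that later training does not alter the source model.
        untrained[unit] = copy.deepcopy(trained[source])
    return untrained
```

In this toy setup, the second-language unit "y" is seeded from the model for "u", the closest first-language unit under the assumed features, and then becomes one of the untrained speech models passed to step d).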
11. An apparatus for generating a multilingual speech model set, said multilingual speech model set being suitable for use in a multilingual speech recognition system, said apparatus comprising:
a) a first memory unit for storing acoustic data elements representative of a group of acoustic sub-word units comprising;
a first subgroup of acoustic sub-word units associated to a first language with each acoustic sub-word unit having an associated acoustic data element;
a second subgroup of acoustic sub-word units associated to a second language;
said first subgroup and said second subgroup sharing at least one common acoustic sub-word unit;
b) a second memory unit for storing a plurality of letter to acoustic sub-word unit rules sets, each letter to acoustic sub-word unit rules set being associated to a different language;
c) a third memory unit suitable for storing a training set comprising a plurality of entries, each entry having a speech token representative of a word and a label being an orthographic representation of the word;
d) a fourth memory unit for storing a set of untrained speech models, said set of untrained speech models comprising at least one untrained speech model, said one untrained speech model generated by initializing at least one acoustic sub-word unit of said second subgroup with said associated acoustic data element of at least one acoustic sub-word unit of said first subgroup that is acoustically similar to said at least one acoustic sub-word unit of said second subgroup;
e) processing unit coupled to;
said first memory unit;
said second memory unit;
said third memory unit;
said fourth memory unit;
said processing unit being operative for training the set of untrained speech models by utilizing said training set, said plurality of letter to acoustic sub-word unit rules sets and said group of acoustic sub-word units to derive the multilingual speech model set. - View Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)
21. A computer readable storage medium containing a program element suitable for use on a computer having a memory, said memory being suitable for storing:
a) a group of acoustic sub-word units comprising;
a first subgroup of acoustic sub-word units associated to a first language with each acoustic sub-word unit having an associated speech model;
a second subgroup of acoustic sub-word units associated to a second language;
said first subgroup and said second subgroup sharing at least one common acoustic sub-word unit;
b) a plurality of letter to acoustic sub-word unit rules sets, each letter to acoustic sub-word unit rules set being associated to a different language;
c) a training set comprising a plurality of entries, each entry having a speech token representative of a word and a label being an orthographic representation of the word;
d) a set of untrained speech models, said set of untrained speech models comprising at least one untrained speech model, said one untrained speech model generated by initializing at least one acoustic sub-word unit of said second subgroup with said associated speech model of at least one acoustic sub-word unit of said first subgroup that is acoustically similar to said at least one acoustic sub-word unit of said second subgroup;
said program element being operative for training the set of untrained speech models by utilizing said training set, said plurality of letter to acoustic sub-word unit rules sets and said group of acoustic sub-word units to derive a multilingual speech model set. - View Dependent Claims (22)
23. An apparatus for generating a multilingual speech model set, said multilingual speech model set being suitable for use in a multilingual speech recognition system, said apparatus comprising:
a) means for storing;
i) acoustic data elements representative of a group of acoustic sub-word units comprising;
a first subgroup of acoustic sub-word units associated to a first language with each acoustic sub-word unit having an associated acoustic data element;
a second subgroup of acoustic sub-word units associated to a second language;
said first subgroup and said second subgroup sharing at least one common acoustic sub-word unit;
ii) a plurality of letter to acoustic sub-word unit rules sets, each letter to acoustic sub-word unit rules set being associated to a different language;
iii) a training set comprising a plurality of entries, each entry having a speech token representative of a word and a label being an orthographic representation of the word;
iv) a set of untrained speech models, said set of untrained speech models comprising at least one untrained speech model, said one untrained speech model generated by initializing at least one acoustic sub-word unit of said second subgroup with said associated acoustic data element of at least one acoustic sub-word unit of said first subgroup that is acoustically similar to said at least one acoustic sub-word unit of said second subgroup;
means for training the set of untrained speech models by utilizing said training set, said plurality of letter to acoustic sub-word unit rules sets and said group of acoustic sub-word units to derive the multilingual speech model set. - View Dependent Claims (24)
Specification