Method of constructing model of recognizing English pronunciation variation
First Claim
1. A method of constructing a model of recognizing English pronunciation variations, applied to a computer connected to a non-transitory recording medium, for recognizing English pronunciations with intonations influenced by different non-English native languages, the method at least comprising:
1) providing a plurality of English expressions and at least one phonetic alphabet corresponding to each of the English expressions by the non-transitory recording medium, and collecting a plurality of pieces of corresponding sound information according to the phonetic alphabet of each of the English expressions by the computer;
2) mapping phonetic alphabets of the non-English native language and English to a plurality of international phonetic alphabets (IPAs) by the computer, so as to form a plurality of pronunciation models, wherein the computer forms each pronunciation model by:
2-1) collecting a plurality of phonetic alphabet pronunciations directed to one of the IPAs, and converting each of the phonetic alphabet pronunciations into a corresponding characteristic value;
2-2) forming the characteristic values into a value group and calculating a grouping threshold value corresponding to the characteristic values;
2-3) calculating a mean value of the value group;
2-4) obtaining a first characteristic value from the value group which is away from the mean value by a maximum numerical distance;
2-5) calculating a second characteristic value in the value group which is away from the first characteristic value by a maximum numerical distance;
2-6) calculating numerical distances, wherein a first distance is calculated between each characteristic value and the first characteristic value and a second distance is calculated between each characteristic value and the second characteristic value, and forming two value groups by the first distances and the second distances, one of the two value groups containing the characteristic values close to the first characteristic value and the other one of the two value groups containing the characteristic values close to the second characteristic value, respectively;
2-7) obtaining a within-group distance and a between-group distance of the two value groups, so as to calculate a grouping standard; and
2-8) determining, through comparison, whether the grouping standard is higher than the grouping threshold value; if yes, calculating a mean value of each of the two value groups and then repeating step 2-4) to step 2-8) for each of the two value groups respectively; and if no, obtaining each value group of the pronunciation model that the computer wants to form;
3) converting the sound information of each of the English expressions by using the pronunciation models, and constructing a pronunciation variation network corresponding to the English expression with reference to the phonetic alphabet of the English expression by the computer, so as to detect whether each of the English expressions has a pronunciation variation path; and
4) summarizing each of the pronunciation variation paths to form a plurality of pronunciation variation rules by the computer.
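Steps 2-3) to 2-8) describe a recursive two-way split of a value group. The following is a minimal Python sketch of that loop, not the patented implementation. It makes several assumptions that the claim does not state: characteristic values are plain scalars (real acoustic features would be vectors), the grouping standard is taken to be the between-group distance divided by the average within-group distance, the threshold is passed in rather than calculated as in step 2-2), and a group whose split does not exceed the threshold is kept unsplit. The function names are hypothetical.

```python
from statistics import fmean


def split_once(values):
    """Steps 2-3) to 2-6): split one value group around two far-apart seeds."""
    mean = fmean(values)                                  # 2-3) mean of the group
    first = max(values, key=lambda v: abs(v - mean))      # 2-4) farthest from the mean
    second = max(values, key=lambda v: abs(v - first))    # 2-5) farthest from `first`
    near_first, near_second = [], []
    for v in values:                                      # 2-6) assign to the closer seed
        if abs(v - first) <= abs(v - second):
            near_first.append(v)
        else:
            near_second.append(v)
    return near_first, near_second


def grouping_standard(a, b):
    """Step 2-7), ASSUMED form: between-group / average within-group distance."""
    mean_a, mean_b = fmean(a), fmean(b)
    between = abs(mean_a - mean_b)
    within = (sum(abs(v - mean_a) for v in a)
              + sum(abs(v - mean_b) for v in b)) / (len(a) + len(b))
    return float("inf") if within == 0 else between / within


def cluster(values, threshold):
    """Step 2-8): keep splitting while the grouping standard exceeds the threshold."""
    if len(set(values)) < 2:          # nothing left to split
        return [list(values)]
    a, b = split_once(values)
    if grouping_standard(a, b) > threshold:
        return cluster(a, threshold) + cluster(b, threshold)
    return [list(values)]             # ASSUMPTION: a rejected split leaves the group whole


groups = cluster([1.0, 1.5, 0.5, 9.0, 9.5, 8.5], threshold=10.0)
```

With these made-up values the first split separates the low values from the high ones, and neither half is split further because its grouping standard stays below the assumed threshold of 10.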
Abstract
A method of constructing a model for recognizing English pronunciation variations is used to recognize English pronunciations with different intonations influenced by non-English native languages. The method includes collecting a plurality of pieces of sound information corresponding to English expressions; mapping the phonetic alphabets of a region's non-English native language and of English to International Phonetic Alphabets (IPAs), so as to form a plurality of pronunciation models; converting the sound information with the pronunciation models to form a pronunciation variation network for the corresponding English expressions, thereby detecting whether the English expressions have pronunciation variation paths; and finally summarizing the pronunciation variation paths to form a plurality of pronunciation variation rules. Furthermore, the pronunciation variations are represented by phonetic features to infer possible pronunciation variation rules, which are stored to form pronunciation variation models. Constructing the pronunciation variation models broadens the applicability of an English recognition system and improves the accuracy of voice recognition.
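The detection of variation paths and their summarization into rules can be illustrated with a small Python sketch. It is a simplified stand-in, not the pronunciation variation network described here: it aligns the expected phone sequence of an expression with the recognized one using the standard library's `difflib.SequenceMatcher` and counts the differing segments. The function names and the sample phone sequences are hypothetical and chosen only for illustration; they are not data from the patent.

```python
from collections import Counter
from difflib import SequenceMatcher


def variation_path(canonical, observed):
    """Return (expected phones, recognized phones) pairs where the sequences differ."""
    matcher = SequenceMatcher(a=canonical, b=observed, autojunk=False)
    return [
        (tuple(canonical[i1:i2]), tuple(observed[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"             # keep substitutions, deletions, insertions
    ]


def summarize_rules(samples):
    """Count how often each variation occurs across all (expected, recognized) pairs."""
    counts = Counter()
    for canonical, observed in samples:
        counts.update(variation_path(canonical, observed))
    return counts


# Hypothetical samples: expected phones vs. phones recognized from a speaker.
samples = [
    (["θ", "ɪ", "ŋ", "k"], ["s", "ɪ", "ŋ", "k"]),
    (["θ", "r", "i"], ["s", "r", "i"]),
    (["ð", "ɪ", "s"], ["d", "ɪ", "s"]),
]
rules = summarize_rules(samples)
```

An expression with an empty path has no pronunciation variation; variations that recur across many expressions are the candidates for a rule.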
18 Claims
1. A method of constructing a model of recognizing English pronunciation variations, applied to a computer connected to a non-transitory recording medium, for recognizing English pronunciations with intonations influenced by different non-English native languages, the method at least comprising:
1) providing a plurality of English expressions and at least one phonetic alphabet corresponding to each of the English expressions by the non-transitory recording medium, and collecting a plurality of pieces of corresponding sound information according to the phonetic alphabet of each of the English expressions by the computer;
2) mapping phonetic alphabets of the non-English native language and English to a plurality of international phonetic alphabets (IPAs) by the computer, so as to form a plurality of pronunciation models, wherein the computer forms each pronunciation model by:
2-1) collecting a plurality of phonetic alphabet pronunciations directed to one of the IPAs, and converting each of the phonetic alphabet pronunciations into a corresponding characteristic value;
2-2) forming the characteristic values into a value group and calculating a grouping threshold value corresponding to the characteristic values;
2-3) calculating a mean value of the value group;
2-4) obtaining a first characteristic value from the value group which is away from the mean value by a maximum numerical distance;
2-5) calculating a second characteristic value in the value group which is away from the first characteristic value by a maximum numerical distance;
2-6) calculating numerical distances, wherein a first distance is calculated between each characteristic value and the first characteristic value and a second distance is calculated between each characteristic value and the second characteristic value, and forming two value groups by the first distances and the second distances, one of the two value groups containing the characteristic values close to the first characteristic value and the other one of the two value groups containing the characteristic values close to the second characteristic value, respectively;
2-7) obtaining a within-group distance and a between-group distance of the two value groups, so as to calculate a grouping standard; and
2-8) determining, through comparison, whether the grouping standard is higher than the grouping threshold value; if yes, calculating a mean value of each of the two value groups and then repeating step 2-4) to step 2-8) for each of the two value groups respectively; and if no, obtaining each value group of the pronunciation model that the computer wants to form;
3) converting the sound information of each of the English expressions by using the pronunciation models, and constructing a pronunciation variation network corresponding to the English expression with reference to the phonetic alphabet of the English expression by the computer, so as to detect whether each of the English expressions has a pronunciation variation path; and
4) summarizing each of the pronunciation variation paths to form a plurality of pronunciation variation rules by the computer.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
10. A non-transitory recording medium for constructing a model of recognizing English pronunciation variations, recording computer-readable computer program codes, used for recognizing English pronunciations with different intonations influenced by non-English native languages, the non-transitory recording medium being encoded with computer program codes which are executed by a computer to perform a method of constructing a pronunciation variation model comprising:
1) providing a plurality of English expressions and at least one phonetic alphabet corresponding to each of the English expressions by the non-transitory recording medium, and collecting a plurality of pieces of corresponding sound information according to the phonetic alphabet of each of the English expressions by the computer;
2) mapping the phonetic alphabets of the non-English native language and English to a plurality of international phonetic alphabets (IPAs) by the computer, so as to form a plurality of pronunciation models, wherein the computer forms each pronunciation model by:
2-1) collecting a plurality of phonetic alphabet pronunciations directed to one of the IPAs, and converting each of the phonetic alphabet pronunciations into a corresponding characteristic value;
2-2) forming the characteristic values into a value group and calculating a grouping threshold value corresponding to the characteristic values;
2-3) calculating a mean value of the value group;
2-4) obtaining a first characteristic value from the value group which is away from the mean value by a maximum numerical distance;
2-5) calculating a second characteristic value in the value group which is away from the first characteristic value by a maximum numerical distance;
2-6) calculating numerical distances, wherein a first distance is calculated between each characteristic value and the first characteristic value and a second distance is calculated between each characteristic value and the second characteristic value, and forming two value groups by the first distances and the second distances, one of the two value groups containing the characteristic values close to the first characteristic value and the other one of the two value groups containing the characteristic values close to the second characteristic value, respectively;
2-7) obtaining a within-group distance and a between-group distance of the two value groups, so as to calculate a grouping standard; and
2-8) determining, through comparison, whether the grouping standard is higher than the grouping threshold value; if yes, calculating a mean value of each of the two value groups and then repeating step 2-4) to step 2-8) for each of the two value groups respectively; and if no, obtaining each value group of the pronunciation model that the computer wants to form;
3) converting the sound information of each of the English expressions by using the pronunciation models, and constructing a pronunciation variation network corresponding to the English expression with reference to the phonetic alphabet of the English expression by the computer, so as to detect whether the English expression has a pronunciation variation path; and
4) summarizing each of the pronunciation variation paths to form a plurality of pronunciation variation rules by the computer.
- View Dependent Claims (11, 12, 13, 14, 15, 16, 17, 18)
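Step 2) and step 2-1) rely on relating the symbols of two phonetic alphabets to shared IPA symbols, so that pronunciations from both languages can be pooled per IPA symbol. The sketch below shows one possible way to build such pools in Python. The claims do not name specific alphabets; ARPAbet for English and Zhuyin for Mandarin are used purely as assumed examples, the tables are deliberately partial, and `pool_by_ipa` is a hypothetical helper name.

```python
from collections import defaultdict

# ASSUMED, partial symbol tables for illustration only.
ARPABET_TO_IPA = {"P": "p", "M": "m", "F": "f", "S": "s"}
ZHUYIN_TO_IPA = {"ㄅ": "p", "ㄇ": "m", "ㄈ": "f", "ㄙ": "s"}


def pool_by_ipa(tables):
    """Group (alphabet, symbol) pairs under the IPA symbol they map to."""
    pools = defaultdict(list)
    for alphabet, table in tables.items():
        for symbol, ipa in table.items():
            pools[ipa].append((alphabet, symbol))
    return dict(pools)


pools = pool_by_ipa({"arpabet": ARPABET_TO_IPA, "zhuyin": ZHUYIN_TO_IPA})
```

Each pool lists the symbols whose recorded pronunciations would be collected for one IPA symbol before being converted into characteristic values and grouped.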
Specification