Method of constructing model of recognizing English pronunciation variation

US 8,000,964 B2
Filed: 02/21/2008
Issued: 08/16/2011
Est. Priority Date: 12/12/2007
Status: Expired due to Fees

Abstract

A method of constructing a model of recognizing English pronunciation variations is used to recognize English pronunciations with different intonations influenced by non-English native languages. The method includes collecting a plurality of sound information corresponding to English expressions; mapping the phonetic alphabets of the non-English native language of a region and of English to International Phonetic Alphabets (IPAs), so as to form a plurality of pronunciation models; converting the sound information with the pronunciation models to form a pronunciation variation network of the corresponding English expressions, thereby detecting whether the English expressions have pronunciation variation paths; and finally summarizing the pronunciation variation paths to form a plurality of pronunciation variation rules. Furthermore, the pronunciation variations are represented by phonetic features to infer possible pronunciation variation rules, which are stored to form pronunciation variation models. The construction of the pronunciation variation models enhances the applicability of an English recognition system and the accuracy of voice recognition.
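
The detection and summarizing steps described in the abstract can be illustrated with a small sketch. It is not the patented procedure: instead of decoding audio through a pronunciation variation network, it assumes the spoken phones have already been recognized and uses plain sequence alignment (Python's `difflib.SequenceMatcher`) as a stand-in for finding a variation path. The function names `variation_path` and `summarize_rules`, the `min_count` cutoff, and the sample phone sequences are all hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def variation_path(canonical, observed):
    """Align the reference phones with the observed phones and list the
    insertions, deletions and substitutions that separate them."""
    path = []
    matcher = SequenceMatcher(a=canonical, b=observed, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            path.append(("substitution", tuple(canonical[i1:i2]), tuple(observed[j1:j2])))
        elif tag == "delete":
            path.append(("deletion", tuple(canonical[i1:i2]), ()))
        elif tag == "insert":
            path.append(("insertion", (), tuple(observed[j1:j2])))
    return path


def summarize_rules(samples, min_count=2):
    """Count every variation seen across (canonical, observed) samples and
    keep those seen at least min_count times (an assumed cutoff)."""
    counts = Counter(
        step
        for canonical, observed in samples
        for step in variation_path(canonical, observed)
    )
    return {rule: n for rule, n in counts.items() if n >= min_count}


# Hypothetical data: "think" pronounced with /s/ twice, "cat" with a dropped /t/ once.
samples = [
    (["θ", "ɪ", "ŋ", "k"], ["s", "ɪ", "ŋ", "k"]),
    (["θ", "ɪ", "ŋ", "k"], ["s", "ɪ", "ŋ", "k"]),
    (["k", "æ", "t"], ["k", "æ"]),
]
rules = summarize_rules(samples)
```

With these samples only the θ-to-s substitution survives the cutoff, which is the sense in which repeated variation paths are condensed into a rule.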

18 Claims

1. A method of constructing a model of recognizing English pronunciation variations, applying to a computer connected to a non-transitory recording medium, for recognizing English pronunciations with intonations influenced by different non-English native languages, the method at least comprising:
- 1) providing a plurality of English expressions and at least one phonetic alphabet corresponding to each of the English expressions by the non-transitory recording medium, and collecting a plurality of corresponding sound information according to the phonetic alphabet of each of the English expressions by the computer;
  
  2) corresponding phonetic alphabets of the non-English native language and English to a plurality of international phonetic alphabets (IPAs) by the computer, so as to form a plurality of pronunciation models, wherein the computer forms each pronunciation model;
  
  2-1) collecting a plurality of phonetic alphabet pronunciations directed to one of the IPAs, and converting each of the phonetic alphabet pronunciations into a corresponding characteristic value;
  
  2-2) forming the characteristic values into a value group and calculating a grouping threshold value corresponding to the characteristic values;
  
  2-3) calculating a mean value of the value group;
  
  2-4) obtaining a first characteristic value from the value group which is away from the mean value by a maximum numerical distance;
  
  2-5) calculating a second characteristic value in the value group which is away from the first characteristic value by a maximum numerical distance;
  
  2-6) calculating numerical distances, wherein a first distance is calculated between each characteristic value and the first characteristic value and a second distance is calculated between each characteristic value and the second characteristic value, and forming two value groups by the first distances and the second distances, one of the two value groups containing the characteristic values close to the first characteristic value and the other one of the two value groups containing the characteristic values close to the second characteristic value, respectively;
  
  2-7) obtaining a within-group distance and a between-group distance of the two value groups, so as to calculate a grouping standard; and
  
  2-8) determining whether the grouping standard is higher than the grouping threshold value through comparison; if yes, calculating a mean value of each of the two value groups and then repeating step 2-4) to step 2-8) for each of the two value groups respectively; and if no, obtaining each value group of the pronunciation model that the computer wants to form;
  
  3) converting the sound information of each of the English expressions by using the pronunciation models, and constructing a pronunciation variation network corresponding to the English expression with reference to the phonetic alphabet of the English expression by the computer, so as to detect whether each of the English expressions has a pronunciation variation path; and
  
  4) summarizing each of the pronunciation variation paths to form a plurality of pronunciation variation rules by the computer.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9)
  - 2. The method of constructing a model of recognizing English pronunciation variations as claimed in claim 1, wherein the characteristic values of at least one value group of the pronunciation model correspond to the phonetic alphabets of the non-English native language.
  - 3. The method of constructing a model of recognizing English pronunciation variations as claimed in claim 1, wherein the characteristic values of at least one value group of the pronunciation model correspond to the phonetic alphabets of English.
  - 4. The method of constructing a model of recognizing English pronunciation variations as claimed in claim 1, wherein the phonetic alphabet pronunciation is transformed into the characteristic value by using a Fourier Transform equation.
  - 5. The method of constructing a model of recognizing English pronunciation variations as claimed in claim 1, wherein the step of constructing a pronunciation variation network corresponding to the English expression comprises:
    - setting the phonetic alphabet of the English expression as a reference;
      
      detecting whether an insertion pronunciation variation exists in each pronunciation of the phonetic alphabets of English;
      
      detecting whether a deletion pronunciation variation exists between each phonetic alphabet and its next phonetic alphabet;
      
      detecting a substitution pronunciation variation corresponding to each phonetic alphabet; and
      
      constructing the pronunciation variation network.
  - 6. The method of constructing a model of recognizing English pronunciation variations as claimed in claim 5, wherein the step of detecting a substitution pronunciation variation corresponding to each phonetic alphabet comprises:
    - obtaining a pronunciation type in the IPA for each phonetic alphabet; and
      
      using at least one IPA with the same pronunciation type as the substitution pronunciation variation of the phonetic alphabet.
  - 7. The method of constructing a model of recognizing English pronunciation variations as claimed in claim 5, wherein the step of detecting a substitution pronunciation variation corresponding to each phonetic alphabet comprises:
    - collecting pronunciations of the IPA;
      
      calculating pronunciation probability for each IPA, so as to establish a phone confusion matrix;
      
      obtaining at least one IPA in a pronunciation probability range based on the phonetic alphabet; and
      
      setting the IPA in the pronunciation probability range as the substitution pronunciation variation of the phonetic alphabet.
  - 8. The method of constructing a model of recognizing English pronunciation variations as claimed in claim 1, further comprising a step of analyzing the English expression to obtain an inference rule according to variation of the phonetic alphabet.
  - 9. The method of constructing a model of recognizing English pronunciation variations as claimed in claim 8, further comprising:
    - corresponding the phonetic alphabets to pronunciation characteristics of linguistics;
      
      analyzing the pronunciation variation network of the English expression, so as to obtain the inference rule; and
      
      determining whether the phonetic alphabets having the same pronunciation characteristic have the same inference rule.
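
Steps 2-3) to 2-8) of claim 1 describe a recursive two-way split of the characteristic values. The sketch below is one possible reading under stated assumptions, not the patentee's implementation: characteristic values are taken to be numeric tuples, distance is Euclidean, the grouping standard is assumed to be the between-group distance divided by the mean within-group distance (the claim does not give a formula), and the grouping threshold is passed in as a parameter rather than calculated. When the standard does not exceed the threshold, the sketch assumes the group is kept unsplit. The names `split`, `grouping_standard` and `cluster` are hypothetical.

```python
import math
from statistics import fmean


def mean_point(points):
    """Step 2-3: component-wise mean of a value group."""
    return tuple(fmean(axis) for axis in zip(*points))


def split(points):
    """Steps 2-4 to 2-6: pick two far-apart seeds and assign each value to the nearer one."""
    center = mean_point(points)
    first = max(points, key=lambda p: math.dist(p, center))   # farthest from the mean
    second = max(points, key=lambda p: math.dist(p, first))   # farthest from the first seed
    near_first, near_second = [], []
    for p in points:
        if math.dist(p, first) <= math.dist(p, second):
            near_first.append(p)
        else:
            near_second.append(p)
    return near_first, near_second


def grouping_standard(g1, g2):
    """Step 2-7 (assumed formula): between-group distance over mean within-group distance."""
    m1, m2 = mean_point(g1), mean_point(g2)
    between = math.dist(m1, m2)
    within = fmean([math.dist(p, m1) for p in g1] + [math.dist(p, m2) for p in g2])
    return math.inf if within == 0 else between / within


def cluster(points, threshold):
    """Step 2-8: keep splitting while the grouping standard exceeds the threshold."""
    if len(points) < 2:
        return [points]
    g1, g2 = split(points)
    if not g1 or not g2 or grouping_standard(g1, g2) <= threshold:
        return [points]  # assumed: a rejected split leaves the group as a final value group
    return cluster(g1, threshold) + cluster(g2, threshold)


# Hypothetical two-dimensional characteristic values forming two tight clumps.
values = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
groups = cluster(values, threshold=10.0)
```

Each split produces two non-empty, strictly smaller groups, so the recursion always terminates. How aggressively it subdivides depends entirely on the threshold and on the assumed form of the grouping standard.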

10. A non-transitory recording medium of constructing a model of recognizing English pronunciation variations, recording computer-readable computer program codes, used for recognizing English pronunciations with different intonations influenced by non-English native languages, the non-transitory recording medium being encoded with computer program codes which are executed by a computer to perform a method of constructing a pronunciation variation model comprising:
- 1) providing a plurality of English expressions and at least one phonetic alphabet corresponding to each of the English expressions by the non-transitory recording medium, and collecting a plurality of corresponding sound information according to the phonetic alphabet of each of the English expressions by the computer;
  
  2) corresponding the phonetic alphabets of the non-English native language and English to a plurality of international phonetic alphabets (IPAs) by the computer, so as to form a plurality of pronunciation models, wherein the computer forms each pronunciation model;
  
  2-1) collecting a plurality of phonetic alphabet pronunciations directed to one of the IPAs, and converting each of the phonetic alphabet pronunciations into a corresponding characteristic value;
  
  2-2) forming the characteristic values into a value group and calculating a grouping threshold value corresponding to the characteristic values;
  
  2-3) calculating a mean value of the value group;
  
  2-4) obtaining a first characteristic value from the value group which is away from the mean value by a maximum numerical distance;
  
  2-5) calculating a second characteristic value in the value group which is away from the first characteristic value by a maximum numerical distance;
  
  2-6) calculating numerical distances, wherein a first distance is calculated between each characteristic value and the first characteristic value and a second distance is calculated between each characteristic value and the second characteristic value, and forming two value groups by the first distances and the second distances, one of the two value groups containing the characteristic values close to the first characteristic value and the other one of the two value groups containing the characteristic values close to the second characteristic value, respectively;
  
  2-7) obtaining a within-group distance and a between-group distance of the two value groups, so as to calculate a grouping standard; and
  
  2-8) determining whether the grouping standard is higher than the grouping threshold value through comparison; if yes, calculating a mean value of each of the two value groups and then repeating step 2-4) to step 2-8) for each of the two value groups respectively; and if no, obtaining each value group of the pronunciation model that the computer wants to form;
  
  3) converting the sound information of each of the English expressions by using the pronunciation models, and constructing a pronunciation variation network corresponding to the English expression with reference to the phonetic alphabet of the English expression by the computer, so as to detect whether the English expression has a pronunciation variation path; and
  
  4) summarizing each of the pronunciation variation paths to form a plurality of pronunciation variation rules by the computer.
- Dependent claims (11, 12, 13, 14, 15, 16, 17, 18)
  - 11. The non-transitory recording medium as claimed in claim 10, wherein the characteristic values of at least one value group of the pronunciation model correspond to the phonetic alphabets of the non-English native language.
  - 12. The non-transitory recording medium as claimed in claim 10, wherein the characteristic values of at least one value group of the pronunciation model correspond to the phonetic alphabets of English.
  - 13. The non-transitory recording medium as claimed in claim 10, wherein the phonetic alphabet pronunciation is transformed into the characteristic value by using a Fourier Transform equation.
  - 14. The non-transitory recording medium as claimed in claim 10, wherein the step of constructing a pronunciation variation network corresponding to the English expression comprises:
    - setting the phonetic alphabet of the English expression as a reference;
      
      detecting whether an insertion pronunciation variation exists in each pronunciation of the phonetic alphabets of English;
      
      detecting whether a deletion pronunciation variation exists between each phonetic alphabet and its next phonetic alphabet;
      
      detecting a substitution pronunciation variation corresponding to each phonetic alphabet; and
      
      constructing the pronunciation variation network.
  - 15. The non-transitory recording medium as claimed in claim 14, wherein the step of detecting a substitution pronunciation variation corresponding to each phonetic alphabet comprises:
    - obtaining a pronunciation type in the IPA for each phonetic alphabet; and
      
      using at least one IPA with the same pronunciation type as the substitution pronunciation variation of the phonetic alphabet.
  - 16. The non-transitory recording medium as claimed in claim 14, wherein the step of detecting a substitution pronunciation variation corresponding to each phonetic alphabet comprises:
    - collecting pronunciations of the IPA;
      
      calculating pronunciation probability for each IPA, so as to establish a phone confusion matrix;
      
      obtaining at least one IPA in a pronunciation probability range based on the phonetic alphabet; and
      
      setting the IPA in the pronunciation probability range as the substitution pronunciation variation of the phonetic alphabet.
  - 17. The non-transitory recording medium as claimed in claim 10, further comprising a step of analyzing the English expression to obtain an inference rule according to the variation of the phonetic alphabet.
  - 18. The non-transitory recording medium as claimed in claim 17, wherein the step of detecting a substitution pronunciation variation corresponding to each phonetic alphabet comprises:
    - corresponding the phonetic alphabets to pronunciation characteristics of linguistics;
      
      analyzing the pronunciation variation network of the English expression, so as to obtain the inference rule; and
      
      determining whether the phonetic alphabets having the same pronunciation characteristic have the same inference rule.
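
The phone confusion matrix of claims 7 and 16 can be illustrated with a short sketch. It assumes the input is a list of (intended phone, recognized phone) pairs and that the "pronunciation probability range" is a simple closed interval. Neither detail is specified in the claims, and the names `confusion_matrix` and `substitution_candidates`, the bounds `low` and `high`, and the sample counts are hypothetical.

```python
from collections import Counter, defaultdict


def confusion_matrix(pairs):
    """Estimate P(recognized phone | intended phone) from observed pairs."""
    counts = defaultdict(Counter)
    for intended, recognized in pairs:
        counts[intended][recognized] += 1
    matrix = {}
    for intended, row in counts.items():
        total = sum(row.values())
        matrix[intended] = {recognized: n / total for recognized, n in row.items()}
    return matrix


def substitution_candidates(matrix, phone, low, high=1.0):
    """Return the other phones whose probability for `phone` lies in [low, high]."""
    row = matrix.get(phone, {})
    return sorted(
        recognized
        for recognized, prob in row.items()
        if recognized != phone and low <= prob <= high
    )


# Hypothetical counts: /θ/ recognized as itself 6 times, as /s/ 3 times, as /f/ once.
pairs = [("θ", "θ")] * 6 + [("θ", "s")] * 3 + [("θ", "f")]
matrix = confusion_matrix(pairs)
candidates = substitution_candidates(matrix, "θ", low=0.2)
```

With a lower bound of 0.2 only /s/ qualifies as a substitution variation of /θ/ in this data. Lowering the bound admits rarer confusions such as /f/.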

Current Assignee: Institute For Information Industry
Original Assignee: Institute For Information Industry
Inventors: Wu, Chung-Hsien; Hsieh, Chia-Hsin; Huang, Chien-Lin; Lee, Kuei-Ming; Hsu, Chin-Shun; Chai, Shen-Yen; Lin, Jui-Tang
Primary Examiner(s): Wozniak, James S.
Assistant Examiner(s): He, Jialong
Application Number: US12/034,842
Publication Number: US 20090157402A1
Time in Patent Office: 1,272 Days
Field of Search: 704/236-240, 704/251
US Class Current: 704/251
CPC Class Codes: G10L 15/187 Phonemic context, e.g. pron...
