PRONUNCIATION DIAGNOSIS DEVICE, PRONUNCIATION DIAGNOSIS METHOD, RECORDING MEDIUM, AND PRONUNCIATION DIAGNOSIS PROGRAM

US 20090305203A1
Filed: 09/29/2006
Published: 12/10/2009
Est. Priority Date: 09/29/2005
Status: Abandoned Application

Abstract

A pronunciation diagnosis device according to the present invention diagnoses the pronunciation of a speaker using articulatory attribute data including articulatory attribute values corresponding to an articulatory attribute of a desirable pronunciation for each phoneme in each audio language system, the articulatory attribute including any one condition of the tongue in the oral cavity, the lips, the vocal cord, the uvula, the nasal cavity, the teeth, and the jaws, or a combination including at least one of the conditions of the articulatory organs; the way of applying force in the conditions of articulatory organs; and a combination of breathing conditions; extracting an acoustic feature from an audio signal generated by a speaker, the acoustic feature being a frequency feature quantity, a sound volume, and a duration time, a rate of change or change pattern thereof, and at least one combination thereof; estimating an attribute value associated with the articulatory attribute on the basis of the extracted acoustic feature; and comparing the estimated attribute value with the desirable articulatory attribute data.
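
The flow described in the abstract (extract an acoustic feature, estimate an attribute value from it, compare that value with the desirable one) might be sketched as follows. This is only an illustrative sketch: the names `AcousticFeature`, `extract_feature`, `estimate_attribute`, and `diagnose`, the zero-crossing rate as a stand-in for a frequency feature quantity, the 8000.0 scaling constant, and the tolerance are all assumptions and do not come from the application.

```python
import math
from dataclasses import dataclass


@dataclass
class AcousticFeature:
    zero_crossing_rate: float  # assumed stand-in for a "frequency feature quantity"
    rms_volume: float          # sound volume
    duration_s: float          # duration time in seconds


def extract_feature(samples, sample_rate):
    """Extract a toy acoustic feature from mono samples (floats in [-1, 1])."""
    if not samples:
        raise ValueError("empty signal")
    # Count sign changes between consecutive samples.
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    duration = len(samples) / sample_rate
    return AcousticFeature(crossings / duration, rms, duration)


def estimate_attribute(feature):
    """Hypothetical estimator: map the feature to one attribute value in [0, 1].

    The linear scaling by 8000.0 is an arbitrary illustrative choice.
    """
    return min(1.0, feature.zero_crossing_rate / 8000.0)


def diagnose(estimated, desired, tolerance=0.2):
    """Return True when the estimated value is within tolerance of the desired one."""
    return abs(estimated - desired) <= tolerance
```

A caller would pass the speaker's samples to `extract_feature`, feed the result to `estimate_attribute`, and compare the estimate with a stored desirable value via `diagnose`.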

173 Citations

35 Claims

1-13. (canceled)

14. A pronunciation diagnosis device comprising:
- articulatory attribute data including articulatory attribute values corresponding to an articulatory attribute of a desirable pronunciation for each phoneme in each audio language system, the articulatory attribute including any one condition of articulatory organs selected from the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, or a combination including at least one of the conditions of the articulatory organs;
  
  the way of applying force in the conditions of articulatory organs; and
  
  a combination of breathing conditions;
  
  extracting means for extracting an acoustic feature from an audio signal generated by a speaker, the acoustic feature being a frequency feature quantity, a sound volume, and a duration time, a rate of change or change pattern thereof, and at least one combination thereof;
  
  attribute-value estimating means for estimating an attribute value associated with the articulatory attribute on the basis of the extracted acoustic feature; and
  
  diagnosing means for diagnosing the pronunciation of the speaker by comparing the estimated attribute value with the desirable articulatory attribute data.
- Dependent claims (15, 20)
  - 15. The pronunciation diagnosis device according to claim 14, further comprising:
    - outputting means for outputting a pronunciation diagnosis result of the speaker.
  - 20. The pronunciation diagnosis device according to claim 14, wherein the phoneme comprises a consonant.

16. A pronunciation diagnosis device comprising:
- acoustic-feature extracting means for extracting an acoustic feature of a phoneme of a pronunciation, the acoustic feature being a frequency feature quantity, a sound volume, a duration time, a rate of change or change pattern thereof, and at least one combination thereof;
  
  articulatory-attribute-distribution forming means for forming a distribution, for each phoneme in each audio language system, according to the extracted acoustic feature of the phoneme, the distribution being formed of any one of the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, a combination including at least one of the conditions of these articulatory organs, the way of applying force during the conditions of these articulatory organs, or a combination of breathing conditions; and
  
  articulatory-attribute determining means for determining an articulatory attribute categorized by the articulatory-attribute-distribution forming means on the basis of a threshold value.
- Dependent claims (18, 21)
  - 18. The pronunciation diagnosis device according to claim 16, further comprising:
    - threshold-value changing means for changing the threshold value.
  - 21. The pronunciation diagnosis device according to claim 16, wherein the phoneme comprises a consonant.
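
Claims 16 and 18 describe forming a distribution of articulatory attributes from acoustic features, deciding the attribute against a threshold value, and changing that threshold. One possible reading is sketched below. The class name `AttributeClassifier`, the restriction to a single scalar feature and two attribute categories, and the midpoint-of-means rule for the initial threshold are assumptions made for illustration; the claims do not specify how the threshold is derived.

```python
from statistics import mean


class AttributeClassifier:
    """Toy two-category classifier over one scalar acoustic feature."""

    def __init__(self, feature_by_category):
        # feature_by_category: {"category name": [feature values]}
        # Exactly two categories are expected; unpacking fails otherwise.
        (low_name, low_vals), (high_name, high_vals) = sorted(
            feature_by_category.items(), key=lambda kv: mean(kv[1])
        )
        self.low_name, self.high_name = low_name, high_name
        # Assumed rule: initial threshold halfway between the two category means.
        self.threshold = (mean(low_vals) + mean(high_vals)) / 2

    def set_threshold(self, value):
        """Change the threshold, e.g. to make the diagnosis stricter or looser."""
        self.threshold = value

    def determine(self, feature_value):
        """Return the category whose side of the threshold the value falls on."""
        if feature_value >= self.threshold:
            return self.high_name
        return self.low_name
```

Moving the threshold with `set_threshold` shifts the boundary between the two categories, so the same feature value can be assigned to a different attribute afterwards.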

17. A pronunciation diagnosis device comprising:
- acoustic-feature extracting means for extracting an acoustic feature of phonemes of similar pronunciations, the acoustic feature being a frequency feature quantity, a sound volume, and a duration time, a rate of change or change pattern thereof, and at least one combination thereof;
  
  first articulatory-attribute-distribution forming means for forming a first distribution, for each phoneme in each audio language system, according to the extracted acoustic feature of one of the phonemes, the first distribution being formed of any one of the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, or a combination including at least one of the conditions of these articulatory organs, the way of applying force during the conditions of these articulatory organs, or a combination of breathing conditions as articulatory attributes for pronouncing the one of phonemes;
  
  second articulatory-attribute-distribution forming means for forming a second distribution according to the extracted acoustic feature of the other of the phonemes by a speaker, the second distribution being formed of any one of the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, a combination including at least one of the conditions of these articulatory organs, the way of applying force during the conditions of these articulatory organs, or a combination of breathing conditions;
  
  first articulatory-attribute determining means for determining an articulatory attribute categorized by the first articulatory-attribute-distribution forming means on the basis of a first threshold value; and
  
  second articulatory-attribute determining means for determining an articulatory attribute categorized by the second articulatory-attribute-distribution forming means on the basis of a second threshold value.
- Dependent claims (19, 22)
  - 19. The pronunciation diagnosis device according to claim 17, further comprising:
    - threshold-value changing means for changing the threshold value.
  - 22. The pronunciation diagnosis device according to claim 17, wherein the phoneme comprises a consonant.

23. A method of diagnosing pronunciation, comprising:
- an extracting step of extracting an acoustic feature from an audio signal generated by a speaker, the acoustic feature being a frequency feature quantity, a sound volume, and a duration time, a rate of change or change pattern thereof, and at least one combination thereof;
  
  an attribute-value estimating step of estimating an attribute value associated with the articulatory attribute on the basis of the extracted acoustic feature;
  
  a diagnosing step of diagnosing the pronunciation of the speaker by comparing the estimated attribute value with articulatory attribute data including articulatory attribute values corresponding to an articulatory attribute of a desirable pronunciation for each phoneme in each audio language system, the articulatory attribute including any one condition of articulatory organs selected from the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, or a combination including at least one of the conditions of the articulatory organs;
  
  the way of applying force in the conditions of articulatory organs; and
  
  a combination of breathing conditions as articulatory attributes for pronouncing the phoneme; and
  
  an outputting step of outputting a pronunciation diagnosis result of the speaker.
- Dependent claims (26, 29, 32, 35)
  - 26. The method of diagnosing pronunciation according to claim 23, further comprising:
    - a threshold-value changing step of changing the threshold value.
  - 29. A recording medium for storing a program for instructing a computer to execute the method according to claim 23.
  - 32. A computer program for instructing a computer to execute the method according to claim 23.
  - 35. A computer program for instructing a computer to execute the method according to claim 26.

24. A method of diagnosing pronunciation, comprising:
- an acoustic-feature extracting step of extracting at least one combination of an acoustic feature of a phoneme of a pronunciation, the acoustic feature being a frequency feature quantity, a sound volume, a duration time, and a rate of change or change pattern thereof;
  
  an articulatory-attribute-distribution forming step of forming a distribution, for each phoneme in each audio language system, according to the extracted acoustic feature of the phoneme, the distribution being formed of any one of the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, a combination including at least one of the conditions of these articulatory organs, the way of applying force during the conditions of these articulatory organs, or a combination of breathing conditions as articulatory attributes for pronouncing the phoneme; and
  
  an articulatory-attribute determining step of determining an articulatory attribute categorized by the articulatory-attribute-distribution forming means on the basis of a threshold value.
- Dependent claims (27, 30, 33)
  - 27. The method of diagnosing pronunciation according to claim 24, further comprising:
    - a threshold-value changing step of changing the threshold value.
  - 30. A recording medium for storing a program for instructing a computer to execute the method according to claim 24.
  - 33. A computer program for instructing a computer to execute the method according to claim 24.

25. A method of diagnosing pronunciation, comprising:
- an acoustic-feature extracting step of extracting an acoustic feature of phonemes of similar pronunciations, the acoustic feature being a frequency feature quantity, a sound volume, and a duration time, a rate of change or change pattern thereof, and at least one combination thereof;
  
  a first articulatory-attribute-distribution forming step of forming a first distribution, for each phoneme in each audio language system, according to the extracted acoustic feature of one of the phonemes, the first distribution being formed of any one of the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, or a combination including at least one of the conditions of these articulatory organs, the way of applying force during the conditions of these articulatory organs, or a combination of breathing conditions as articulatory attributes for pronouncing the one of phonemes;
  
  a second articulatory-attribute-distribution forming step of forming a second distribution according to the extracted acoustic feature of the other of the phonemes by a speaker, the second distribution being formed of any one of the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, a combination including at least one of the conditions of these articulatory organs, the way of applying force during the conditions of these articulatory organs, or a combination of breathing conditions;
  
  a first articulatory-attribute determining step of determining an articulatory attribute categorized by the first articulatory-attribute-distribution forming means on the basis of a first threshold value; and
  
  a second articulatory-attribute determining step of determining an articulatory attribute categorized by the second articulatory-attribute-distribution forming means on the basis of a second threshold value.
- Dependent claims (31, 34)
  - 31. A recording medium for storing a program for instructing a computer to execute the method according to claim 25.
  - 34. A computer program for instructing a computer to execute the method according to claim 25.

28. A recording medium storing, for each audio language system, at least one of an articulatory attribute database including articulatory attributes of each phoneme constituting the audio language system, a threshold value database including threshold values for estimating an articulatory attribute value, a word-segment composition database, a feature axis database, and a correction content database.
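
The five databases named in claim 28 could be held together per audio language system roughly as in the sketch below. The class `LanguageSystemStore`, the dictionary layouts noted in the comments, and the helper `desired_attributes_for_word` are hypothetical; the claim names the databases but does not define their schema.

```python
from dataclasses import dataclass, field


@dataclass
class LanguageSystemStore:
    """Hypothetical container for the databases of one audio language system."""
    # phoneme -> {attribute name: desirable value}
    articulatory_attributes: dict = field(default_factory=dict)
    # (phoneme, attribute name) -> threshold value
    thresholds: dict = field(default_factory=dict)
    # word -> list of phonemes (word-segment composition)
    word_segments: dict = field(default_factory=dict)
    # (phoneme, attribute name) -> name of the acoustic feature axis used
    feature_axes: dict = field(default_factory=dict)
    # (phoneme, attribute name) -> correction advice shown to the speaker
    corrections: dict = field(default_factory=dict)


def desired_attributes_for_word(store, word):
    """Return [(phoneme, desirable attributes)] for each segment of a word."""
    return [
        (phoneme, store.articulatory_attributes.get(phoneme, {}))
        for phoneme in store.word_segments.get(word, [])
    ]
```

Keeping one such store per language system mirrors the "for each audio language system" wording of the claim; the lookup helper only shows how the word-segment and attribute databases might be combined.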

Current Assignee: National Institute of Advanced Industrial Science and Technology (Government of Japan), Machi Okumura
Original Assignee: National Institute of Advanced Industrial Science and Technology (Government of Japan), Machi Okumura
Inventors: Kojima, Hiroaki; Omura, Hiroshi; Okumura, Machi
Application Number: US12/088,614
Publication Number: US 20090305203A1
US Class Current: 434/185
CPC Class Codes

G09B 19/04   Speaking with audible prese...

G09B 19/06   Foreign languages with audi...

G09B 5/06   with both visual and audibl...

G10L 15/25   using position of the lips,...

G10L 17/26   Recognition of special voic...
