PRONUNCIATION DIAGNOSIS DEVICE, PRONUNCIATION DIAGNOSIS METHOD, RECORDING MEDIUM, AND PRONUNCIATION DIAGNOSIS PROGRAM
Abstract
A pronunciation diagnosis device according to the present invention diagnoses the pronunciation of a speaker using articulatory attribute data. The data includes articulatory attribute values corresponding to an articulatory attribute of a desirable pronunciation for each phoneme in each audio language system. The articulatory attribute includes any one condition of the articulatory organs (the tongue in the oral cavity, the lips, the vocal cord, the uvula, the nasal cavity, the teeth, and the jaws), or a combination including at least one of those conditions; the way of applying force in the conditions of the articulatory organs; and a combination of breathing conditions. The device extracts an acoustic feature from an audio signal generated by the speaker, the acoustic feature being a frequency feature quantity, a sound volume, a duration time, a rate of change or change pattern thereof, and at least one combination thereof; estimates an attribute value associated with the articulatory attribute on the basis of the extracted acoustic feature; and compares the estimated attribute value with the desirable articulatory attribute data.
35 Claims
1-13. (canceled)
14. A pronunciation diagnosis device comprising:
articulatory attribute data including articulatory attribute values corresponding to an articulatory attribute of a desirable pronunciation for each phoneme in each audio language system, the articulatory attribute including any one condition of articulatory organs selected from the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, or a combination including at least one of the conditions of the articulatory organs; the way of applying force in the conditions of articulatory organs; and a combination of breathing conditions;
extracting means for extracting an acoustic feature from an audio signal generated by a speaker, the acoustic feature being a frequency feature quantity, a sound volume, and a duration time, a rate of change or change pattern thereof, and at least one combination thereof;
attribute-value estimating means for estimating an attribute value associated with the articulatory attribute on the basis of the extracted acoustic feature; and
diagnosing means for diagnosing the pronunciation of the speaker by comparing the estimated attribute value with the desirable articulatory attribute data.
Dependent claims: 15, 20
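The three means recited in claim 14 (extracting, attribute-value estimating, diagnosing) can be pictured as a small pipeline. The following is only an illustrative sketch under stated assumptions, not the patented implementation: the zero-crossing rate stands in for the "frequency feature quantity", the 2000 Hz cutoff is invented, and the names `AcousticFeature`, `DESIRABLE`, `extract_features`, `estimate_attributes`, and `diagnose` are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class AcousticFeature:
    duration_s: float     # duration time
    volume_rms: float     # sound volume (root mean square amplitude)
    zero_cross_hz: float  # crude stand-in for a frequency feature quantity


def extract_features(samples, sample_rate):
    """Extract duration, volume, and a rough frequency estimate from raw samples."""
    n = len(samples)
    duration = n / sample_rate
    rms = math.sqrt(sum(s * s for s in samples) / n)
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    # Two zero crossings per cycle, so crossings / (2 * duration) approximates Hz.
    return AcousticFeature(duration, rms, crossings / (2 * duration))


# Hypothetical desirable articulatory attribute data: phoneme -> attribute -> value.
DESIRABLE = {"s": {"voicing": "unvoiced"}, "z": {"voicing": "voiced"}}


def estimate_attributes(feat):
    """Assumed rule for illustration: a high zero-crossing rate suggests no voicing."""
    return {"voicing": "unvoiced" if feat.zero_cross_hz > 2000.0 else "voiced"}


def diagnose(phoneme, estimated):
    """Compare estimated attribute values with the desirable ones, per attribute."""
    desired = DESIRABLE[phoneme]
    return {attr: estimated.get(attr) == value for attr, value in desired.items()}
```

A result such as `{"voicing": False}` would indicate which articulatory attribute deviates from the desirable pronunciation, which is the kind of per-attribute feedback the comparison step makes possible.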
16. A pronunciation diagnosis device comprising:
acoustic-feature extracting means for extracting an acoustic feature of a phoneme of a pronunciation, the acoustic feature being a frequency feature quantity, a sound volume, a duration time, a rate of change or change pattern thereof, and at least one combination thereof;
articulatory-attribute-distribution forming means for forming a distribution, for each phoneme in each audio language system, according to the extracted acoustic feature of the phoneme, the distribution being formed of any one of the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, a combination including at least one of the conditions of these articulatory organs, the way of applying force during the conditions of these articulatory organs, or a combination of breathing conditions; and
articulatory-attribute determining means for determining an articulatory attribute categorized by the articulatory-attribute-distribution forming means on the basis of a threshold value.
Dependent claims: 18, 21
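One way to read the distribution-forming and determining means of claim 16 is: summarize an acoustic feature per articulatory category, then place a threshold between the categories. The sketch below assumes a single scalar feature, a mean and standard deviation summary, and a threshold at the point of equal z-score between the two categories. None of these choices is stated in the claim, and the function names and the sample values are hypothetical.

```python
from statistics import mean, stdev


def form_distribution(values):
    """Summarize feature values observed for one articulatory category."""
    return mean(values), stdev(values)


def threshold_between(dist_a, dist_b):
    """Threshold at equal z-score distance from both category means (an assumption)."""
    (mean_a, std_a), (mean_b, std_b) = dist_a, dist_b
    return (mean_a * std_b + mean_b * std_a) / (std_a + std_b)


def determine_attribute(value, threshold, below, above):
    """Assign the category on the lower or upper side of the threshold."""
    return below if value < threshold else above


# Made-up feature values for two lip-shape categories of one phoneme.
rounded = form_distribution([700.0, 750.0, 800.0])
unrounded = form_distribution([1100.0, 1200.0, 1300.0])
threshold = threshold_between(rounded, unrounded)
label = determine_attribute(820.0, threshold, "rounded", "unrounded")
```

Weighting the threshold by the standard deviations keeps it closer to the tighter distribution; a plain midpoint of the means would be a simpler alternative.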
17. A pronunciation diagnosis device comprising:
acoustic-feature extracting means for extracting an acoustic feature of phonemes of similar pronunciations, the acoustic feature being a frequency feature quantity, a sound volume, and a duration time, a rate of change or change pattern thereof, and at least one combination thereof;
first articulatory-attribute-distribution forming means for forming a first distribution, for each phoneme in each audio language system, according to the extracted acoustic feature of one of the phonemes, the first distribution being formed of any one of the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, or a combination including at least one of the conditions of these articulatory organs, the way of applying force during the conditions of these articulatory organs, or a combination of breathing conditions as articulatory attributes for pronouncing the one of phonemes;
second articulatory-attribute-distribution forming means for forming a second distribution according to the extracted acoustic feature of the other of the phonemes by a speaker, the second distribution being formed of any one of the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, a combination including at least one of the conditions of these articulatory organs, the way of applying force during the conditions of these articulatory organs, or a combination of breathing conditions;
first articulatory-attribute determining means for determining an articulatory attribute categorized by the first articulatory-attribute-distribution forming means on the basis of a first threshold value; and
second articulatory-attribute determining means for determining an articulatory attribute categorized by the second articulatory-attribute-distribution forming means on the basis of a second threshold value.
Dependent claims: 19, 22
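Claim 17 differs from claim 16 in keeping a separate distribution and a separate threshold for each of two similar-sounding phonemes. A rough sketch of that idea follows, assuming each threshold sits `k` standard deviations from its own phoneme's mean and that the first phoneme occupies the lower side of the feature axis. The value of `k`, the feature values, and the function names are hypothetical.

```python
from statistics import mean, stdev


def one_sided_threshold(values, k, side):
    """Threshold k standard deviations above ("upper") or below ("lower") the mean."""
    m, s = mean(values), stdev(values)
    return m + k * s if side == "upper" else m - k * s


def judge(value, first_threshold, second_threshold):
    """Classify against two per-phoneme thresholds; the gap between them is ambiguous."""
    if value <= first_threshold:
        return "first"
    if value >= second_threshold:
        return "second"
    return "ambiguous"


# Made-up feature values for two similar phonemes on one feature axis.
first_threshold = one_sided_threshold([1600.0, 1700.0, 1800.0], k=2.0, side="upper")
second_threshold = one_sided_threshold([2500.0, 2600.0, 2700.0], k=2.0, side="lower")
results = [judge(v, first_threshold, second_threshold) for v in (1850.0, 2100.0, 2450.0)]
```

Using two thresholds rather than one leaves room for an "ambiguous" region, so a pronunciation that falls between the two phonemes can be reported as such instead of being forced into either category.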
23. A method of diagnosing pronunciation, comprising:
an extracting step of extracting an acoustic feature from an audio signal generated by a speaker, the acoustic feature being a frequency feature quantity, a sound volume, and a duration time, a rate of change or change pattern thereof, and at least one combination thereof;
an attribute-value estimating step of estimating an attribute value associated with the articulatory attribute on the basis of the extracted acoustic feature;
a diagnosing step of diagnosing the pronunciation of the speaker by comparing the estimated attribute value with articulatory attribute data including articulatory attribute values corresponding to an articulatory attribute of a desirable pronunciation for each phoneme in each audio language system, the articulatory attribute including any one condition of articulatory organs selected from the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, or a combination including at least one of the conditions of the articulatory organs; the way of applying force in the conditions of articulatory organs; and a combination of breathing conditions as articulatory attributes for pronouncing the phoneme; and
an outputting step of outputting a pronunciation diagnosis result of the speaker.
Dependent claims: 26, 29, 32, 35
24. A method of diagnosing pronunciation, comprising:
an acoustic-feature extracting step of extracting at least one combination of an acoustic feature of a phoneme of a pronunciation, the acoustic feature being a frequency feature quantity, a sound volume, a duration time, and a rate of change or change pattern thereof;
an articulatory-attribute-distribution forming step of forming a distribution, for each phoneme in each audio language system, according to the extracted acoustic feature of the phoneme, the distribution being formed of any one of the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, a combination including at least one of the conditions of these articulatory organs, the way of applying force during the conditions of these articulatory organs, or a combination of breathing conditions as articulatory attributes for pronouncing the phoneme; and
an articulatory-attribute determining step of determining an articulatory attribute categorized by the articulatory-attribute-distribution forming means on the basis of a threshold value.
Dependent claims: 27, 30, 33
25. A method of diagnosing pronunciation, comprising:
an acoustic-feature extracting step of extracting an acoustic feature of phonemes of similar pronunciations, the acoustic feature being a frequency feature quantity, a sound volume, and a duration time, a rate of change or change pattern thereof, and at least one combination thereof;
a first articulatory-attribute-distribution forming step of forming a first distribution, for each phoneme in each audio language system, according to the extracted acoustic feature of one of the phonemes, the first distribution being formed of any one of the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, or a combination including at least one of the conditions of these articulatory organs, the way of applying force during the conditions of these articulatory organs, or a combination of breathing conditions as articulatory attributes for pronouncing the one of phonemes;
a second articulatory-attribute-distribution forming step of forming a second distribution according to the extracted acoustic feature of the other of the phonemes by a speaker, the second distribution being formed of any one of the height, position, shape, and movement of the tongue, the shape, opening, and movement of the lips, the condition of the glottis, the condition of the vocal cord, the condition of the uvula, the condition of the nasal cavity, the positions of the upper and lower teeth, the condition of the jaws, and the movement of the jaws, a combination including at least one of the conditions of these articulatory organs, the way of applying force during the conditions of these articulatory organs, or a combination of breathing conditions;
a first articulatory-attribute determining step of determining an articulatory attribute categorized by the first articulatory-attribute-distribution forming means on the basis of a first threshold value; and
a second articulatory-attribute determining step of determining an articulatory attribute categorized by the second articulatory-attribute-distribution forming means on the basis of a second threshold value.
Dependent claims: 31, 34
28. A recording medium for storing, for each audio language system, comprising at least one of an articulatory attribute database including articulatory attributes of each phoneme constituting the audio language system, a threshold value database including threshold values for estimating an articulatory attribute value, a word-segment composition database, a feature axis database, and a correction content database.
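The five databases named in claim 28 can be pictured as one record per audio language system. The sketch below is only a guess at a workable layout: the claim does not specify keys or field contents, so the `LanguageSystemData` class, its dictionary shapes, the `advice_for` helper, and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LanguageSystemData:
    """Hypothetical per-language record holding the databases named in claim 28."""
    articulatory_attributes: dict = field(default_factory=dict)  # phoneme -> {attribute: desirable value}
    thresholds: dict = field(default_factory=dict)                # (phoneme, attribute) -> threshold value
    word_segments: dict = field(default_factory=dict)             # word -> list of phonemes
    feature_axes: dict = field(default_factory=dict)              # (phoneme, attribute) -> acoustic feature name
    corrections: dict = field(default_factory=dict)               # (phoneme, attribute) -> correction text


def advice_for(store, language, word):
    """Collect correction texts for every attribute of every phoneme in a word."""
    data = store[language]
    found = []
    for phoneme in data.word_segments.get(word, []):
        for attribute in data.articulatory_attributes.get(phoneme, {}):
            text = data.corrections.get((phoneme, attribute))
            if text:
                found.append((phoneme, attribute, text))
    return found


# Made-up sample content for one language system.
store = {
    "english": LanguageSystemData(
        articulatory_attributes={"r": {"lip_shape": "rounded"}, "l": {"tongue_tip": "raised"}},
        thresholds={("r", "lip_shape"): 900.0},
        word_segments={"rail": ["r", "l"]},
        feature_axes={("r", "lip_shape"): "frequency feature quantity"},
        corrections={("r", "lip_shape"): "Round the lips more."},
    )
}
advice = advice_for(store, "english", "rail")
```

Keying every database by language system first mirrors the claim's "for each audio language system" wording and keeps thresholds for one language from leaking into another.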
Specification