Systems and methods for an automated pronunciation assessment system for similar vowel pairs
Abstract
Computer-implemented systems and methods are provided for assessing non-native speech proficiency. A non-native speech sample is processed to identify a plurality of vowel sound boundaries in the non-native speech sample. Portions of the non-native speech sample are analyzed within the vowel sound boundaries to extract vowel characteristics associated with a first vowel sound and a second vowel sound represented in the non-native speech sample. The vowel characteristics are processed to identify a first vowel pronunciation metric for the first vowel sound and a second vowel pronunciation metric for the second vowel sound, and the first vowel pronunciation metric and the second vowel pronunciation metric are processed to determine whether the non-native speech sample exhibits a distinction in pronunciation of the first vowel sound and the second vowel sound.
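As a rough illustration of the final comparison described in the abstract, the sketch below averages per-token vowel characteristics for each of two phonetically similar vowels and measures the distance between the averages. The use of formant frequencies (F1, F2) as the vowel characteristics, the Euclidean metric, the threshold value, and every function and variable name are assumptions made for illustration; the abstract does not specify them.

```python
import math
from statistics import mean


def centroid(tokens):
    """Average each characteristic over all tokens of one vowel.

    Each token is a tuple of characteristics; (F1, F2) in Hz is assumed here.
    """
    return tuple(mean(values) for values in zip(*tokens))


def vowel_distance(first_tokens, second_tokens):
    """Euclidean distance between the two vowel centroids (assumed metric)."""
    return math.dist(centroid(first_tokens), centroid(second_tokens))


def exhibits_distinction(first_tokens, second_tokens, threshold_hz=150.0):
    """True if the two vowels are at least threshold_hz apart.

    The threshold is a hypothetical value, not one taken from the source.
    """
    return vowel_distance(first_tokens, second_tokens) >= threshold_hz


# Hypothetical (F1, F2) measurements for two similar vowels from one speaker.
iy_tokens = [(300.0, 2300.0), (320.0, 2250.0)]
ih_tokens = [(400.0, 2000.0), (420.0, 1950.0)]

distance = vowel_distance(iy_tokens, ih_tokens)
distinct = exhibits_distinction(iy_tokens, ih_tokens)
```

A speaker who merges the two vowels would produce centroids that lie close together, so the distance would fall below the threshold and no distinction would be reported.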
15 Claims
1. A computer-implemented method of assessing non-native speech proficiency, comprising:
receiving, using a sound receiving device, a non-native speech sample uttered by a user;
generating, using a processing system, word hypotheses for the non-native speech sample uttered by a user, the word hypotheses being generated by an automatic speech recognition software;
generating, using the processing system, time alignments between the word hypotheses and corresponding sounds of the non-native speech sample, the time alignments being generated by a time alignment software;
identifying, using the processing system, a plurality of vowel sound boundaries in the non-native speech sample using the word hypotheses and the time alignments;
analyzing, using the processing system, portions of the non-native speech sample within the vowel sound boundaries to extract first vowel characteristics associated with a first vowel sound and second vowel characteristics associated with a second vowel sound represented in the non-native speech sample, wherein the first vowel sound and the second vowel sound form a set of phonetically similar vowel sounds, and wherein the first vowel sound and the second vowel sound are uttered by the user;
computing, using the processing system, a distance measurement using the first vowel characteristics and the second vowel characteristics, the distance measurement representing a difference between the first vowel characteristics and the second vowel characteristics, to determine whether the non-native speech sample exhibits a distinction in pronunciation of the first vowel sound and the second vowel sound;
generating, using the processing system, an assessment of speech proficiency based on the distance measurement; and
outputting the assessment of speech proficiency through a display interface.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.
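The boundary-identification step of claim 1 can be pictured as a filter over phone-level time alignments. The sketch below assumes the alignment software yields (phone label, start time, end time) tuples with ARPAbet-style labels and optional stress digits; that input format, the vowel inventory, and the function name are hypothetical and not taken from the claim.

```python
from typing import Iterable, List, Set, Tuple

# (phone label, start time in seconds, end time in seconds); assumed format.
Alignment = Tuple[str, float, float]

# Assumed ARPAbet-style monophthong inventory, for illustration only.
VOWELS: Set[str] = {"IY", "IH", "EH", "AE", "AA", "AH", "AO", "UH", "UW", "ER"}


def vowel_boundaries(
    alignments: Iterable[Alignment],
    target_vowels: Set[str] = VOWELS,
) -> List[Alignment]:
    """Keep only the aligned segments whose label is a target vowel."""
    boundaries = []
    for phone, start, end in alignments:
        label = phone.rstrip("012")  # drop a trailing stress digit, if any
        if label in target_vowels and end > start:
            boundaries.append((label, start, end))
    return boundaries


# Hypothetical alignment for the words "sheep" and "ship".
aligned = [
    ("SH", 0.00, 0.08), ("IY1", 0.08, 0.21), ("P", 0.21, 0.30),
    ("SH", 0.45, 0.52), ("IH1", 0.52, 0.61), ("P", 0.61, 0.70),
]

pair_boundaries = vowel_boundaries(aligned, target_vowels={"IY", "IH"})
```

Each returned (label, start, end) tuple marks the portion of the sample from which vowel characteristics would then be extracted.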
12. A computer-implemented system for assessing non-native speech proficiency, comprising:
one or more data processors;
one or more computer-readable storage mediums encoded with instructions for commanding the one or more data processors to execute steps that include;
obtaining a non-native speech sample uttered by a user received through a sound receiving device;
generating word hypotheses for a non-native speech sample uttered by a user, the word hypotheses being generated by automatic speech recognition software instructions;
generating time alignments between the word hypotheses and corresponding sounds of the non-native speech sample, the time alignments being generated by a time alignment software;
identifying a plurality of vowel sound boundaries in the non-native speech sample using the word hypotheses and the time alignments;
analyzing portions of the non-native speech sample within the vowel sound boundaries to extract first vowel characteristics associated with a first vowel sound and second vowel characteristics associated with a second vowel sound represented in the non-native speech sample, wherein the first vowel sound and the second vowel sound form a set of phonetically similar vowel sounds, and wherein the first vowel sound and the second vowel sound are uttered by the user;
computing, a distance measurement using the first vowel characteristics and the second vowel characteristics, the distance measurement representing a difference between the first vowel characteristics and the second vowel characteristics, to determine whether the non-native speech sample exhibits a distinction in pronunciation of the first vowel sound and the second vowel sound;
generating, using the processing system, an assessment of speech proficiency based on the distance measurement; and
outputting the assessment of speech proficiency through a display interface.
Dependent claims: 13, 14.
15. A non-transitory computer-readable storage medium comprising instructions for which when executed cause a processing system to execute steps comprising:
obtaining a non-native speech sample uttered by a user received through a sound receiving device;
generating word hypotheses for a non-native speech sample uttered by a user, the word hypotheses being generated by automatic speech recognition software;
generating time alignments between the word hypotheses and corresponding sounds of the non-native speech sample, the time alignments being generated by a time alignment software;
identifying a plurality of vowel sound boundaries in the non-native speech sample using the word hypotheses and the time alignments;
analyzing portions of the non-native speech sample within the vowel sound boundaries to extract first vowel characteristics associated with a first vowel sound and second vowel characteristics associated with a second vowel sound represented in the non-native speech sample, wherein the first vowel sound and the second vowel sound form a set of phonetically similar vowel sounds, and wherein the first vowel sound and the second vowel sound are uttered by the user;
computing a distance measurement using the first vowel characteristics and the second vowel characteristics, the distance measurement representing a difference between the first vowel characteristics and the second vowel characteristics, to determine whether the non-native speech sample exhibits a distinction in pronunciation of the first vowel sound and the second vowel sound;
generating, using the processing system, an assessment of speech proficiency based on the distance measurement; and
outputting the assessment of speech proficiency through a display interface.
Specification