Systems and methods for an automated pronunciation assessment system for similar vowel pairs

US 9,489,864 B2
Filed: 01/07/2014
Issued: 11/08/2016
Est. Priority Date: 01/07/2013
Status: Active Grant

Abstract

Computer-implemented systems and methods are provided for assessing non-native speech proficiency. A non-native speech sample is processed to identify a plurality of vowel sound boundaries in the non-native speech sample. Portions of the non-native speech sample are analyzed within the vowel sound boundaries to extract vowel characteristics associated with a first vowel sound and a second vowel sound represented in the non-native speech sample. The vowel characteristics are processed to identify a first vowel pronunciation metric for the first vowel sound and a second vowel pronunciation metric for the second vowel sound, and the first vowel pronunciation metric and the second vowel pronunciation metric are processed to determine whether the non-native speech sample exhibits a distinction in pronunciation of the first vowel sound and the second vowel sound.
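
As a rough illustration of the boundary-identification step described in the abstract, the sketch below filters a time-aligned phone sequence down to the vowel segments of interest. It is not the patented implementation: the `PhoneSegment` type, the ARPAbet-style labels in `TARGET_VOWELS`, and the sample alignment for "cheese" and "six" are all assumptions made for the example, and a real system would obtain the alignment from speech recognition and time alignment software.

```python
from typing import NamedTuple


class PhoneSegment(NamedTuple):
    """One time-aligned phone (hypothetical structure, not from the patent)."""
    phone: str    # ARPAbet-style label; the label set is an assumption
    start: float  # segment start, in seconds
    end: float    # segment end, in seconds


# Assumed labels for the similar vowel pairs; depends on the aligner used.
TARGET_VOWELS = {"IY", "IH", "EY", "EH", "AA", "AH", "UW", "UH"}


def vowel_boundaries(segments, targets=TARGET_VOWELS):
    """Keep only non-empty segments whose label is one of the target vowels."""
    return [s for s in segments if s.phone in targets and s.end > s.start]


# Made-up alignment for the words "cheese" and "six".
alignment = [
    PhoneSegment("CH", 0.00, 0.08),
    PhoneSegment("IY", 0.08, 0.21),
    PhoneSegment("Z", 0.21, 0.30),
    PhoneSegment("S", 0.45, 0.55),
    PhoneSegment("IH", 0.55, 0.63),
    PhoneSegment("K", 0.63, 0.70),
    PhoneSegment("S", 0.70, 0.80),
]

bounds = vowel_boundaries(alignment)
for seg in bounds:
    print(f"{seg.phone}: {seg.start:.2f}s to {seg.end:.2f}s")
```

Each returned segment marks the portion of the sample from which vowel characteristics would then be extracted.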


15 Claims

1. A computer-implemented method of assessing non-native speech proficiency, comprising:
- receiving, using a sound receiving device, a non-native speech sample uttered by a user;
  
  generating, using a processing system, word hypotheses for the non-native speech sample uttered by a user, the word hypotheses being generated by an automatic speech recognition software;
  
  generating, using the processing system, time alignments between the word hypotheses and corresponding sounds of the non-native speech sample, the time alignments being generated by a time alignment software;
  
  identifying, using the processing system, a plurality of vowel sound boundaries in the non-native speech sample using the word hypotheses and the time alignments;
  
  analyzing, using the processing system, portions of the non-native speech sample within the vowel sound boundaries to extract first vowel characteristics associated with a first vowel sound and second vowel characteristics associated with a second vowel sound represented in the non-native speech sample, wherein the first vowel sound and the second vowel sound form a set of phonetically similar vowel sounds, and wherein the first vowel sound and the second vowel sound are uttered by the user;
  
  computing, using the processing system, a distance measurement using the first vowel characteristics and the second vowel characteristics, the distance measurement representing a difference between the first vowel characteristics and the second vowel characteristics, to determine whether the non-native speech sample exhibits a distinction in pronunciation of the first vowel sound and the second vowel sound;
  
  generating, using the processing system, an assessment of speech proficiency based on the distance measurement; and
  
  outputting the assessment of speech proficiency through a display interface.
  - 2. The method of claim 1, wherein a proficient native speaker pronounces the first vowel sound distinctly from the second vowel sound, and wherein a non-proficient non-native speaker pronounces the first vowel sound and the second vowel sound substantially identically.
  - 3. The method of claim 1, wherein the set of phonetically similar vowel sounds includes one or more of:
    - /i/ versus /ɪ/;
      
      /e/ versus /ε/;
      
      /a/ versus /Λ/; and
      
      /u/ versus /ʊ/.
  - 4. The method of claim 1, wherein the set of phonetically similar vowel sounds includes one or more of:
    - /i/ as in “cheese” versus /ɪ/ as in “six”;
      
      /e/ as in “snake” versus /ε/ as in “chess”;
      
      /a/ as in “Bob” versus /Λ/ as in “sun”; and
      
      /u/ as in “food” versus /ʊ/ as in “good”.
  - 5. The method of claim 1, further comprising:
    - outputting, through the display interface, feedback that offers vowel pronunciation suggestions for improving communicative competence through better vowel pronunciation.
  - 6. The method of claim 1, wherein the first vowel characteristics and the second vowel characteristics comprise vowel formant measurements.
  - 7. The method of claim 6, wherein a vowel formant measurement comprises a measurement of an amplitude peak in a vowel spectrum that indicates a resonant frequency of a vowel.
  - 8. The method of claim 1, wherein the first vowel characteristics comprise an F1 measurement and an F2 measurement of the first vowel sound, and wherein the second vowel characteristics comprise an F1 measurement and an F2 measurement of the second vowel sound.
  - 9. The method of claim 8, wherein the distance measurement is based on a calculation of:
    - $$\mathrm{Dist}(v_i, v_j) = \sqrt{(F1_{v_i} - F1_{v_j})^2 + (F2_{v_i} - F2_{v_j})^2},$$
      
      wherein $F1_{v_i}$ is a mean F1 measurement for vowel sound i, wherein $F1_{v_j}$ is a mean F1 measurement for vowel sound j, wherein $F2_{v_i}$ is a mean F2 measurement for vowel sound i, wherein $F2_{v_j}$ is a mean F2 measurement for vowel sound j.
  - 10. The method of claim 1, wherein generating the assessment is further based on a stress metric, an intonation metric, a vocabulary metric, or a grammar metric.
  - 11. The method of claim 1, further comprising:
    - wherein the assessment is determined using a scoring model.
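
The distance measurement recited in claims 8 and 9 can be illustrated with a short sketch: average the F1 and F2 measurements over the tokens of each vowel, then take the Euclidean distance between the two mean points. The function names and the formant values (in Hz) are made up for the example, and the patent text here does not specify how the formants are measured or how the distance maps to a score.

```python
import math
from statistics import mean


def mean_formants(tokens):
    """Average (F1, F2) over the tokens of one vowel; tokens are (F1, F2) pairs in Hz."""
    f1 = mean(t[0] for t in tokens)
    f2 = mean(t[1] for t in tokens)
    return f1, f2


def vowel_distance(tokens_i, tokens_j):
    """Euclidean distance between the mean (F1, F2) points of two vowels."""
    f1_i, f2_i = mean_formants(tokens_i)
    f1_j, f2_j = mean_formants(tokens_j)
    return math.hypot(f1_i - f1_j, f2_i - f2_j)


# Illustrative, made-up measurements for one speaker's /i/ and /ɪ/ tokens.
iy_tokens = [(300.0, 2280.0), (320.0, 2260.0)]  # mean (310, 2270)
ih_tokens = [(390.0, 2160.0), (410.0, 2140.0)]  # mean (400, 2150)

dist = vowel_distance(iy_tokens, ih_tokens)
print(f"Dist = {dist:.1f} Hz")
```

With these values the mean points differ by 90 Hz in F1 and 120 Hz in F2. A larger distance suggests the speaker keeps the two vowels apart, while a distance near zero suggests they are pronounced almost identically.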

12. A computer-implemented system for assessing non-native speech proficiency, comprising:
- one or more data processors;
  
  one or more computer-readable storage mediums encoded with instructions for commanding the one or more data processors to execute steps that include;
  
  obtaining a non-native speech sample uttered by a user received through a sound receiving device;
  
  generating word hypotheses for a non-native speech sample uttered by a user, the word hypotheses being generated by automatic speech recognition software instructions;
  
  generating time alignments between the word hypotheses and corresponding sounds of the non-native speech sample, the time alignments being generated by a time alignment software;
  
  identifying a plurality of vowel sound boundaries in the non-native speech sample using the word hypotheses and the time alignments;
  
  analyzing portions of the non-native speech sample within the vowel sound boundaries to extract first vowel characteristics associated with a first vowel sound and second vowel characteristics associated with a second vowel sound represented in the non-native speech sample, wherein the first vowel sound and the second vowel sound form a set of phonetically similar vowel sounds, and wherein the first vowel sound and the second vowel sound are uttered by the user;
  
  computing, a distance measurement using the first vowel characteristics and the second vowel characteristics, the distance measurement representing a difference between the first vowel characteristics and the second vowel characteristics, to determine whether the non-native speech sample exhibits a distinction in pronunciation of the first vowel sound and the second vowel sound;
  
  generating, using the processing system, an assessment of speech proficiency based on the distance measurement; and
  
  outputting the assessment of speech proficiency through a display interface.
  - 13. The system of claim 12, wherein a proficient native speaker pronounces the first vowel sound distinctly from the second vowel sound, and wherein a non-proficient non-native speaker pronounces the first vowel sound and the second vowel sound substantially identically.
  - 14. The system of claim 12, wherein the set of phonetically similar vowel sounds includes one or more of:
    - /i/ versus /ɪ/;
      
      /e/ versus /ε/;
      
      /a/ versus /Λ/; and
      
      /u/ versus /ʊ/.

15. A non-transitory computer-readable storage medium comprising instructions for which when executed cause a processing system to execute steps comprising:
- obtaining a non-native speech sample uttered by a user received through a sound receiving device;
  
  generating word hypotheses for a non-native speech sample uttered by a user, the word hypotheses being generated by automatic speech recognition software;
  
  generating time alignments between the word hypotheses and corresponding sounds of the non-native speech sample, the time alignments being generated by a time alignment software;
  
  identifying a plurality of vowel sound boundaries in the non-native speech sample using the word hypotheses and the time alignments;
  
  analyzing portions of the non-native speech sample within the vowel sound boundaries to extract first vowel characteristics associated with a first vowel sound and second vowel characteristics associated with a second vowel sound represented in the non-native speech sample, wherein the first vowel sound and the second vowel sound form a set of phonetically similar vowel sounds, and wherein the first vowel sound and the second vowel sound are uttered by the user;
  
  computing a distance measurement using the first vowel characteristics and the second vowel characteristics, the distance measurement representing a difference between the first vowel characteristics and the second vowel characteristics, to determine whether the non-native speech sample exhibits a distinction in pronunciation of the first vowel sound and the second vowel sound;
  
  generating, using the processing system, an assessment of speech proficiency based on the distance measurement; and
  
  outputting the assessment of speech proficiency through a display interface.

Current Assignee: Educational Testing Service
Original Assignee: Educational Testing Service
Inventors: Evanini, Keelan
Primary Examiner(s): Desir, Pierre-Louis
Assistant Examiner(s): Shin, Seong Ah A
Application Number: US14/148,772
Publication Number: US 20140195239A1
Time in Patent Office: 1,036 Days
Field of Search: 704/209, 704/251, 704/254, 704/236
US Class Current: 1/1
CPC Class Codes:
G09B 19/04   Speaking with audible prese...
G09B 5/06   with both visual and audibl...
G10L 15/04   Segmentation; Word boundary...
G10L 15/187   Phonemic context, e.g. pron...
G10L 25/15   the extracted parameters be...
G10L 25/60   for measuring the quality o...
