SYSTEMS AND METHODS FOR MODELING L1-SPECIFIC PHONOLOGICAL ERRORS IN COMPUTER-ASSISTED PRONUNCIATION TRAINING SYSTEM

US 20140006029A1
Filed: 07/01/2013
Published: 01/02/2014
Est. Priority Date: 06/29/2012
Status: Active Grant

Abstract

A non-transitory processor-readable medium storing code representing instructions to be executed by a processor includes code to cause the processor to receive acoustic data representing an utterance spoken by a language learner in a non-native language in response to prompting the language learner to recite a word in the non-native language and receive a pronunciation lexicon of the word in the non-native language. The pronunciation lexicon includes at least one alternative pronunciation of the word based on a pronunciation lexicon of a native language of the language learner. The code causes the processor to generate an acoustic model of the at least one alternative pronunciation in the non-native language and identify a mispronunciation of the word in the utterance based on a comparison of the acoustic data with the acoustic model. The code causes the processor to send feedback related to the mispronunciation of the word to the language learner.
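
As a rough illustration of how a lexicon with alternative pronunciations might be derived from a learner's native language, the sketch below expands a canonical phone sequence using substitution rules. The rule table, the phone symbols, and the function name `expand_pronunciations` are assumptions made for illustration; the abstract does not specify how the alternatives are computed.

```python
from itertools import product

# Hypothetical L1-driven substitution rules: each target-language phone maps
# to phones a learner might produce instead (illustrative values only).
L1_SUBSTITUTIONS = {
    "TH": ["S", "T"],
    "V": ["B"],
}

def expand_pronunciations(canonical, rules):
    """Return the canonical pronunciation followed by every variant obtained
    by substituting zero or more phones according to `rules`."""
    # Each position offers the canonical phone first, then its substitutes.
    options = [[phone] + rules.get(phone, []) for phone in canonical]
    # product() varies the last position fastest, so the all-canonical
    # sequence is always the first variant produced.
    return [list(choice) for choice in product(*options)]

lexicon = expand_pronunciations(["TH", "IH", "NG", "K"], L1_SUBSTITUTIONS)
for pronunciation in lexicon:
    print(" ".join(pronunciation))
```

The first entry is the canonical pronunciation and the remaining entries are the alternatives a recognizer could be made aware of. A real system would presumably also weight or prune the variants rather than enumerate every combination.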

20 Claims

1. A non-transitory processor-readable medium storing code representing instructions to be executed by a processor, the code comprising code to cause the processor to:
- receive acoustic data representing an utterance spoken by a language learner in a non-native language in response to prompting the language learner to recite a word in the non-native language;
  
  receive a pronunciation lexicon of the word in the non-native language, the pronunciation lexicon of the word including at least one alternative pronunciation of the word determined based on a pronunciation lexicon of a native language of the language learner;
  
  generate an acoustic model of the at least one alternative pronunciation of the word from the pronunciation lexicon of the word in the non-native language;
  
  identify a mispronunciation of the word in the utterance based on a comparison of the acoustic data with the acoustic model; and
  
  send feedback related to the mispronunciation of the word to the language learner.
  - 2. The non-transitory processor-readable medium of claim 1, wherein the code to cause the processor to identify includes code to cause the processor to identify grammar data associated with the acoustic data that is different from grammar data associated with the acoustic model to produce a grammar inaccuracy, the feedback including the grammar inaccuracy.
  - 3. The non-transitory processor-readable medium of claim 1, wherein the at least one alternative pronunciation of the word from the pronunciation lexicon is determined based on the pronunciation lexicon of the native language and phonetically annotated data related to the native language.
  - 4. The non-transitory processor-readable medium of claim 1, further comprising code to cause the processor to:
    - generate the pronunciation lexicon of the word in the non-native language based on the pronunciation lexicon of the native language and phonetically annotated data related to the native language.
  - 5. The non-transitory processor-readable medium of claim 1, further comprising code to cause the processor to:
    - generate a speech model of the word based on the pronunciation lexicon of the word in the non-native language and the acoustic model, the code to cause the processor to identify including code to cause the processor to identify the mispronunciation of the word in the utterance based on the speech model and the comparison.
  - 6. The non-transitory processor-readable medium of claim 1, wherein the code to cause the processor to generate the acoustic model includes code to cause the processor to:
    - generate a first lattice for the word in the non-native language; and
      
      generate a second lattice for the at least one alternative pronunciation of the word based on phonetically annotated data associated with the word in the non-native language.
  - 7. The non-transitory processor-readable medium of claim 6, wherein the first lattice and the second lattice are part of a minimum phone error training process used to train the acoustic model.
  - 8. The non-transitory processor-readable medium of claim 1, wherein the pronunciation lexicon of the word in the non-native language is received from a machine translation module.
  - 9. The non-transitory processor-readable medium of claim 1, wherein the utterance is a first utterance, the at least one alternative pronunciation of the word being a first pronunciation of the word, the code further comprising code to cause the processor to:
    - generate an acoustic model for a second pronunciation of the word;
      
      identify the second pronunciation of the word in a second utterance based on a comparison of acoustic data representing the second utterance with the acoustic model for the second pronunciation of the word; and
      
      send feedback related to the second pronunciation of the word to the language learner.
  - 15. The method of claim 9, further comprising:
    - generating a pronunciation lexicon of the word in the non-native language, the pronunciation lexicon of the word including a set of alternative pronunciations of the word including the alternative pronunciation of the word; and
      
      generating a speech model of the word based on the pronunciation lexicon of the word in the non-native language and the acoustic model, the identifying including identifying the mispronunciation of the word in the utterance in response to the speech recognition engine recognizing the acoustic data as part of the acoustic model and the speech model.
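
Claims 6 and 7 tie the two lattices to minimum phone error (MPE) training. The sketch below evaluates the usual MPE criterion, the posterior-weighted expected phone accuracy, on a toy list of hypotheses standing in for the lattices. Everything concrete here is an assumption for illustration: hypotheses are enumerated explicitly rather than stored in lattices, the log-likelihoods are made up, priors are taken as uniform, and accuracy is approximated as the reference length minus the Levenshtein distance.

```python
import math

def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution or match
        prev = curr
    return prev[-1]

def mpe_criterion(reference, hypotheses, kappa=1.0):
    """Expected phone accuracy over competing hypotheses.

    hypotheses: list of (phone_sequence, log_likelihood) pairs.
    kappa: acoustic scale applied to the log-likelihoods.
    """
    scaled = [kappa * log_lik for _, log_lik in hypotheses]
    top = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    expected = 0.0
    for (phones, _), weight in zip(hypotheses, weights):
        accuracy = len(reference) - edit_distance(reference, phones)
        expected += (weight / total) * accuracy
    return expected

reference = ["TH", "IH", "NG", "K"]
hypotheses = [
    (reference, -10.0),               # canonical pronunciation
    (["S", "IH", "NG", "K"], -10.0),  # hypothetical L1-influenced variant
]
# Equal posteriors, so the expected accuracy is (4 + 3) / 2 = 3.5.
print(mpe_criterion(reference, hypotheses))
```

Training under this criterion adjusts model parameters so that hypotheses with higher phone accuracy receive more posterior mass. Reading the canonical hypothesis as the first lattice and the variant as the second is an interpretation, not something the claims state.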

10. A method, comprising:
- receiving acoustic data representing an utterance spoken by a language learner in a non-native language in response to prompting the language learner to recite a word in the non-native language;
  
  generating an alternative pronunciation of the word based on a pronunciation lexicon of a native language of the language learner and phonetically annotated data associated with a native pronunciation of the word;
  
  generating an acoustic model for the alternative pronunciation of the word;
  
  identifying a mispronunciation of the word in the utterance in response to a speech recognition engine recognizing the acoustic data as part of the acoustic model; and
  
  sending feedback related to the mispronunciation of the word to the language learner in response to the identifying.
  - 11. The method of claim 10, when the utterance is a first utterance, the alternative pronunciation is a first pronunciation, the method further comprising:
    - generating an acoustic model for a second pronunciation of the word;
      
      identifying the second pronunciation of the word in a second utterance of the word in response to the speech recognition engine recognizing the acoustic data representing the second utterance as part of the acoustic model for the second pronunciation of the word; and
      
      sending feedback related to the second pronunciation of the word to the language learner in response to the identifying the second pronunciation of the word.
  - 12. The method of claim 10, further comprising:
    - identifying grammar data associated with the acoustic data that is different from grammar data associated with the acoustic model to produce a grammar inaccuracy, the feedback including the grammar inaccuracy.
  - 13. The method of claim 10, wherein the generating the acoustic model includes:
    - generating a first lattice for the word in the non-native language; and
      
      generating a second lattice for the alternative pronunciation of the word based on the phonetically annotated data associated with the native pronunciation of the word.
  - 14. The method of claim 13, wherein the first lattice and the second lattice are part of a minimum phone error training process used to train the acoustic model.
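
The identifying and feedback steps of claim 10 can be pictured with a small sketch: if the recognizer's best-scoring pronunciation is one of the alternatives rather than the canonical form, the differing phones are reported to the learner. The score dictionary, the function names, and the message wording are hypothetical, and the sketch assumes substitution-only alternatives of the same length as the canonical pronunciation.

```python
def identify_mispronunciation(canonical, scores):
    """Return a list of (position, expected_phone, produced_phone) tuples.

    scores maps each candidate pronunciation (a tuple of phones) to the
    log-likelihood a recognizer assigned to the learner's utterance.
    An empty list means the canonical pronunciation scored best.
    """
    best = max(scores, key=scores.get)
    if best == tuple(canonical):
        return []
    # Substitution-only comparison: pair phones position by position.
    return [(i, expected, produced)
            for i, (expected, produced) in enumerate(zip(canonical, best))
            if expected != produced]

def feedback(word, errors):
    """Turn the detected phone differences into a learner-facing message."""
    if not errors:
        return f"'{word}' matched the expected pronunciation."
    details = ", ".join(f"{produced} instead of {expected} at phone {i + 1}"
                        for i, expected, produced in errors)
    return f"In '{word}' you produced {details}."

# Made-up recognizer scores for the canonical form and one alternative.
scores = {
    ("TH", "IH", "NG", "K"): -42.0,
    ("S", "IH", "NG", "K"): -37.5,
}
errors = identify_mispronunciation(["TH", "IH", "NG", "K"], scores)
print(feedback("think", errors))
```

Because each alternative pronunciation corresponds to a known error pattern, recognizing it gives the diagnosis directly, without a separate error classifier.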

16. A method, comprising:
- receiving a phrase having a plurality of words from a language learning module in response to the language learning module prompting a language learner to recite the phrase in a non-native language, the language learner having a native language;
  
  generating a non-native lexicon including a set of alternative phrases having a probability greater than a threshold level of being spoken by the language learner when the language learner attempts to recite the phrase in the non-native language;
  
  generating an acoustic model for each alternative phrase from the set of alternative phrases based on phonetically annotated data associated with a native recitation of each word from the plurality of words in the phrase;
  
  identifying that the language learner recited an alternative phrase from the set of alternative phrases based on a comparison of the acoustic model for the alternative phrase and acoustic data representing an utterance spoken by the language learner in response to the language learning module prompting the language learner to recite the phrase in the non-native language;
  
  identifying at least one word from the plurality of words in the phrase that was incorrectly recited by the language learner to produce the alternative phrase; and
  
  sending feedback to the language learner associated with the at least one word.
  - 17. The method of claim 16, wherein the generating the non-native lexicon includes generating the non-native lexicon based on a lexicon of the native language and phonetically annotated data associated with the native recitation of each word from the plurality of words in the phrase.
  - 18. The method of claim 16, wherein the generating the non-native lexicon includes generating the non-native lexicon based on a lexicon of the native language and phonetically annotated data associated with a native pronunciation of each word from the plurality of words.
  - 19. The method of claim 16, wherein the alternative phrase from the set of alternative phrases includes at least one grammatical inaccuracy associated with the native recitation of the phrase.
  - 20. The method of claim 16, wherein the generating the acoustic model includes generating a first lattice for each word from the plurality of words in the phrase, and generating a second lattice for each alternative phrase from the set of alternative phrases.
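
The thresholding step of claim 16 can be sketched as follows: per-word confusion probabilities are combined into phrase-level probabilities, and only alternative phrases above the threshold are kept. The confusion table, the assumption that words are confused independently, and the function names are illustrative and not drawn from the claims. The acoustic modeling and comparison steps are omitted.

```python
from itertools import product
from math import prod

def alternative_phrases(phrase, confusions, threshold):
    """Return (alternative_phrase, probability) pairs above `threshold`.

    confusions maps a word to a list of (replacement, probability) pairs;
    the remaining probability mass is assigned to the word itself.
    """
    options = []
    for word in phrase:
        swaps = confusions.get(word, [])
        keep = 1.0 - sum(p for _, p in swaps)
        options.append([(word, keep)] + list(swaps))
    kept = []
    for combo in product(*options):
        words = tuple(w for w, _ in combo)
        probability = prod(p for _, p in combo)
        # Exclude the intended phrase itself; keep only likely alternatives.
        if words != tuple(phrase) and probability > threshold:
            kept.append((words, probability))
    return sorted(kept, key=lambda item: item[1], reverse=True)

def incorrect_words(phrase, recited):
    """Return (position, expected_word, recited_word) for each mismatch."""
    return [(i, expected, got)
            for i, (expected, got) in enumerate(zip(phrase, recited))
            if expected != got]

phrase = ["she", "walks", "home"]
confusions = {"walks": [("walk", 0.3)], "she": [("he", 0.05)]}  # made-up values
alternatives = alternative_phrases(phrase, confusions, threshold=0.1)
for words, probability in alternatives:
    print(" ".join(words), round(probability, 3), incorrect_words(phrase, words))
```

With these made-up numbers only "she walk home" survives the threshold, and the mismatch list points at the word that would be reported back to the learner.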

Current Assignee: Rosetta Stone Limited (Rosetta Stone Incorporated)
Original Assignee: Rosetta Stone Limited (Rosetta Stone Incorporated)
Inventors: Stanley, Theban; Hacioglu, Kadri; Siivola, Vesa

Granted Patent: US 10,068,569 B2
US Class Current: 704/254
CPC Class Codes:
G09B 19/06   Foreign languages with audi...
G10L 15/187   Phonemic context, e.g. pron...
G10L 15/19   Grammatical context, e.g. d...
