System and method of speech recognition for non-native speakers of a language

US 7,640,159 B2
Filed: 07/22/2004
Issued: 12/29/2009
Est. Priority Date: 07/22/2004
Status: Active Grant

Abstract

An accent compensative speech recognition system and related methods for use with a signal processor generating one or more feature vectors based upon a voice-induced electrical signal are provided. The system includes a first-language acoustic module that determines a first-language phoneme sequence based upon one or more feature vectors, and a second-language lexicon module that determines a second-language speech segment based upon the first-language phoneme sequence. A method aspect includes the steps of generating a first-language phoneme sequence from at least one feature vector based upon a first-language phoneme model, and determining a second-language speech segment from the first-language phoneme sequence based upon a second-language lexicon model.
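
The lexicon step described in the abstract can be pictured with a small sketch. It assumes a toy lexicon and a plain edit-distance match over phoneme symbols; the patent text here does not say how the lexicon model performs the mapping, so `edit_distance`, `lookup_segment`, `TOY_LEXICON` and the phoneme symbols are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # delete pa
                            curr[j - 1] + 1,      # insert pb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def lookup_segment(phonemes: Sequence[str],
                   lexicon: Dict[str, List[str]]) -> Tuple[str, int]:
    """Return the lexicon word closest to the phoneme sequence, and its distance."""
    scored = ((word, edit_distance(phonemes, pron)) for word, pron in lexicon.items())
    return min(scored, key=lambda pair: pair[1])


# Hypothetical second-language (English) lexicon with invented phoneme symbols.
TOY_LEXICON = {
    "this": ["dh", "i", "s"],
    "that": ["dh", "a", "t"],
    "zebra": ["z", "e", "b", "r", "a"],
}

# Assumed output of a first-language acoustic model for an accented "this".
heard = ["z", "i", "s"]
best_word, distance = lookup_segment(heard, TOY_LEXICON)
print(best_word, distance)
```

A production lexicon model would more likely use weighted phoneme confusions than a uniform substitution cost; the uniform cost is used here only to keep the sketch short.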

18 Claims

1. A method for speech recognition of input speech in a language from a non-native speaker, the method comprising acts of:
- generating one or more feature vectors based upon one or more voice-induced electrical signals that result from the input speech;
  
  generating a first-language phoneme sequence from the one or more feature vectors based upon a first-language acoustic model, wherein the first-language acoustic model corresponds to a first language;
  
  determining a second-language speech segment from the first-language phoneme sequence based upon a second-language lexicon model, wherein the second-language lexicon model corresponds to a second language that is different from the first language;
  
  determining a confidence score associated with a combination of the first-language acoustic model and the second-language lexicon model; and
  
  selecting the first-language acoustic model from a plurality of acoustic models based at least in part on the determined confidence score, each of the plurality of acoustic models corresponding to a different respective language.
  - 2. The method of claim 1, wherein the act of selecting the first-language acoustic model from the plurality of acoustic models comprises selecting the first-language acoustic model based at least in part of the determined confidence score being greater than another confidence score associated with a combination of another first-language acoustic model and the second-language lexicon model.
  - 3. The method of claim 1, wherein the confidence score is associated with the second-language speech segment.
  - 4. The method of claim 3, wherein the confidence score is based on a probability assessment that the first-language phoneme sequence generated based upon the first-language acoustic model corresponds to the second-language speech segment determined using the second-language lexicon model.
  - 5. The method of claim 1, further comprising an act of:
    - generating a speech recognition output for subsequent input speech from the non-native speaker using the first-language acoustic model.
  - 6. The method of claim 1, wherein the first-language acoustic model is associated with a native language of the non-native speaker and the second-language lexicon model is associated with a spoken language of the input speech.
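
The selection logic of claims 1, 2 and 5 can be sketched as follows, under loud assumptions: the acoustic models are stubs that return invented phoneme sequences, and the confidence score is a sequence-similarity ratio from Python's `difflib` standing in for the probability assessment of claim 4. None of the names (`Hypothesis`, `score_segment`, `select_acoustic_model`, `STUB_MODELS`, `ENGLISH_LEXICON`) come from the patent.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Sequence, Tuple

Phonemes = List[str]
# Hypothetical interface: an acoustic model maps feature vectors to phonemes.
AcousticModel = Callable[[Sequence[Sequence[float]]], Phonemes]


@dataclass
class Hypothesis:
    language: str      # language of the acoustic model that was tried
    segment: str       # best-matching second-language word
    confidence: float  # similarity in [0, 1], a stand-in for a probability


def score_segment(phonemes: Phonemes,
                  lexicon: Dict[str, Phonemes]) -> Tuple[str, float]:
    """Return the lexicon word most similar to the phonemes, with its score."""
    best_word, best_score = "", 0.0
    for word, pronunciation in lexicon.items():
        score = SequenceMatcher(None, phonemes, pronunciation,
                                autojunk=False).ratio()
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


def select_acoustic_model(features: Sequence[Sequence[float]],
                          acoustic_models: Dict[str, AcousticModel],
                          lexicon: Dict[str, Phonemes]) -> Hypothesis:
    """Try every acoustic model against one lexicon; keep the most confident."""
    hypotheses = []
    for language, model in acoustic_models.items():
        segment, confidence = score_segment(model(features), lexicon)
        hypotheses.append(Hypothesis(language, segment, confidence))
    return max(hypotheses, key=lambda h: h.confidence)


# Stub acoustic models: the outputs are invented for illustration only.
STUB_MODELS: Dict[str, AcousticModel] = {
    "fr": lambda features: ["z", "i", "s"],
    "ja": lambda features: ["j", "i", "s", "u"],
}
ENGLISH_LEXICON = {"this": ["dh", "i", "s"], "that": ["dh", "a", "t"]}

best = select_acoustic_model([[0.0]], STUB_MODELS, ENGLISH_LEXICON)
# In the spirit of claim 5, the selected model is reused for later utterances.
selected_model = STUB_MODELS[best.language]
print(best.language, best.segment)
```

Comparing the scores of every model against a single lexicon mirrors the "greater than another confidence score" condition of claim 2; a real system would compute the score from acoustic and lexical probabilities rather than from string similarity.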

7. At least one computer-readable medium encoded with instructions that, when executed by at least one computer system, perform a method for speech recognition of input speech in a language from a non-native speaker, the method comprising acts of:
- generating, based upon a first-language acoustic model, a first-language phoneme sequence from one or more feature vectors, the one or more feature vectors being based upon one or more voice-induced electrical signals that result from the input speech, wherein the first-language acoustic model corresponds to a first language;
  
  determining a second-language speech segment from the first-language phoneme sequence based upon a second-language lexicon model, wherein the second-language lexicon model corresponds to a second language that is different from the first language;
  
  determining a confidence score associated with a combination of the first-language acoustic model and the second-language lexicon model; and
  
  selecting the first-language acoustic model from a plurality of acoustic models based at least in part on the determined confidence score, each of the plurality of acoustic models corresponding to a different respective language.
  - 8. The at least one computer-readable medium of claim 7, wherein the act of selecting the first-language acoustic model from the plurality of acoustic models comprises selecting the first-language acoustic model based at least in part of the determined confidence score being greater than another confidence score associated with a combination of another first-language acoustic model and the second-language lexicon model.
  - 9. The at least one computer-readable medium of claim 7, wherein the confidence score is associated with the second-language speech segment.
  - 10. The at least one computer-readable medium of claim 8, wherein the confidence score is based on a probability assessment that the first-language phoneme sequence generated based upon the first-language acoustic model corresponds to the second-language speech segment determined using the second-language lexicon model.
  - 11. The at least one computer-readable medium of claim 7, wherein the method further comprises an act of:
    - generating a speech recognition output for subsequent input speech from the non-native speaker using the first-language acoustic model.
  - 12. The at least one computer-readable medium of claim 7, wherein the first-language acoustic model is associated with a native language of the non-native speaker and the second-language lexicon model is associated with a spoken language of the input speech.

13. An apparatus for speech recognition of input speech in a language from a non-native speaker, the apparatus comprising:
- at least one computer-readable medium encoded with instructions; and
  
  at least one processing unit coupled to the at least one computer-readable medium, wherein upon execution of the instructions by the at least one processing unit, the at least one processing unit:
  
  generates one or more feature vectors based upon one or more voice-induced electrical signals that result from the input speech;
  
  generates a first-language phoneme sequence from the one or more feature vectors based upon a first-language acoustic model, wherein the first-language acoustic model corresponds to a first language;
  
  determines a second-language speech segment from the first-language phoneme sequence based upon a second-language lexicon model, wherein the second-language lexicon model corresponds to a second language that is different from the first language;
  
  determines a confidence score associated with a combination of the first-language acoustic model and the second-language lexicon model; and
  
  selects the first-language acoustic model from a plurality of acoustic models based at least in part on the determined confidence score, each of the plurality of acoustic models corresponding to a different respective language.
  - 14. The apparatus of claim 13, wherein the at least one processing unit:
    - selects the first-language acoustic model based at least in part of the determined confidence score being greater than another confidence score associated with a combination of another first-language acoustic model and the second-language lexicon model.
  - 15. The apparatus of claim 13, wherein the confidence score is associated with the second-language speech segment.
  - 16. The apparatus of claim 15, wherein the confidence score is based on a probability assessment that the first-language phoneme sequence generated based upon the first-language acoustic model corresponds to the second-language speech segment determined using the second-language lexicon model.
  - 17. The apparatus of claim 13, wherein the at least one processing unit:
    - generates a speech recognition output for subsequent input speech from the non-native speaker using the first-language acoustic model.
  - 18. The apparatus of claim 13, wherein the first-language acoustic model is associated with a native language of the non-native speaker and the second-language lexicon model is associated with a spoken language of the input speech.
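
The first operation of the processing unit in claim 13, generating feature vectors from a sampled voice signal, might look like the sketch below. The claims do not name a feature type, so the two features used here (log energy and zero-crossing rate per frame), the frame sizes and the function names are assumptions chosen for brevity.

```python
import math
from typing import List, Sequence


def frame_signal(samples: Sequence[float], frame_len: int, hop: int) -> List[Sequence[float]]:
    """Split the signal into overlapping frames; a trailing partial frame is dropped."""
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop)]


def feature_vector(frame: Sequence[float]) -> List[float]:
    """Two assumed features for one frame: log energy and zero-crossing rate."""
    energy = sum(x * x for x in frame) / len(frame)
    log_energy = math.log(energy + 1e-10)  # small offset avoids log(0) on silence
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zero_crossing_rate = crossings / (len(frame) - 1)
    return [log_energy, zero_crossing_rate]


def extract_features(samples: Sequence[float],
                     frame_len: int = 400, hop: int = 160) -> List[List[float]]:
    """One feature vector per frame (25 ms frames, 10 ms hop at 16 kHz)."""
    return [feature_vector(frame) for frame in frame_signal(samples, frame_len, hop)]


# Synthetic stand-in for a voice-induced electrical signal: 0.1 s of a 440 Hz tone.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1600)]
features = extract_features(tone)
print(len(features), features[0])
```

Speech recognizers commonly use richer spectral features; these two were picked only because they need no external libraries.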

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Reich, David E.
Primary Examiner(s): Hudspeth; David R
Assistant Examiner(s): Kovacek; David
Application Number: US10/896,426
Publication Number: US 20060020462A1
Time in Patent Office: 1,986 Days
Field of Search: 704/1-10, 704/231-259, 704/E15.001-E15.05, 379/88.01-88.15, 379/201.01, 379/201.06, 379/907
US Class Current: 704/254
CPC Class Codes:
G10L 15/187   Phonemic context, e.g. pron...
G10L 2015/025   Phonemes, fenemes or fenone...
G10L 2015/227   of the speaker; Human-fact...
