Recognition of Speech With Different Accents

US 20140129218A1
Filed: 11/06/2012
Published: 05/08/2014
Est. Priority Date: 06/06/2012
Status: Active Grant

5 Assignments

0 Petitions

Abstract

Computer-based speech recognition can be improved by recognizing words with an accurate accent model. In order to provide a large number of possible accents, while providing real-time speech recognition, a language tree data structure of possible accents is provided in one embodiment such that a computerized speech recognition system can benefit from choosing among accent categories when searching for an appropriate accent model for speech recognition.
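The tree of accents mentioned in the abstract could be sketched roughly as follows. This is only an illustration: the `AccentNode` class, the helper functions, and the individual accent names are assumptions made for the example and do not come from the patent. The Spanish and French categories are borrowed from claim 20.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AccentNode:
    """A node in a hypothetical accent tree: a category or a leaf accent."""
    name: str
    children: List["AccentNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def leaf_accents(node: AccentNode) -> List[str]:
    """Collect every leaf accent name below a node, depth-first."""
    if node.is_leaf():
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(leaf_accents(child))
    return names


def accents_in_category(tree: AccentNode, category: str) -> List[str]:
    """Return the leaf accents under the named top-level category."""
    for child in tree.children:
        if child.name == category:
            return leaf_accents(child)
    raise KeyError(category)


# Illustrative tree only; the leaf accent names are invented for this example.
ACCENT_TREE = AccentNode("root", [
    AccentNode("Spanish", [AccentNode("Spanish/Mexico"), AccentNode("Spanish/Spain")]),
    AccentNode("French", [AccentNode("French/France"), AccentNode("French/Quebec")]),
])
```

Choosing a category first means only the accent models under one branch need to be evaluated, rather than every leaf in the tree.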

44 Citations

20 Claims

1. A method for recognizing speech, comprising:
- loading a digital representation of a first human utterance;
  
  processing the digital first utterance with a first accent category model;
  
  processing the digital first utterance with a second accent category model;
  
  selecting a category of accents based on results from the processing the first accent category model and the processing the second accent category model;
  
  selecting a plurality of accent models belonging to the selected category of accents;
  
  loading a digital representation of a second human utterance;
  
  processing the digital second utterance with each of the selected plurality of accent models; and
  
  fusing the results of the processing the digital second utterance to produce a recognition output.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10):
  - 2. The method for recognizing speech of claim 1, wherein the selecting the plurality of accent models comprises traversing a tree data structure of accents.
  - 3. The method for recognizing speech of claim 1, wherein the processing the digital first utterance with a first accent category model is based on an accent dictionary.
  - 4. The method for recognizing speech of claim 1, wherein the processing the digital first utterance with a first accent category model is based on a language model.
  - 5. The method for recognizing speech of claim 1, wherein the processing the digital second utterance with each of the selected plurality of accent models is performed via parallel processing.
  - 6. The method for recognizing speech of claim 1, wherein the processing the digital first utterance with the first and second accent category models is performed in parallel, and wherein the fusing comprises selecting a highest scored result from the results from the processing the first and second accent category models.
  - 7. The method for recognizing speech of claim 1, wherein the processing the digital first utterance with the first and second accent category models is performed in parallel, and wherein the fusing comprises selecting a highest scored complementary result from the results from the processing the first and second accent category models.
  - 8. The method for recognizing speech of claim 1, wherein the fusing further comprises storing an identified accent to a non-transitory computer readable storage medium.
  - 9. The method for recognizing speech of claim 1, wherein the loading the digital representation of the first human utterance comprises receiving the first human utterance from a voice control application.
  - 10. The method for recognizing speech of claim 1, further comprising a second method of speech recognition.
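The two-stage flow of claim 1, with parallel processing as in claim 5 and a highest-score fusion rule in the spirit of claim 6, might look roughly like the sketch below. It is a toy under stated assumptions: a "model" is any callable returning a `(hypothesis, score)` pair, utterances are plain strings rather than audio, and `make_model`, `select_category`, `recognize`, and `two_stage` are hypothetical names that do not appear in the patent.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Assumption: a model maps an utterance to (hypothesis, score).
# Real acoustic and language models are replaced by toy stand-ins.
Model = Callable[[str], Tuple[str, float]]


def make_model(hypothesis: str, keyword: str) -> Model:
    """Toy model: scores 0.9 if the keyword occurs in the utterance, else 0.1."""
    def model(utterance: str) -> Tuple[str, float]:
        return hypothesis, (0.9 if keyword in utterance else 0.1)
    return model


def select_category(utterance: str, category_models: Dict[str, Model]) -> str:
    """Stage 1: score the first utterance with each category model, keep the best."""
    scores = {name: model(utterance)[1] for name, model in category_models.items()}
    return max(scores, key=lambda name: scores[name])


def recognize(utterance: str, accent_models: List[Model]) -> Tuple[str, float]:
    """Stage 2: run each selected accent model in parallel, fuse by highest score."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda m: m(utterance), accent_models))
    return max(results, key=lambda r: r[1])


def two_stage(first: str, second: str,
              category_models: Dict[str, Model],
              accents_by_category: Dict[str, List[Model]]) -> Tuple[str, Tuple[str, float]]:
    """Pick a category from the first utterance, then recognize the second."""
    category = select_category(first, category_models)
    return category, recognize(second, accents_by_category[category])
```

Taking the maximum score is only one possible fusion rule. Claim 7 refers to selecting a highest scored complementary result, which this sketch does not attempt to model.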

11. An apparatus for speech processing, comprising:
- a first comparison module configured to determine a selected accent category based on whether a first accent category model or a second accent category model is a better match for a first human sound to be captured from an audio transducer; and
  
  a second comparison module configured to determine which accent model of a plurality of accent models is a best match for a second human sound to be captured from the audio transducer, wherein the plurality of accent models is associated with the selected accent category.
- Dependent claims (12, 13, 14, 15):
  - 12. The apparatus for speech processing of claim 11, wherein the first comparison module is configured to access the first accent category model and the second accent category model from a tree data structure.
  - 13. The apparatus for speech processing of claim 11, further comprising an accent dictionary.
  - 14. The apparatus for speech processing of claim 11, further comprising a language model.
  - 15. The apparatus for speech processing of claim 11, wherein at least one of the first or second comparison modules includes parallel processors.

16. A non-transitory computer readable storage medium, comprising:
- instructions for a processor to process a first accent category model and a second accent category model;
  
  conditional instructions to process a first plurality of accent models based on a result of the first accent category model;
  
  wherein accents represented in the first plurality of accent models are within a category represented by the first accent category model.
- Dependent claims (17, 18, 19, 20):
  - 17. The non-transitory computer readable storage medium of claim 16, wherein at least one of the first accent category model or the second accent category model are represented in a tree data structure.
  - 18. The non-transitory computer readable storage medium of claim 16, further comprising an accent dictionary.
  - 19. The non-transitory computer readable storage medium of claim 16, further comprising a language model.
  - 20. The non-transitory computer readable storage medium of claim 16, wherein at least one of the first or second accent category models comprise Spanish, French, or both.

Current Assignee
Monterey Research, LLC (Vector Capital Corporation)
Original Assignee
Spansion LLC (Infineon Technologies AG)
Inventors
Fastow, Richard; Liu, Chen

Granted Patent

US 9,009,049 B2
Field of Search
US Class Current

704/231
CPC Class Codes

G10L 15/005   Language recognition

G10L 15/187   Phonemic context, e.g. pron...

G10L 15/32   Multiple recognisers used i...
