SYSTEM AND METHOD FOR RAPID CUSTOMIZATION OF SPEECH RECOGNITION MODELS

US 20120253799A1
Filed: 03/28/2011
Published: 10/04/2012
Est. Priority Date: 03/28/2011
Status: Active Grant

Abstract

Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating domain-specific speech recognition models for a domain of interest by combining and tuning existing speech recognition models when a speech recognizer does not have access to a speech recognition model for that domain of interest and when available domain-specific data is below a minimum desired threshold to create a new domain-specific speech recognition model. A system configured to practice the method identifies a speech recognition domain and combines a set of speech recognition models, each speech recognition model of the set of speech recognition models being from a respective speech recognition domain. The system receives an amount of data specific to the speech recognition domain, wherein the amount of data is less than a minimum threshold to create a new domain-specific model, and tunes the combined speech recognition model for the speech recognition domain based on the data.
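
The combine-then-tune flow in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the method disclosed in the specification: it assumes the models are unigram language models, that combining means linear interpolation, and that tuning means re-estimating the interpolation weights with EM on the small in-domain sample. The names `train_unigram`, `tune_weights`, `combine`, and `MIN_TOKENS_FOR_NEW_MODEL`, along with the toy corpora, are hypothetical.

```python
from collections import Counter

# Hypothetical threshold: below this many tokens we assume there is too
# little data to train a new domain-specific model from scratch.
MIN_TOKENS_FOR_NEW_MODEL = 10_000


def train_unigram(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram model over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def tune_weights(models, domain_tokens, iterations=20):
    """EM re-estimation of interpolation weights on the in-domain sample."""
    k = len(models)
    weights = [1.0 / k] * k
    for _ in range(iterations):
        expected = [0.0] * k
        for w in domain_tokens:
            # Posterior responsibility of each source model for this token.
            parts = [lam * m[w] for lam, m in zip(weights, models)]
            z = sum(parts)
            for i, p in enumerate(parts):
                expected[i] += p / z
        weights = [e / len(domain_tokens) for e in expected]
    return weights


def combine(models, weights, vocab):
    """Linear interpolation of the source models with the given weights."""
    return {w: sum(lam * m[w] for lam, m in zip(weights, models)) for w in vocab}


# Toy corpora standing in for existing models from three other domains.
finance = "stock market price share bond stock price".split()
travel = "flight hotel ticket airport flight hotel".split()
sports = "game team score player game team".split()

# Small sample from the new domain of interest (e.g. travel pricing).
domain_data = "flight price ticket price hotel".split()

vocab = sorted(set(finance + travel + sports + domain_data))
models = [train_unigram(t, vocab) for t in (finance, travel, sports)]

# Only take the combine-and-tune path when the data is below the threshold.
assert len(domain_data) < MIN_TOKENS_FOR_NEW_MODEL
weights = tune_weights(models, domain_data)
combined = combine(models, weights, vocab)
```

In this toy setup the tuned weights shift toward the finance and travel models and away from the sports model, because the in-domain sample contains no sports vocabulary.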

20 Claims

1. A method of generating a domain-specific speech recognition model, the method comprising:
- identifying a speech recognition domain;
- combining a plurality of speech recognition models to yield a combined speech recognition model, each speech recognition model of the plurality of speech recognition models being from a respective speech recognition domain;
- receiving an amount of data specific to the speech recognition domain, wherein the amount of data is less than a minimum threshold to create a new domain-specific model; and
- tuning the combined speech recognition model for the speech recognition domain based on the amount of data.

2. The method of claim 1, wherein each respective speech recognition domain comprises at least one of business, finance, travel, medical, sports, news, politics, entertainment, and education.
3. The method of claim 1, wherein tuning the combined speech recognition model is performed in a cloud computing environment.
4. The method of claim 1, wherein tuning the combined speech recognition model is performed on-demand in response to a request.
5. The method of claim 1, wherein the plurality of speech recognition models comprises at least two speech recognition models from different domains.
6. The method of claim 1, wherein the combined speech recognition model and at least one of the plurality of speech recognition models are from different domains.
7. The method of claim 1, wherein the amount of data comprises at least one of text, speech, transition data, metadata, and audio.
8. The method of claim 1, wherein the speech recognition domain is specific to a particular user.
9. The method of claim 1, wherein tuning the combined speech recognition model further comprises sampling the amount of data.
10. The method of claim 1, further comprising recognizing speech using the combined speech recognition model.

11. A system for recognizing speech, the system comprising:
- a processor;
- a first module configured to control the processor to identify a speech recognition domain;
- a second module configured to control the processor to combine a plurality of speech recognition models to yield a combined speech recognition model, each speech recognition model of the plurality of speech recognition models being from a respective speech recognition domain;
- a third module configured to control the processor to receive an amount of data specific to the speech recognition domain, wherein the amount of data is less than a minimum threshold to create a new domain-specific model;
- a fourth module configured to control the processor to tune the combined speech recognition model for the speech recognition domain based on the amount of data; and
- a fifth module configured to control the processor to recognize speech using the combined speech recognition model.

12. The system of claim 11, wherein the fourth module is further configured to control the processor to tune the combined speech recognition model on-demand in response to a request.
13. The system of claim 11, wherein the plurality of speech recognition models comprises at least two speech recognition models from different domains.
14. The system of claim 11, wherein the combined speech recognition model and at least one of the plurality of speech recognition models are from different domains.
15. The system of claim 11, wherein the amount of data comprises at least one of text, speech, transition data, metadata, and audio.

16. A non-transitory computer-readable storage medium storing instructions which, when executed by a computing device, cause the computing device to generate a speech recognition model for a specific recognition domain, the instructions comprising:
- combining a plurality of speech recognition models to yield a combined speech recognition model, each speech recognition model of the plurality of speech recognition models being from a respective speech recognition domain;
- receiving an amount of data specific to a speech recognition domain, wherein the amount of data is less than a minimum threshold to create a new domain-specific model; and
- tuning the combined speech recognition model for the speech recognition domain based on the amount of data.

17. The non-transitory computer-readable storage medium of claim 16, wherein combining the plurality of speech recognition models is performed at at least one of a core n-gram level and a sentence level.
18. The non-transitory computer-readable storage medium of claim 16, wherein tuning the combined speech recognition model further comprises sampling the amount of data.
19. The non-transitory computer-readable storage medium of claim 16, further comprising recognizing speech using the combined speech recognition model.
20. The non-transitory computer-readable storage medium of claim 16, wherein the speech recognition domain is specific to a particular user.
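
Claim 17 distinguishes combining at an n-gram level from combining at a sentence level. The claims do not define either term, so the sketch below shows one possible reading, stated as an assumption: at the n-gram level the models are mixed word by word before the sentence score is formed, while at the sentence level each model scores the whole sentence first and the sentence scores are then mixed. Unigram models are used for brevity, and the function names, probabilities, and weights are hypothetical.

```python
import math


def ngram_level_prob(sentence, models, weights):
    """Mix the models word by word, then multiply across the sentence."""
    return math.prod(
        sum(lam * m[w] for lam, m in zip(weights, models)) for w in sentence
    )


def sentence_level_prob(sentence, models, weights):
    """Score the whole sentence under each model, then mix the sentence scores."""
    return sum(
        lam * math.prod(m[w] for w in sentence) for lam, m in zip(weights, models)
    )


# Two toy unigram models from different domains (illustrative values only).
model_a = {"book": 0.6, "flight": 0.1, "stock": 0.3}
model_b = {"book": 0.2, "flight": 0.7, "stock": 0.1}
weights = [0.5, 0.5]
sentence = ["book", "flight"]

p_ngram = ngram_level_prob(sentence, [model_a, model_b], weights)
p_sentence = sentence_level_prob(sentence, [model_a, model_b], weights)
```

With these values the two strategies give different sentence probabilities (about 0.16 and 0.10), so under this reading the level at which models are combined affects recognition scores and is not only an implementation detail.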

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Bangalore, Srinivas; Bell, Robert; Caseiro, Diamantino Antonio; Gilbert, Mazin; Haffner, Patrick

Granted Patent: US 9,679,561 B2
US Class Current: 704/231
CPC Class Codes

G10L 15/06   Creation of reference templ...

G10L 15/065   Adaptation

G10L 15/183   using context dependencies,...

G10L 15/22   Procedures used during a sp...

G10L 15/30   Distributed recognition, e....

G10L 2015/0635   updating or merging of old ...

G10L 2015/0636   Threshold criteria for the ...

G10L 2015/228   of application context
