System and method for rapid customization of speech recognition models
Abstract
Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating domain-specific speech recognition models for a domain of interest by combining and tuning existing speech recognition models when a speech recognizer does not have access to a speech recognition model for that domain of interest and when available domain-specific data is below a minimum desired threshold to create a new domain-specific speech recognition model. A system configured to practice the method identifies a speech recognition domain and combines a set of speech recognition models, each speech recognition model of the set of speech recognition models being from a respective speech recognition domain. The system receives an amount of data specific to the speech recognition domain, wherein the amount of data is less than a minimum threshold to create a new domain-specific model, and tunes the combined speech recognition model for the speech recognition domain based on the data.
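One way to picture the "combine and tune" step in the abstract is linear interpolation of per-domain models, with the interpolation weights re-estimated from the small in-domain sample. The sketch below is only an illustration under stated assumptions: the source does not say which tuning procedure is used, so the EM update, the unigram representation, and the names `em_interpolation_weights`, `combined_prob`, and `floor` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: each domain model is a unigram table word -> probability.
Unigram = Dict[str, float]


def em_interpolation_weights(
    models: List[Unigram],
    sample: List[str],
    iters: int = 50,
    floor: float = 1e-9,
) -> List[float]:
    """Estimate one mixture weight per domain model from a small sample.

    Assumption: standard EM for linear interpolation; the source only says
    the combined model is "tuned" on the limited domain data.
    """
    k = len(models)
    weights = [1.0 / k] * k
    if not sample:
        # No in-domain data: keep the uniform combination.
        return weights
    for _ in range(iters):
        counts = [0.0] * k
        for word in sample:
            # E-step: how much of this word each model is responsible for.
            probs = [w * m.get(word, floor) for w, m in zip(weights, models)]
            total = sum(probs)
            for i, p in enumerate(probs):
                counts[i] += p / total
        # M-step: new weights are the average responsibilities.
        weights = [c / len(sample) for c in counts]
    return weights


def combined_prob(
    models: List[Unigram], weights: List[float], word: str, floor: float = 1e-9
) -> float:
    """Probability of a word under the weighted combination of models."""
    return sum(w * m.get(word, floor) for w, m in zip(weights, models))


travel = {"flight": 0.5, "hotel": 0.5}
banking = {"account": 0.5, "loan": 0.5}
sample = ["flight", "hotel", "flight", "account"]
weights = em_interpolation_weights([travel, banking], sample)
```

With three of the four sample words covered only by the travel table, the estimated weights lean toward the travel model. A real recognizer would apply the same idea to full acoustic or language models rather than unigram tables.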
16 Claims
1. A method comprising:
generating a multi-domain speech recognition model comprising a combination of speech recognition models selected based on a speech pattern of a user, wherein each speech recognition model from the combination of speech recognition models is associated with a respective speech recognition domain;
receiving sample data associated with a specific speech recognition domain;
when the sample data associated with the specific speech recognition domain is more than a threshold, generating a new domain-specific speech recognition model for the specific speech recognition domain; and
when the sample data is less than the threshold, modifying the multi-domain speech recognition model by weighting components of the multi-domain speech recognition model associated with the specific speech recognition domain to recognize speech associated with the user or additional speech from the user.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A system comprising:
a processor; and
a computer-readable storage medium having instructions stored which, when executed by the processor, result in the processor performing operations comprising:
generating a multi-domain speech recognition model comprising a combination of speech recognition models selected based on a speech pattern of a user, wherein each speech recognition model from the combination of speech recognition models is associated with a respective speech recognition domain;
receiving sample data associated with a specific speech recognition domain;
when the sample data associated with the specific speech recognition domain is more than a threshold, generating a new domain-specific speech recognition model for the specific speech recognition domain; and
when the sample data is less than the threshold, modifying the multi-domain speech recognition model by weighting components of the multi-domain speech recognition model associated with the specific speech recognition domain to recognize speech associated with the user or additional speech from the user.
Dependent claims: 14, 15
16. A computer-readable storage device storing instructions which, when executed by a computing device, cause the computing device to perform operations comprising:
generating a multi-domain speech recognition model comprising a combination of speech recognition models selected based on a speech pattern of a user, wherein each speech recognition model from the combination of speech recognition models is associated with a respective speech recognition domain;
receiving sample data associated with a specific speech recognition domain;
when the sample data associated with the specific speech recognition domain is more than a threshold, generating a new domain-specific speech recognition model for the specific speech recognition domain;
when the sample data is less than the threshold, modifying the multi-domain speech recognition model by weighting components of the multi-domain speech recognition model associated with the specific speech recognition domain to recognize speech associated with the user or additional speech from the user.
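The threshold branch recited in the independent claims can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes sample size is measured in tokens, represents the multi-domain model as a hypothetical table of per-domain weights, and uses an invented `boost` factor for the reweighting step. The claims are silent on the case where the sample size equals the threshold, so the sketch leaves the model unchanged there.

```python
from collections import Counter
from typing import Dict, List, Tuple

Weights = Dict[str, float]   # hypothetical: domain name -> mixture weight
Unigram = Dict[str, float]   # hypothetical: word -> probability


def train_unigram(sample: List[str]) -> Unigram:
    """Stand-in for training a new domain-specific model from ample data."""
    counts = Counter(sample)
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}


def reweight(weights: Weights, domain: str, boost: float) -> Weights:
    """Scale up the component tied to the target domain, then renormalize."""
    scaled = {d: w * (boost if d == domain else 1.0) for d, w in weights.items()}
    total = sum(scaled.values())
    return {d: w / total for d, w in scaled.items()}


def adapt(
    weights: Weights,
    domain: str,
    sample: List[str],
    threshold: int,
    boost: float = 2.0,  # assumption: the source gives no numeric factor
) -> Tuple[str, Dict[str, float]]:
    """Choose between building a new model and reweighting the existing one."""
    if len(sample) > threshold:
        # Enough data: generate a new domain-specific model.
        return "new_model", train_unigram(sample)
    if len(sample) < threshold:
        # Too little data: modify the multi-domain model by weighting.
        return "reweighted", reweight(weights, domain, boost)
    # Equal to the threshold: not covered by the claims, so do nothing.
    return "unchanged", dict(weights)


base = {"travel": 0.5, "banking": 0.5}
small = adapt(base, "travel", ["flight"], threshold=3)
large = adapt(base, "travel", ["a", "a", "b", "c"], threshold=3)
```

Here `small` takes the reweighting path and `large` takes the new-model path. In practice the threshold would depend on how much data a given model type needs to train reliably.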
Specification