Rapid adaptation of speech models
First Claim
1. A method of generating a source-adapted model for use in speech recognition, the method comprising:
generating a collection of elements from an initial model;
assembling source speech data that corresponds to elements in the collection of elements from a set of source speech data for a particular source associated with the source-adapted model;
generating statistics from the assembled source speech data;
modifying the statistics using an element of the initial model and a smoothing factor that accounts for the relative importance of the element of the initial model and the assembled source speech data;
using the modified statistics in determining a transform that maps between the assembled source speech data and the collection of elements of the initial model; and
producing elements of the source-adapted model from corresponding elements of the initial model by applying the transform to the elements of the initial model;
wherein determining the transform comprises determining a relationship between each element of the initial model in the collection and a portion of the assembled source speech data that corresponds to that element.
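The statistics-modification step recited above can be pictured with a short sketch. The claim does not state a formula, so the count-weighted interpolation used here is an assumption (a common MAP-style choice), and the names `smooth_mean`, `tau`, and the sample values are hypothetical.

```python
from statistics import fmean


def smooth_mean(frames, model_mean, tau):
    """Interpolate between the source-data mean and the initial-model mean.

    tau is the assumed smoothing factor: tau = 0 trusts the source data
    alone, while a large tau keeps the statistic close to the initial model.
    """
    n = len(frames)
    if n == 0:
        # No source data for this element: fall back to the initial model.
        return model_mean
    data_mean = fmean(frames)
    return (n * data_mean + tau * model_mean) / (n + tau)


# Hypothetical 1-D observations assembled for one model element.
frames = [2.0, 4.0]
print(smooth_mean(frames, model_mean=0.0, tau=0.0))  # data only
print(smooth_mean(frames, model_mean=0.0, tau=2.0))  # pulled toward the model
```

Under this assumed form, the weight on the initial model shrinks as more source data is assembled for an element, which is one way a smoothing factor could express the "relative importance" of the two.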
Abstract
A source-adapted model for use in speech recognition is generated by defining a linear relationship between a first element of an initial model and a first element of the source-adapted model. Thereafter, speech data that corresponds to the first element of the initial model is assembled from a set of speech data for a particular source associated with the source-adapted model. A linear transform that maps between the assembled speech data and the first element of the initial model is then determined. Finally, a first element of the source-adapted model is produced from the first element of the initial model using the linear transform.
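As a rough illustration of the linear relationship described in the abstract, the sketch below fits a one-dimensional transform by ordinary least squares and applies it to elements of an initial model. The scalar form, the least-squares criterion, and the names `fit_affine` and `adapt` are assumptions made for illustration; the abstract does not specify how the transform is estimated.

```python
def fit_affine(model_means, data_means):
    """Least-squares fit of data_mean ~ a * model_mean + b in one dimension."""
    n = len(model_means)
    if n < 2 or n != len(data_means):
        raise ValueError("need at least two matched (model, data) pairs")
    mx = sum(model_means) / n
    my = sum(data_means) / n
    sxx = sum((x - mx) ** 2 for x in model_means)
    if sxx == 0:
        raise ValueError("model means must not all be equal")
    sxy = sum((x - mx) * (y - my) for x, y in zip(model_means, data_means))
    a = sxy / sxx
    b = my - a * mx
    return a, b


def adapt(model_mean, a, b):
    """Produce a source-adapted element from an initial-model element."""
    return a * model_mean + b


# Hypothetical values: three elements with source data, one without.
model_means = [0.0, 1.0, 2.0]
data_means = [1.0, 3.0, 5.0]
a, b = fit_affine(model_means, data_means)
print(a, b)
print(adapt(3.0, a, b))  # element with no source data still gets adapted
```

Because the transform is estimated once and then applied to model elements, an element for which the source supplied no speech data can still be adapted, which is the practical appeal of a transform-based approach.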
25 Claims
1. A method of generating a source-adapted model for use in speech recognition, the method comprising:
generating a collection of elements from an initial model;
assembling source speech data that corresponds to elements in the collection of elements from a set of source speech data for a particular source associated with the source-adapted model;
generating statistics from the assembled source speech data;
modifying the statistics using an element of the initial model and a smoothing factor that accounts for the relative importance of the element of the initial model and the assembled source speech data;
using the modified statistics in determining a transform that maps between the assembled source speech data and the collection of elements of the initial model; and
producing elements of the source-adapted model from corresponding elements of the initial model by applying the transform to the elements of the initial model;
wherein determining the transform comprises determining a relationship between each element of the initial model in the collection and a portion of the assembled source speech data that corresponds to that element.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 23
11. A method of generating a source-adapted model for use in speech recognition, the method comprising:
generating a collection of elements from an initial model;
assembling speech data that corresponds to elements in the collection of elements from a set of speech data for a particular source associated with the source-adapted model;
generating statistics from the assembled source speech data;
modifying the statistics using an element of the initial model and a smoothing factor that accounts for an amount of source speech data available for a particular element;
determining a transform that maps between the assembled speech data and the collection of elements of the initial model using the modified statistics; and
producing elements of the source-adapted model from corresponding elements of the initial model by applying the transform to the elements of the initial model.
Dependent claims: 12, 24
13. A method of generating a source-adapted model for use in speech recognition, the method comprising:
generating a collection of elements from the initial model;
assembling speech data that corresponds to elements in the collection of elements from a set of speech data for a particular source associated with the source-adapted model;
generating statistics from the assembled source speech data;
modifying the statistics using an element of the initial model and a smoothing factor that controls the relative importance of the element of the initial model and the assembled source speech data;
using the modified statistics in determining a transform that maps between the assembled speech data and the collection of elements of the initial model; and
producing elements of the source-adapted model from corresponding elements of the initial model by applying the transform to the elements of the initial model;
wherein a human operator identifies classes of related elements and the collection of elements is generated using an automatic procedure that favors including elements from a common class in the collection and penalizes including elements from different classes in the collection.
Dependent claims: 25
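The class-aware grouping recited in claim 13 could be sketched as a merge cost that adds a penalty whenever two elements come from different operator-assigned classes. The claim does not define the procedure; the distance measure, the additive penalty, the class labels, and the names `merge_cost` and `best_merge` below are all hypothetical.

```python
from itertools import combinations


def merge_cost(elem_a, elem_b, penalty):
    """Distance between element means, plus a penalty across classes."""
    distance = abs(elem_a["mean"] - elem_b["mean"])
    if elem_a["cls"] != elem_b["cls"]:
        distance += penalty
    return distance


def best_merge(elements, penalty):
    """Return the names of the pair of elements that is cheapest to group."""
    pair = min(
        combinations(sorted(elements), 2),
        key=lambda p: merge_cost(elements[p[0]], elements[p[1]], penalty),
    )
    return pair


# Hypothetical elements; "cls" is the class a human operator assigned.
elements = {
    "a": {"mean": 0.0, "cls": "vowel"},
    "b": {"mean": 0.1, "cls": "consonant"},
    "c": {"mean": 0.3, "cls": "vowel"},
}
print(best_merge(elements, penalty=0.0))  # closest pair regardless of class
print(best_merge(elements, penalty=1.0))  # same-class pair is now favored
```

With no penalty the two acoustically closest elements are grouped even though their classes differ; with the penalty applied, the same-class pair becomes cheaper, which is the favoring and penalizing behavior the claim describes.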
14. A method of training a speech model for use in speech recognition, the method comprising:
assembling sets of speech data that correspond to elements of an initial model from a set of speech data for one or more sources, the assembled speech data including multiple items;
calculating representative values for each set of speech data;
modifying a representative value for a set of speech data using a corresponding element of the initial model and a smoothing factor that controls the relative importance of the corresponding element of the initial model and the set of speech data;
determining a relationship between the representative values for each set of speech data and values of the corresponding element of the initial model, where a relationship for a first set of speech data differs from a relationship for a second set of speech data;
modifying each item of the sets of speech data using the relationship for the set to which the item belongs; and
generating elements of the speech model using the modified sets of speech data.
Dependent claims: 15, 16, 17, 18, 19, 20
21. A method of training a speech model for use in speech recognition, the method comprising:
assembling speech data that corresponds to a first element of the speech model from a set of speech data for a first source;
determining a transform that maps between the assembled speech data and the first element of the speech model;
generating an inverse of the transform;
modifying the assembled speech data using the inverse of the transform; and
updating the first element of the speech model using the modified assembled speech data,
wherein determining the transform comprises modifying the transform using the first element of the speech model and a smoothing factor that controls the relative importance of the first element of the speech model and the assembled speech data.
Dependent claims: 22
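The training loop of claim 21 can be sketched with the simplest possible transform, a one-dimensional offset. This is an illustrative assumption only: the claim does not restrict the transform to an offset, and the shrinkage form of the smoothing, along with the names `estimate_offset`, `update_element`, and `tau`, is hypothetical.

```python
def estimate_offset(frames, model_mean, tau):
    """Smoothed offset b such that source data ~ model_mean + b.

    tau is the assumed smoothing factor: a larger tau shrinks the offset
    toward zero, keeping the transform closer to the identity.
    """
    n = len(frames)
    if n == 0:
        raise ValueError("no speech data assembled for this element")
    data_mean = sum(frames) / n
    return (n / (n + tau)) * (data_mean - model_mean)


def update_element(frames, model_mean, tau):
    """Apply the inverse transform to the data and re-estimate the element."""
    b = estimate_offset(frames, model_mean, tau)
    normalized = [x - b for x in frames]  # inverse of x = m + b
    return sum(normalized) / len(normalized)


# Hypothetical data for one element from one source.
frames = [4.0, 6.0]
print(update_element(frames, model_mean=1.0, tau=2.0))
print(update_element(frames, model_mean=1.0, tau=0.0))
```

In this single-source, single-element toy case, `tau = 0` removes the whole source offset and the element is left unchanged, while a positive `tau` lets part of the source-specific shift through. A realistic system would presumably pool normalized data across many sources before updating the element.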
Specification