Speaker adaptation system and method based on class-specific pre-clustering training speakers
Abstract
A method of speech recognition, in accordance with the present invention, includes the steps of grouping acoustics to form classes based on acoustic features; clustering training speakers by the classes to provide class-specific cluster systems; selecting, from the cluster systems, a subset of cluster systems closest to adaptation data from a test speaker; transforming the subset of cluster systems, based on the adaptation data, to bring them closer to the test speaker and form adapted cluster systems; and combining the adapted cluster systems to create a speaker adapted system for decoding speech from the test speaker. Systems and methods for building speech recognition systems, as well as for adapting speaker systems for class-specific speaker clusters, are also included.
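The select, transform and combine steps of the abstract can be illustrated with a small sketch. This is not the patented implementation. It assumes, purely for illustration, that each cluster system is reduced to one mean vector per acoustic class, that closeness is Euclidean distance to the mean of the adaptation frames, that the transformation is a simple interpolation toward that mean, and that combination is an equal-weight average. All function and variable names are hypothetical.

```python
from math import dist


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def select_closest(cluster_means, target, n_select):
    """Return the n_select cluster means nearest to target (Euclidean)."""
    return sorted(cluster_means, key=lambda m: dist(m, target))[:n_select]


def transform(cluster_mean, target, rate=0.5):
    """Move a cluster mean a fraction `rate` of the way toward target.

    Assumption: a stand-in for whatever transformation the system uses.
    """
    return [c + rate * (t - c) for c, t in zip(cluster_mean, target)]


def adapt(class_clusters, adaptation_data, n_select=2, rate=0.5):
    """Build one adapted mean per class from class-specific cluster means."""
    adapted = {}
    for cls, cluster_means in class_clusters.items():
        frames = adaptation_data.get(cls)
        if not frames:
            # No adaptation data for this class: use the plain cluster average.
            adapted[cls] = mean(cluster_means)
            continue
        target = mean(frames)
        subset = select_closest(cluster_means, target, n_select)  # select
        moved = [transform(m, target, rate) for m in subset]      # transform
        adapted[cls] = mean(moved)                                # combine
    return adapted


# Toy data: two acoustic classes, adaptation frames only for "vowel".
class_clusters = {
    "vowel": [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]],
    "fricative": [[1.0, 1.0], [3.0, 3.0]],
}
adaptation_data = {"vowel": [[1.0, 0.0], [1.0, 0.0]]}
adapted = adapt(class_clusters, adaptation_data)
```

Because selection runs separately for each class, the subset chosen for one class need not come from the same training speakers as the subset chosen for another, which is the point of class-specific clustering.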
42 Claims
1. A method of speech recognition comprising the steps of:
grouping acoustics to form classes based on acoustic features;
clustering training speakers by the classes to provide class-specific cluster systems;
selecting from the cluster systems, a subset of cluster systems closest to adaptation data from a speaker;
transforming the subset of cluster systems to bring the subset of cluster systems closer to the speaker based on the adaptation data to form adapted cluster systems; and
combining the adapted cluster systems to create a speaker adapted system for decoding speech from the speaker.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A method of building class-specific cluster systems comprising the steps of:
providing a speaker dependent system for each of a plurality of training speakers;
partitioning an acoustic space according to classes, each class being characterized by a set of acoustic features;
grouping the speaker dependent systems with the acoustic spaces according to classes to build acoustic spaces with common features from all the speaker dependent systems; and
clustering the grouped acoustic spaces with common features to form cluster systems based on acoustic characteristics of the speakers, the acoustic characteristics including class-specific characteristics.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17, 18
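The building steps of claim 9 can be sketched as follows. This is an illustrative reading, not the claimed implementation. It assumes each speaker dependent system is summarized by one mean vector per acoustic class, and it uses plain k-means with a deterministic start as a stand-in for whatever clustering the system actually applies. The names `group_by_class`, `kmeans` and `build_cluster_systems` are hypothetical.

```python
from math import dist


def group_by_class(speaker_systems):
    """Regroup {speaker: {class: vector}} into {class: {speaker: vector}}."""
    grouped = {}
    for speaker, per_class in speaker_systems.items():
        for cls, vec in per_class.items():
            grouped.setdefault(cls, {})[speaker] = vec
    return grouped


def nearest(vec, centroids):
    """Index of the centroid closest to vec (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: dist(vec, centroids[j]))


def kmeans(vectors, k, iterations=10):
    """Plain k-means; the first k vectors serve as initial centroids."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        assignment = [nearest(v, centroids) for v in vectors]
        for j in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, [nearest(v, centroids) for v in vectors]


def build_cluster_systems(speaker_systems, k=2):
    """Cluster the training speakers separately within each acoustic class."""
    systems = {}
    for cls, by_speaker in group_by_class(speaker_systems).items():
        speakers = sorted(by_speaker)
        vectors = [by_speaker[s] for s in speakers]
        centroids, assignment = kmeans(vectors, min(k, len(vectors)))
        systems[cls] = [
            {"centroid": c,
             "speakers": [s for s, a in zip(speakers, assignment) if a == j]}
            for j, c in enumerate(centroids)
        ]
    return systems


# Toy data: three training speakers, two acoustic classes.
speaker_systems = {
    "spk_a": {"vowel": [0.0, 0.0], "fricative": [5.0, 5.0]},
    "spk_b": {"vowel": [0.0, 1.0], "fricative": [0.0, 0.0]},
    "spk_c": {"vowel": [10.0, 10.0], "fricative": [5.0, 6.0]},
}
systems = build_cluster_systems(speaker_systems, k=2)
```

In this toy data, spk_a clusters with spk_b for vowels but with spk_c for fricatives, so the speaker groupings differ from class to class.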
19. A method of speech recognition comprising the steps of:
providing a speaker dependent system for each of a plurality of training speakers;
providing an acoustic space for each of the training speakers, each acoustic space being characterized by a set of acoustic features;
grouping the speaker dependent systems with the acoustic spaces to build acoustic spaces with common features from all the speaker dependent systems;
clustering the grouped acoustic spaces to form cluster systems based on a common acoustic characteristic;
selecting from a group of cluster systems, a subset of cluster systems closest to adaptation data from a speaker;
transforming the subset of cluster systems to bring the subset of cluster systems closer to the speaker based on the adaptation data to form adapted cluster systems; and
combining the adapted cluster systems to create a speaker adapted system for decoding speech from the speaker.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27, 28, 29
30. A system for speech recognition comprising:
means for grouping acoustics to form classes based on acoustic features;
means for clustering training speakers by the classes to provide class-specific cluster systems;
means for selecting from the cluster systems, a subset of cluster systems closest to adaptation data from a speaker;
means for transforming the subset of cluster systems to bring the subset of cluster systems closer to the speaker based on the adaptation data to form adapted cluster systems; and
means for combining the adapted cluster systems to create a speaker adapted system for decoding speech from the speaker.
Dependent claims: 31, 32, 33, 34, 35
36. A system for speech recognition comprising:
a speaker dependent system for each of a plurality of training speakers;
an acoustic space for each of the training speakers, each acoustic space being characterized by a set of acoustic features;
means for grouping the speaker dependent systems with the acoustic spaces to build acoustic spaces with common features from all the speaker dependent systems;
means for clustering the grouped acoustic spaces to form cluster systems based on a common acoustic characteristic;
means for selecting from a group of cluster systems, a subset of cluster systems closest to adaptation data from a speaker;
means for transforming the subset of cluster systems to bring the subset of cluster systems closer to the speaker based on the adaptation data to form adapted cluster systems; and
means for combining the adapted cluster systems to create a speaker adapted system for decoding speech from the speaker.
Dependent claims: 37, 38, 39, 40
41. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for speech recognition, the method steps comprising:
grouping acoustics to form classes based on acoustic features;
clustering training speakers by the classes to provide class-specific cluster systems;
selecting from the cluster systems, a subset of cluster systems closest to adaptation data from a speaker;
transforming the subset of cluster systems to bring the subset of cluster systems closer to the speaker based on the adaptation data to form adapted cluster systems; and
combining the adapted cluster systems to create a speaker adapted system for decoding speech from the speaker.
42. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for building class-specific cluster systems for speech recognition systems, the method steps comprising:
providing a speaker dependent system for each of a plurality of training speakers;
partitioning an acoustic space according to classes, each class being characterized by a set of acoustic features;
grouping the speaker dependent systems with the acoustic spaces according to classes to build acoustic spaces with common features from all the speaker dependent systems; and
clustering the grouped acoustic spaces with common features to form cluster systems based on acoustic characteristics of the speakers, the acoustic characteristics including class-specific characteristics.
Specification