Method and apparatus for probabilistic recognition using small number of state clusters
Abstract
Probabilistic recognition using clusters and simple probability functions provides improved performance by employing a limited number of clusters each using a relatively large number of simple probability functions. The simple probability functions for each of the limited number of state clusters are greater in number than the limited number of state clusters.
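The relationship described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: it assumes the "simple probability functions" are diagonal-covariance Gaussians and that each state keeps its own mixture weights over the Gaussians of its cluster. The names `Gaussian`, `StateCluster`, `satisfies_claimed_ratio`, and `state_log_likelihood` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian:
    """Diagonal-covariance Gaussian (assumed form of a 'simple probability function')."""
    mean: list
    var: list

    def log_pdf(self, x):
        # Sum of per-dimension log densities.
        return sum(
            -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, self.mean, self.var)
        )


@dataclass
class StateCluster:
    """A cluster of states that share one pool of Gaussians."""
    gaussians: list


def satisfies_claimed_ratio(clusters):
    """True if every cluster holds more Gaussians than there are clusters."""
    n_clusters = len(clusters)
    return all(len(c.gaussians) > n_clusters for c in clusters)


def state_log_likelihood(x, cluster, weights):
    """log sum_k w_k N(x; mu_k, var_k), with state-specific weights (an assumption)."""
    terms = [
        math.log(w) + g.log_pdf(x)
        for w, g in zip(weights, cluster.gaussians)
        if w > 0
    ]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))
```

With two clusters of three Gaussians each, `satisfies_claimed_ratio` returns `True`; with three clusters of three Gaussians each, it returns `False`.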
27 Citations
18 Claims
1. In a speech recognition system using a method for recognizing human speech, the method comprising the steps of:
selecting a model to represent a selected subunit of speech, the model having associated with it a plurality of states;
determining states that may be represented by a set of simple probability functions; and
clustering said states that may be represented by a set of simple probability functions into a limited number of clusters, wherein said simple probability functions for each of said limited number of state clusters is greater in number than said limited number of state clusters.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
caching log-likelihoods for the simple probability functions in a mixture as soon as they are computed for a frame so that if the same mixture needs to be evaluated at that frame for another triphone state, the cache is used.
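The caching step above can be sketched as follows. This is an illustration under stated assumptions rather than the patent's own code: Gaussians are taken to be diagonal-covariance, each triphone state is assumed to reference its cluster's Gaussians by id with its own weights, and the names `FrameCache`, `diag_log_pdf`, and `score_state` are hypothetical.

```python
import math


def diag_log_pdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


class FrameCache:
    """Caches per-Gaussian log-likelihoods for the current frame only."""

    def __init__(self, gaussians):
        self.gaussians = gaussians  # dict: gaussian_id -> (mean, var)
        self._frame_index = None
        self._cache = {}
        self.evaluations = 0        # counts actual Gaussian evaluations

    def log_likelihood(self, frame_index, x, gaussian_id):
        if frame_index != self._frame_index:
            # A new frame invalidates every cached value.
            self._frame_index = frame_index
            self._cache = {}
        if gaussian_id not in self._cache:
            mean, var = self.gaussians[gaussian_id]
            self._cache[gaussian_id] = diag_log_pdf(x, mean, var)
            self.evaluations += 1
        return self._cache[gaussian_id]


def score_state(cache, frame_index, x, gaussian_ids, weights):
    """Mixture log-likelihood for one state, reusing cached Gaussian scores."""
    terms = [
        math.log(w) + cache.log_likelihood(frame_index, x, gid)
        for gid, w in zip(gaussian_ids, weights)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))
```

When two triphone states in the same cluster are scored at the same frame, the second call to `score_state` performs no new Gaussian evaluations; only the weighting and summation are repeated.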
15. The method according to claim 1 wherein redundant simple probability functions in the state cluster overlap region are more effectively used to cover the acoustic space of the clusters, resulting in smaller variances and reducing the number of distance components to be computed.
16. The method according to claim 1 further comprising:
reducing the size of simple probability function shortlists by decreasing the number of state clusters with a corresponding reduction in simple probability function computations.
17. A computer readable medium having stored thereon a plurality of instructions, the plurality of instructions including instructions which, when executed by a processor, cause the processor to perform the steps comprising of:
selecting a model to represent a selected subunit of speech, the model having associated with it a plurality of states;
determining states that may be represented by a set of simple probability functions; and
clustering said states that may be represented by a set of simple probability functions into a limited number of clusters, wherein said simple probability functions for each of said limited number of state clusters is greater in number than said limited number of state clusters.
18. A speech recognizer comprising:
a logic processing device;
storage means;
a set of probabilistic models stored in the storage means;
said models including a limited number of state clusters, wherein at least one of said limited number of state clusters is represented by a number of simple probability functions, wherein said simple probability functions for each of said limited number of state clusters is greater in number than said limited number of state clusters;
a feature extractor in a computer for extracting feature data capable of being processed by said computer from a speech signal; and
recognizing means for matching features from unidentified speech data to the models to produce a most likely path through the models where the path defines the most likely subunits and words in the speech data.
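The claim text does not name an algorithm for finding the most likely path. One common choice for hidden Markov models is Viterbi decoding, sketched below purely as an assumption about how such a recognizing means might work. The function name `viterbi` and the table layout (`log_init[s]`, `log_trans[p][s]`, `log_emit[t][s]`) are hypothetical, and all scores are log-probabilities.

```python
import math

NEG_INF = float("-inf")


def viterbi(log_init, log_trans, log_emit):
    """Return (best_state_path, best_log_score) for one utterance.

    log_init[s]     : log P(start in state s)
    log_trans[p][s] : log P(move from state p to state s)
    log_emit[t][s]  : log P(frame t | state s), e.g. a mixture score
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t - 1][s] = best predecessor of state s at frame t
    for t in range(1, len(log_emit)):
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[t][s])
        back.append(pointers)
    # Trace back from the best final state.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


# Toy two-state left-to-right model over three frames.
log_init = [0.0, NEG_INF]
log_trans = [[math.log(0.5), math.log(0.5)], [NEG_INF, 0.0]]
log_emit = [
    [math.log(0.9), math.log(0.1)],
    [math.log(0.8), math.log(0.2)],
    [math.log(0.1), math.log(0.9)],
]
best_path, best_score = viterbi(log_init, log_trans, log_emit)
```

In a full recognizer the state path would then be mapped back to subunits and words; that mapping is outside the scope of this sketch.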
Specification