Method and apparatus for automatic recognition using features encoded with product-space vector quantization
Abstract
An automatic recognition system and method divides observation vectors into subvectors and determines a quantization index for the subvectors. Subvector indices can then be transmitted or otherwise stored and used to perform recognition. In a further embodiment, recognition probabilities are determined for subvectors separately and these probabilities are combined to generate probabilities for the observed vectors. An automatic system for assigning bits to subvector indices can be used to improve recognition.
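The subvector quantization step described in the abstract can be illustrated with a minimal sketch. It assumes Euclidean nearest-codeword search and consecutive, non-overlapping subvectors; neither assumption comes from the source, and the names `split_subvectors`, `encode`, and the toy codebooks are hypothetical.

```python
import numpy as np

def split_subvectors(x, sizes):
    """Split feature vector x into consecutive subvectors of the given sizes."""
    assert sum(sizes) == len(x)
    bounds = np.cumsum(sizes)[:-1]          # split points between subvectors
    return np.split(np.asarray(x, dtype=float), bounds)

def encode(x, sizes, codebooks):
    """Return one codeword index per subvector (nearest codeword, Euclidean)."""
    indices = []
    for sub, codebook in zip(split_subvectors(x, sizes), codebooks):
        dists = np.sum((codebook - sub) ** 2, axis=1)   # squared distances
        indices.append(int(np.argmin(dists)))
    return indices

# Hypothetical toy setup: a 4-dimensional vector split into two 2-dimensional
# subvectors, coded with 2 bits (4 codewords) and 1 bit (2 codewords).
codebooks = [
    np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]),
    np.array([[-1.0, -1.0], [1.0, 1.0]]),
]
indices = encode([0.9, 1.2, 0.8, 0.7], [2, 2], codebooks)
```

Only the list of indices needs to be transmitted or stored; in this toy case that is 3 bits per vector instead of four floating-point values.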
94 Citations
19 Claims
-
1. A system for assigning codeword bits among a number of feature vectors to be used in automatic recognition comprising:
-
a front end encoder for receiving a physical signal;
a feature extraction engine for converting said signal into a series of digitally encoded numerical feature vectors, said feature vectors selected in order to perform recognition, each of said feature vectors comprising at least two separable numerical parameters;
a subvector quantizer for dividing said feature vectors into a number of subvectors and for performing vector quantization on said subvectors based on a first assignment of bit numbers to each subvector in order to assign a codeword to each subvector to approximate said each subvector;
a recognition engine for performing recognition using said codewords representative of said quantized subvectors to produce a sequence of labels;
memory for storing a plurality of statistical models with trained parameters;
a tester for measuring recognition performance based on comparison of said labels with the corresponding pre-transcribed labels of said physical signal from a development set of the tester; and
feedback means from said tester to said subvector quantizer, for feeding back performance criteria;
wherein said subvector quantizer is further operative in response to said performance criteria to assign additional bits to said subvectors incrementally until the desired level of recognition performance is reached or a threshold of assigned bits is reached.
Dependent claims: 2, 3, 4, 5, 6, 7
-
-
8. A recognition system for automatically recognizing physical signals and deriving known labels comprising:
-
a front end encoder for receiving a physical signal;
a feature extraction engine for converting said signal into a series of digitally encoded numerical feature vectors, said vectors selected in order to perform recognition, each of said vectors comprised of at least two separable numerical parameters;
a subvector quantizer for separating said feature vectors into at least two subvectors and for determining a codeword for each subvector to approximate said subvector;
a channel for transmitting codewords for said subvectors to a recognition engine;
memory for storing a plurality of statistical models with trained parameters; and
a recognition engine capable of using said stored statistical models to recognize known labels from a set of unidentified feature vectors, wherein said recognition engine performs vector quantized subvector recognition using discrete HMMs having the form:
$$P_s(X_t) = \sum_i \lambda_i \prod_l P_{si}(VQ_l = k_l)$$
where $P_s(X_t)$ is the probability for a particular model state $s$ that $X_t$ was produced by that state, $\lambda_i$ is the weight of the i-th mixture component, $k_l$ is the codebook index observed at time $t$ for the l-th subvector, and $P_{si}(VQ_l = k_l)$ is the probability that the l-th subvector index is $k_l$, derived from a table lookup for this model state and mixture component $i$.
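The table-lookup evaluation described in this claim can be sketched as follows. The sketch assumes the state probability is a weighted sum over mixture components of the product of per-subvector lookup probabilities, which is the usual discrete-mixture form and matches the symbol definitions above; the function name, array layout, and toy numbers are hypothetical.

```python
import numpy as np

def state_probability(weights, tables, indices):
    """Evaluate sum_i lambda_i * prod_l P_si(VQ_l = k_l) for one state s.

    weights: shape (I,), mixture weights lambda_i of the state
    tables:  one array per subvector l, shape (I, K_l); row i is a discrete
             distribution over that subvector's K_l codewords
    indices: observed codeword index k_l for each subvector at time t
    """
    per_mixture = np.ones_like(weights, dtype=float)
    for table, k in zip(tables, indices):
        per_mixture *= table[:, k]          # table lookup, no density evaluation
    return float(np.dot(weights, per_mixture))

# Hypothetical state with two mixture components and two subvectors.
weights = np.array([0.5, 0.5])
tables = [
    np.array([[0.5, 0.5], [0.25, 0.75]]),
    np.array([[0.5, 0.25, 0.25], [0.25, 0.25, 0.5]]),
]
p = state_probability(weights, tables, [1, 2])
```

Because every factor is read from a precomputed table, the cost per state is one multiplication per mixture component and subvector.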
-
-
9. A method for assigning codeword bits among a number of feature vectors to be used in automatic recognition comprising:
-
dividing an observation vector into a number of subvectors;
assigning a first set of bit numbers to each subvector;
performing vector quantization on said subvectors;
performing recognition using said quantized subvectors;
measuring recognition performance;
assigning additional bits to subvectors incrementally until the desired recognition performance is reached or a threshold of assigned bits is reached; and
selecting the bit values that achieve the most desired performance.
Dependent claims: 10, 11, 12, 13, 14, 15, 16, 17
where $P_s(X_t)$ is the probability for a particular model state $s$ that $X_t$ was produced by that state, $\lambda_i$ is the weight of the i-th mixture component, $k_l$ is the codebook index observed at time $t$ for the l-th subvector, and $P_{si}(VQ_l = k_l)$ is the probability that the l-th subvector index is $k_l$, derived from a table lookup for this model state and mixture component $i$.
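The incremental bit-assignment loop of claim 9 can be sketched as below. The claim says only that bits are added incrementally; the greedy choice of giving each new bit to the subvector that lowers the error most is an assumption, as are the names `allocate_bits` and `evaluate`. In a real system `evaluate` would retrain the codebooks, run recognition on a development set, and return the error rate; here a toy function stands in for it.

```python
def allocate_bits(initial_bits, evaluate, target_error, max_total_bits):
    """Add one bit at a time until the target error or the bit budget is hit."""
    bits = list(initial_bits)
    error = evaluate(bits)
    while error > target_error and sum(bits) < max_total_bits:
        candidates = []
        for i in range(len(bits)):
            trial = bits.copy()
            trial[i] += 1                   # try one extra bit on subvector i
            candidates.append((evaluate(trial), trial))
        # Assumed greedy rule: keep the single-bit increment with lowest error.
        error, bits = min(candidates, key=lambda c: c[0])
    return bits, error

# Toy stand-in for the recognizer (hypothetical): error falls as bits are
# added, and the first subvector matters more than the second.
def toy_error(bits):
    return 0.5 ** bits[0] + 0.2 * 0.5 ** bits[1]

bits, error = allocate_bits([1, 1], toy_error, target_error=0.2, max_total_bits=8)
```

With these toy numbers the loop ends at the allocation `[4, 1]`, since the first subvector keeps yielding the larger error reduction.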
-
-
18. A method for developing models in a recognition system for responding to data representative of captured physical speech, comprising the steps of:
-
selecting a multi-state model with state probability functions, said state probability functions being of a general form with initially undetermined parameters, said models divided into subvector models for recognizing subparts of observation vectors;
creating individual instances of a model for each subunit of speech to be processed;
using training data from a plurality of speakers to determine acoustic features of states of said models and to estimate probability density functions for said models;
clustering states based on their acoustic similarity;
creating a plurality of cluster codebooks, said cluster codebooks consisting of probability density functions that are shared by each cluster's states; and
reestimating the probability densities of each cluster codebook and the parameters of the probability equations in each cluster.
-
-
19. A method for developing models in a recognition system for responding to data representative of captured physical speech, comprising the steps of:
-
selecting a multi-state model with state probability functions, said state probability functions being of a general form with initially undetermined parameters, said models divided into subvector models for recognizing subparts of observation vectors, wherein said observation computation is based on performing an iteration of a forward-backward algorithm on the training speech data and is of the following form for every state s and mixture component i and at every time t;
where the quantities $\alpha_t(s)$ and $\beta_t(s)$ are the alpha and beta probabilities that are computed with the forward-backward algorithm, and the probabilities of the subvectors are computed using the previous estimates of the model parameters;
thereafter computing new estimates for the subvector probabilities using the following formula;
thereafter updating similarly the probabilities of all the subvectors for all states s and mixtures i;
thereafter replacing previous values of said subvector probabilities with new estimates until a predefined convergence criterion is not met;
thereafter creating individual instances of a model for each subunit of speech to be processed;
using training data from a plurality of speakers to determine acoustic features of states of said models and to estimate probability density functions for said models;
clustering states based on their acoustic similarity;
creating a plurality of cluster codebooks, said cluster codebooks consisting of probability density functions that are shared by each cluster's states; and
reestimating the probability densities of each cluster codebook and the parameters of the probability equations in each cluster.
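The two formulas that claim 19 refers to are not reproduced in this text. For orientation only, the textbook Baum-Welch update for a discrete-mixture HMM over subvector indices has the following shape. This is a sketch under stated assumptions, not the claim's wording: the symbols $\gamma_t(s,i)$ and $k_l^{(t)}$ are introduced here, $\alpha_t(s)$ is assumed to include the emission at time $t$, and $P_s(X_t) = \sum_j \lambda_j \prod_l P_{sj}(VQ_l = k_l^{(t)})$ is assumed for the state probability.

```latex
% Occupancy of state s and mixture component i at time t,
% computed with the previous parameter estimates (assumed standard form):
\gamma_t(s,i) =
  \frac{\alpha_t(s)\,\beta_t(s)}{\sum_{s'} \alpha_t(s')\,\beta_t(s')}
  \cdot
  \frac{\lambda_i \prod_l P_{si}\bigl(VQ_l = k_l^{(t)}\bigr)}{P_s(X_t)}

% New estimate of the probability that subvector l takes index k:
\hat{P}_{si}(VQ_l = k) =
  \frac{\sum_{t \,:\, k_l^{(t)} = k} \gamma_t(s,i)}{\sum_t \gamma_t(s,i)}
```

The first factor of the occupancy is the posterior probability of being in state $s$ at time $t$; the second splits that posterior among the mixture components. The update then counts how often index $k$ was observed for subvector $l$, weighted by that occupancy.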
-
Specification