Voice analyzing system using hidden Markov model and having plural neural network predictors
First Claim
1. An analyzing system for analyzing object signals, comprising voice signals, by estimating a generation likelihood of an observation vector sequence being a time series of feature vectors X (= x1, ..., xT; T is a total number of frames) with use of a Markov model having a plurality of states i (i = 1, ..., N; N is a total number of states) and given transition probabilities from state i to state j (i, j = 1, ..., N), comprising:
feature extraction means for converting the object signals into the time series of feature vectors X;
a state designation means for determining a state i at a time t stochastically using said Markov model;
a plurality of predictors each of which is composed of a neural network and is provided per each state of said Markov model for generating a predictional vector gi (t) of said feature vector xt in said state i at the time t based on values of the feature vectors other than said feature vector xt ;
a first calculation means for calculating an error vector of said predictional vector gi (t) to said feature vector xt ; and
a second calculation means for calculating a generation likelihood of said error vector using a predetermined probability distribution of the error vector according to which said error vector is generated.
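The two calculation means can be written out as a short worked formula. This is only an illustrative sketch: the claim says "a predetermined probability distribution" without naming it, so the zero-mean Gaussian with covariance \Sigma_i and dimension d used here is an assumption, as are the symbols e_i(t) and b_i.

```latex
% Error vector of the state-i prediction at frame t (first calculation means)
e_i(t) = x_t - g_i(t)

% Generation likelihood of that error vector (second calculation means),
% ASSUMING a zero-mean Gaussian with covariance \Sigma_i in d dimensions
b_i(x_t) = \frac{1}{(2\pi)^{d/2}\,\lvert\Sigma_i\rvert^{1/2}}
           \exp\!\left(-\tfrac{1}{2}\, e_i(t)^{\top} \Sigma_i^{-1} e_i(t)\right)
```

Under this assumption, a frame that the state-i predictor anticipates well gives a small e_i(t) and therefore a high likelihood for that state.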
Abstract
An analyzing system analyzes object signals, particularly voice signals, by estimating a generation likelihood of an observation vector sequence being a time series of feature vectors with use of a Markov model having a plurality of states and given transition probabilities from state to state. A state designation section determines a state i at a time t stochastically using the Markov model. A plurality of predictors, each composed of a neural network and provided for a respective state of the Markov model, generate a predictional vector of the feature vector xt in the state i at the time t based on values of the feature vectors other than the feature vector xt. A first calculation section calculates an error vector of the predictional vector to the feature vector xt. A second calculation section calculates a generation likelihood of the error vector using a predetermined probability distribution according to which the error vector is generated.
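The flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the one-hidden-layer network, the use of only the previous frame as predictor input, the zero vector used as context for the first frame, the diagonal Gaussian error distribution, the initial-state log probabilities, and the forward recursion used to sum over state sequences are all choices made here for the example. All function and parameter names are hypothetical.

```python
import math

def mlp_predict(weights, context):
    """Hypothetical one-hidden-layer predictor: tanh hidden units, linear output."""
    w_hidden, b_hidden, w_out, b_out = weights
    hidden = [math.tanh(sum(w * c for w, c in zip(row, context)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_out, b_out)]

def error_log_likelihood(x_t, g_t, variances):
    """Log density of the error x_t - g_t under an ASSUMED zero-mean diagonal Gaussian."""
    total = 0.0
    for x, g, var in zip(x_t, g_t, variances):
        e = x - g  # one component of the error vector
        total += -0.5 * (math.log(2.0 * math.pi * var) + e * e / var)
    return total

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def sequence_log_likelihood(frames, predictors, variances, log_init, log_trans):
    """Forward recursion in the log domain.

    Each state's emission score is the likelihood of its prediction error.
    ASSUMPTION: each predictor sees only the previous frame; the first
    frame is predicted from a zero vector.
    """
    n_states = len(predictors)
    dim = len(frames[0])
    alpha = None
    for t, x_t in enumerate(frames):
        context = frames[t - 1] if t > 0 else [0.0] * dim
        emit = [error_log_likelihood(x_t, mlp_predict(predictors[i], context),
                                     variances[i])
                for i in range(n_states)]
        if alpha is None:
            alpha = [log_init[i] + emit[i] for i in range(n_states)]
        else:
            alpha = [logsumexp([alpha[i] + log_trans[i][j]
                                for i in range(n_states)]) + emit[j]
                     for j in range(n_states)]
    return logsumexp(alpha)
```

Working in the log domain avoids underflow on long utterances. A forbidden transition can be represented by `-math.inf` in `log_trans`.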
5 Claims
1. An analyzing system for analyzing object signals, comprising voice signals, by estimating a generation likelihood of an observation vector sequence being a time series of feature vectors X (= x1, ..., xT; T is a total number of frames) with use of a Markov model having a plurality of states i (i = 1, ..., N; N is a total number of states) and given transition probabilities from state i to state j (i, j = 1, ..., N), comprising:
feature extraction means for converting the object signals into the time series of feature vectors X;
a state designation means for determining a state i at a time t stochastically using said Markov model;
a plurality of predictors each of which is composed of a neural network and is provided per each state of said Markov model for generating a predictional vector gi (t) of said feature vector xt in said state i at the time t based on values of the feature vectors other than said feature vector xt ;
a first calculation means for calculating an error vector of said predictional vector gi (t) to said feature vector xt ; and
a second calculation means for calculating a generation likelihood of said error vector using a predetermined probability distribution of the error vector according to which said error vector is generated.
Dependent claims: 2, 3, 4
5. A recognition system for recognizing object signals comprising voice signals, comprising:
a plurality of analyzing apparatuses each for estimating a generation likelihood of an observation vector sequence being a time series of feature vectors X (= x1, ..., xT; T is a total number of frames) with use of a Markov model having a plurality of states i (i = 1, ..., N; N is a total number of states) and given transition probabilities from state i to state j (i, j = 1, ..., N);
feature extraction means for converting the object signals into the time series of feature vectors X;
each of said analyzing apparatuses comprising a state designation means for determining a state i at a time t stochastically using said Markov model, a plurality of predictors each of which is composed of a neural network and is provided per each state of said Markov model for generating a predictional vector gi (t) of said feature vector xt in said state i at the time t based on values of the feature vectors other than said feature vector xt, a first calculation means for calculating an error vector of said predictional vector gi (t) to said feature vector xt, and a second calculation means for calculating a generation likelihood of said error vector using a predetermined probability distribution of the error vector according to which said error vector is generated, and being adapted for a category for an observation vector sequence to be categorized;
a maximum likelihood detection means which compares likelihoods obtained by said plurality of analyzing apparatuses and detects a maximum value among said likelihoods; and
a decision means for identifying said observation vector sequence to the category corresponding to one of said plurality of analyzing apparatuses which gives the maximum likelihood detected by said maximum likelihood detection means.
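The maximum likelihood detection and decision steps of claim 5 amount to scoring the same observation sequence with one analyzer per category and picking the best score. The sketch below illustrates only that selection logic. The `make_toy_analyzer` helper is a hypothetical stand-in for a full analyzing apparatus: it scores frames with a fixed Gaussian rather than a Markov model with neural predictors. The names `recognize` and `analyzers` are likewise assumptions made for this example.

```python
import math

def make_toy_analyzer(mean, variance=1.0):
    """Hypothetical stand-in for one analyzing apparatus.

    Returns a function that maps a frame sequence to a log-likelihood,
    here using a fixed Gaussian around `mean` for every component.
    """
    def log_likelihood(frames):
        total = 0.0
        for frame in frames:
            for x in frame:
                e = x - mean
                total += -0.5 * (math.log(2.0 * math.pi * variance)
                                 + e * e / variance)
        return total
    return log_likelihood

def recognize(frames, analyzers):
    """Score `frames` with every category's analyzer and return the best label.

    `analyzers` maps a category label to a callable returning a log-likelihood.
    """
    scores = {label: analyzer(frames) for label, analyzer in analyzers.items()}
    best_label = max(scores, key=scores.get)  # maximum likelihood detection
    return best_label, scores                 # decision plus all scores

analyzers = {"word_a": make_toy_analyzer(0.0), "word_b": make_toy_analyzer(5.0)}
label, scores = recognize([[4.8], [5.1]], analyzers)
```

Comparing log-likelihoods rather than raw likelihoods gives the same decision, because the logarithm is monotonic, and it stays numerically stable for long sequences.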
Specification