Voice recognition device and method using a (GGM) Guaranteed Global minimum Mapping
Abstract
A voice recognition device according to the present invention includes a voice analyzer for acoustically analyzing voice every predetermined frame unit to extract a feature vector X, a converter for subjecting the feature vector X output from the analyzer to a predetermined conversion process, and a voice recognizer for recognizing the voice on the basis of a new feature vector output from the converter. The converter conducts the predetermined conversion processing according to a mapping F from an N-dimensional vector space ΩN to an M-dimensional vector space ΩM, the feature vector X is a vector on the N-dimensional vector space ΩN, and the function fm(X) of an m-th component of the mapping F is represented by the following linear summation of the products of Lm functions gmk(X) and Lm coefficients cmk:
$$f_m(X) = \sum_{k=0}^{L_m-1} c_{mk}\, g_{mk}(X)$$
Each function gmk(X) may be set to a monomial.
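The mapping described in the abstract can be illustrated with a minimal sketch, assuming monomial component functions as the abstract permits. The helper names `monomial` and `apply_mapping`, the exponent tuples, and all coefficient values are hypothetical and chosen only for illustration; they do not come from the patent.

```python
from math import prod

def monomial(exponents):
    """Return g(X) = product over n of X[n] ** exponents[n] (hypothetical helper)."""
    def g(x):
        return prod(xn ** e for xn, e in zip(x, exponents))
    return g

def apply_mapping(x, basis, coeffs):
    """Evaluate F(X): component m is the sum over k of coeffs[m][k] * basis[m][k](X)."""
    return [
        sum(c * g(x) for c, g in zip(c_m, g_m))
        for c_m, g_m in zip(coeffs, basis)
    ]

# Illustrative sizes only: N = 2 inputs, M = 2 outputs, L_m = 3 monomials per component.
shared = [monomial((0, 0)), monomial((1, 0)), monomial((1, 1))]  # 1, x0, x0*x1
basis = [shared, shared]
coeffs = [[0.5, 2.0, -1.0], [1.0, 0.0, 3.0]]  # made-up coefficients c_mk

y = apply_mapping([2.0, 3.0], basis, coeffs)
print(y)  # [-1.5, 19.0]
```

Because each component is a weighted sum of fixed functions, F is nonlinear in X but linear in the coefficients cmk.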
28 Citations
26 Claims
1. A voice recognition device comprising:
analyzing means for acoustically analyzing voice every predetermined frame unit to extract a feature vector X;
converting means for subjecting the feature vector X output from said analyzing means to a predetermined conversion process; and
recognition means for recognizing the voice on the basis of a new feature vector output from said conversion means,
wherein said conversion means conducts the predetermined conversion processing according to a mapping F from an N-dimensional vector space ΩN to an M-dimensional vector space ΩM, the feature vector X is a vector on the N-dimensional vector space ΩN, and the function fm(X) of an m-th component of the mapping F is represented by the following linear summation of the products of complete component functions gmk(X) of Lm determined on the basis of the distribution of the learning sample Sq (=(S0q, S1q, S2q, ..., SN-1q)) on the N-dimensional measurable vector space which is classified into categories Cq (q=0, 1, 2, ..., Q-1) of Q, and coefficients cmk of Lm:
$$f_m(X) = \sum_{k=0}^{L_m-1} c_{mk}\, g_{mk}(X)$$
wherein when teacher vectors Tq (=(t0q, t1q, t2q, ..., tM-1q)) on an M-dimensional measurable vector space ΩM for the categories Cq of Q are provided and a predetermined estimation function J is calculated, the coefficient cmk is determined so as to minimize the estimation function J.
Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
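Claim 1 leaves the estimation function J unspecified. As a hedged illustration, the sketch below assumes J is the summed squared error between fm(S) and the teacher component tm over all learning samples. Under that assumption J is quadratic in the coefficients cmk, so solving the normal equations yields a global minimum. The function names, the basis functions, and the sample data are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: basis is degenerate on these samples")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_component(samples, targets, basis_m):
    """Coefficients c_mk minimizing the ASSUMED J = sum over samples of (f_m(S) - t_m)^2."""
    phi = [[g(s) for g in basis_m] for s in samples]  # basis values per sample
    size = len(basis_m)
    gram = [[sum(row[i] * row[j] for row in phi) for j in range(size)]
            for i in range(size)]
    rhs = [sum(row[i] * t for row, t in zip(phi, targets)) for i in range(size)]
    return solve_linear(gram, rhs)  # normal equations: gram c = rhs

# Made-up data: 1-D samples from two categories, teacher component 1 for C0 and 0 for C1.
samples = [[0.0], [1.0], [3.0], [4.0]]
targets = [1.0, 1.0, 0.0, 0.0]
basis_m = [lambda x: 1.0, lambda x: x[0]]  # hypothetical basis: 1 and x0

c_m = fit_component(samples, targets, basis_m)
print(c_m)  # approximately [1.1, -0.3]
```

A different choice of J would change the fitting step; the closed-form solve applies only because the assumed J is quadratic in the coefficients.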
14. A voice recognition method comprising:
a voice analyzing step for acoustically analyzing voice every predetermined frame unit to extract a feature vector X;
a vector conversion step for subjecting the feature vector X extracted in said analyzing step to a predetermined conversion process; and
a voice recognition step for recognizing the voice on the basis of the new feature vector output in said vector conversion step,
wherein the predetermined conversion processing is conducted according to a mapping F from an N-dimensional vector space ΩN to an M-dimensional vector space ΩM in said vector conversion step, the feature vector X is a vector on the N-dimensional vector space ΩN, and the function fm(X) of an m-th component of the mapping F is represented by the following linear summation of the products of complete component functions gmk(X) of Lm determined on the basis of the distribution of the learning sample Sq (=(S0q, S1q, S2q, ..., SN-1q)) on the N-dimensional measurable vector space which is classified into categories Cq (q=0, 1, 2, ..., Q-1) of Q, and coefficients cmk of Lm:
$$f_m(X) = \sum_{k=0}^{L_m-1} c_{mk}\, g_{mk}(X)$$
wherein when teacher vectors Tq (=(t0q, t1q, t2q, ..., tM-1q)) on an M-dimensional measurable vector space ΩM for the categories Cq of Q are provided and a predetermined estimation function J is calculated, the coefficient cmk is determined so as to minimize the estimation function J.
Dependent Claims (15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26)
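The three steps of the method claim (analyze each frame, convert the feature vector, recognize) can be sketched as a simple pipeline. Everything specific here is an assumption made for illustration: the stand-in features (mean absolute amplitude and zero-crossing rate), the nearest-teacher-vector decision with a majority vote over frames, the `convert` function, and the labels are hypothetical and are not taken from the claims.

```python
def split_frames(signal, frame_len):
    """Cut the signal into consecutive non-overlapping frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def analyze(frame):
    """Stand-in analysis step (assumes len(frame) >= 2): X = (mean |amplitude|, zero-crossing rate)."""
    energy = sum(abs(v) for v in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [energy, crossings / (len(frame) - 1)]

def recognize(signal, frame_len, convert, teachers):
    """Analyze each frame, convert X, vote for the nearest teacher vector, return the majority label."""
    votes = {}
    for frame in split_frames(signal, frame_len):
        y = convert(analyze(frame))  # conversion step: new feature vector
        label = min(
            teachers,
            key=lambda q: sum((a - b) ** 2 for a, b in zip(y, teachers[q])),
        )
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Hypothetical teacher vectors and a hand-written conversion standing in for the mapping F.
teachers = {"noisy": [1.0, 0.0], "smooth": [0.0, 1.0]}
convert = lambda x: [x[1], 1.0 - x[1]]

result_noisy = recognize([1.0, -1.0] * 8, 4, convert, teachers)
result_smooth = recognize([1.0] * 16, 4, convert, teachers)
print(result_noisy, result_smooth)  # noisy smooth
```

In a fuller sketch, `convert` would be the fitted mapping F, and the recognition step could be any classifier operating on the converted vectors.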
Specification