Speech signal processing system
First Claim
1. A speech signal processing system comprising:
an input terminal for receiving successive sample values of a speech waveform S(n) at successive time points n, where n=0, 1, 2, . . . ;
inverse-filter means connected to said input terminal for obtaining successive sample values of a prediction residual waveform e(n) by removing a short-time correlation from the speech waveform S(n);
phase-equalizing filter means connected to said input terminal for receiving the speech waveform S(n) therefrom and producing successive samples of a phase-equalized speech waveform Sp(n) in the time domain by zero-phasing a prediction residual waveform component in the speech waveform in accordance with successive sets of M+1 phase-equalizing filter coefficients h(m,n) supplied thereto as filter coefficients thereof, where m=0, 1, 2, . . . , M, and M is a positive integer; and
filter coefficient determining means connected to the output of said inverse-filter means for determining said phase-equalizing filter coefficients h(m,n) on the basis of said prediction residual waveform e(n), said filter coefficient determining means including voiced/unvoiced sound discriminator means connected to the output of said inverse-filter means for discriminating whether said speech waveform is a voiced sound or an unvoiced sound based on whether a computed value of an auto-correlation function on said prediction residual waveform during an analysis window of a length N at said filter coefficient determining means is above or below a threshold value, pitch position detecting means connected to the outputs of said inverse-filter means and said voiced/unvoiced sound discriminator means for detecting, when said speech waveform is discriminated as a voiced sound, pitch positions nl from said prediction residual waveform e(n), and filter coefficient computing means connected to the outputs of said inverse-filter means, said voiced/unvoiced sound discriminator means and said pitch position detecting means, respectively, for computing, when said speech waveform is discriminated as a voiced sound, a set of the M+1 phase-equalizing filter coefficients h(m,n) for a time point n of each pitch position n=nl by solving the following simultaneous equations given for K=0, 1, . . . M, ##EQU26## where L is the number of the pitch positions nl in the analysis window and V(m) is an auto-correlation function of said prediction residual waveform e(n) given by;
##EQU27## and for setting, when said speech waveform is discriminated as an unvoiced sound, a particular one order of coefficient of said phase-equalizing filter coefficients to a certain value and the other orders thereof to zero;
the output of said filter coefficient determining means being connected to said phase-equalizing filter means so that successive sets of said phase-equalizing filter coefficients h(m,nl) determined by said filter coefficient determining means are supplied to said phase-equalizing filter means as the filter coefficients thereof, whereby said phase-equalizing filter means outputs the phase-equalized speech waveform Sp(n) as the output of said system representing the input speech waveform.
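The voiced/unvoiced decision and the unvoiced-case coefficient setting recited in claim 1 can be illustrated with a minimal Python sketch. The claim does not say which lag is examined, how the auto-correlation is normalized, or which threshold, order and value are used; the lag range, the lag-0 normalization, `threshold`, `order = M // 2` and `value = 1.0` below are assumptions, and all function names are hypothetical.

```python
def normalized_autocorr(e, lag):
    """Auto-correlation of e at `lag`, divided by the lag-0 value (assumed normalization)."""
    r0 = sum(x * x for x in e)
    if r0 == 0.0:
        return 0.0
    return sum(e[i] * e[i + lag] for i in range(len(e) - lag)) / r0


def is_voiced(e, min_lag, max_lag, threshold):
    """Return True if the auto-correlation peak in [min_lag, max_lag] exceeds `threshold`.

    e is the prediction residual over one analysis window of length N = len(e).
    The lag range and the threshold are assumptions of this sketch.
    """
    peak = max(normalized_autocorr(e, lag) for lag in range(min_lag, max_lag + 1))
    return peak > threshold


def unvoiced_coefficients(M, order=None, value=1.0):
    """Set one coefficient to `value` and the other orders to zero.

    With order = M // 2 and value = 1.0 (assumed), the filter is a pure delay.
    """
    if order is None:
        order = M // 2
    h = [0.0] * (M + 1)
    h[order] = value
    return h


# Example: an impulse train with a 40-sample period behaves like a voiced residual.
N = 200
pulse_train = [1.0 if n % 40 == 0 else 0.0 for n in range(N)]
print(is_voiced(pulse_train, min_lag=20, max_lag=100, threshold=0.5))
print(unvoiced_coefficients(4))
```

In this sketch a window classified as unvoiced simply passes through the phase-equalizing filter with a fixed delay, which is one plausible reading of "a particular one order of coefficient ... to a certain value and the other orders thereof to zero".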
Abstract
A speech signal processing system in which an inverse filter removes the correlation from the sample values of a speech waveform to obtain sample values of a prediction residual waveform. Phase-equalizing filter coefficients are determined so as to have a phase characteristic inverse to that of the prediction residual waveform at each pitch position of the speech waveform, and are set as the filter coefficients of a phase-equalizing filter. The speech waveform or the prediction residual waveform is then passed through the phase-equalizing filter, thereby zero-phasing the prediction residual waveform (or the prediction residual waveform component in the speech waveform) and concentrating energy around the pitch position.
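The energy-concentration effect described in the abstract can be illustrated with a small Python sketch. It does not solve the simultaneous equations of the claims (they are not reproduced on this page); as a stand-in it uses a matched filter, i.e. a time-reversed copy of the residual around one pitch position, whose phase characteristic is the inverse of that segment's apart from a delay of M/2 samples. The function names, the test pulse and the unit-energy scaling are assumptions.

```python
import math


def fir_filter(x, h):
    """y(n) = sum over m of h(m) * x(n - m); x is treated as zero outside its range."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for m, hm in enumerate(h):
            if 0 <= n - m < len(x):
                acc += hm * x[n - m]
        y.append(acc)
    return y


def matched_coefficients(e, pitch_pos, M):
    """h(m) = e(pitch_pos + M/2 - m) for m = 0..M, scaled to unit energy (M assumed even)."""
    half = M // 2
    seg = []
    for m in range(M + 1):
        idx = pitch_pos + half - m
        seg.append(e[idx] if 0 <= idx < len(e) else 0.0)
    norm = math.sqrt(sum(v * v for v in seg)) or 1.0
    return [v / norm for v in seg]


def peak_to_energy(x):
    """Largest squared sample divided by total energy; 1.0 for a single impulse."""
    total = sum(v * v for v in x)
    return max(v * v for v in x) / total if total else 0.0


# A dispersed 5-sample pulse centred on sample 20 (pattern chosen only for illustration).
e = [0.0] * 41
e[18:23] = [1.0, 1.0, 1.0, -1.0, 1.0]

h = matched_coefficients(e, pitch_pos=20, M=4)
ep = fir_filter(e, h)

print(peak_to_energy(e), peak_to_energy(ep))
print(ep.index(max(ep)))  # position of the output peak
```

For this pulse the peak-to-energy ratio rises from 0.2 to roughly 0.86, and the output peak sits at sample 22, that is, the pitch position plus the M/2-sample filter delay.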
53 Citations
22 Claims
1. A speech signal processing system as set out in full under First Claim above. Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
11. A speech signal processing system comprising:
an input terminal for receiving successive sample values of a speech waveform S(n) at successive time points n, where n=0, 1, 2, . . . ;
inverse-filter means connected to said input terminal for obtaining successive sample values of a prediction residual waveform e(n) by removing a short-time correlation from the speech waveform S(n);
phase-equalizing filter means connected to the output of said inverse-filter means for obtaining a phase-equalized residual waveform ep(n) in the time domain by zero-phasing the prediction residual waveform e(n) from said inverse-filter means in accordance with successive sets of M+1 phase-equalizing filter coefficients h(m,n) supplied thereto as filter coefficients thereof, where m=0, 1, 2, . . . , M and M is a positive integer; and
filter coefficient determining means connected to the output of said inverse-filter means for determining said phase-equalizing filter coefficients h(m,n) on the basis of said prediction residual waveform e(n), said filter coefficient determining means including voiced/unvoiced sound discriminator means connected to the output of said inverse-filter means for discriminating whether said speech waveform is a voiced sound or unvoiced sound based on whether a computed value of an auto-correlation function on said prediction residual waveform during an analysis window of a length N at said filter coefficient determining means is above or below a threshold value, pitch position detecting means connected to the outputs of said inverse-filter means and said voiced/unvoiced sound discriminator means for detecting, when said speech waveform is discriminated as a voiced sound, pitch positions nl from said prediction residual waveform e(n), and filter coefficient computing means connected to the outputs of said inverse-filter means, said voiced/unvoiced sound discriminator means and said pitch position detecting means, respectively, for computing, when said speech waveform is discriminated as a voiced sound, a set of the M+1 phase-equalizing filter coefficients h(m,n) for a time point n of each pitch position n=nl by solving the following simultaneous equations given for k=0, 1, . . . M, ##EQU29## where L is the number of the pitch positions nl in the analysis window and V(m) is an auto-correlation function of said prediction residual waveform e(n) given by;
##EQU30## and for setting, when said speech waveform is discriminated as an unvoiced sound, a particular one order of coefficient of said phase-equalizing filter coefficients to a certain value and the other orders thereof to zero;
the output of said filter coefficient determining means being connected to said phase-equalizing means so that successive set of said phase-equalizing filter coefficients h(m,nl) determined by said filter coefficient determining means are supplied to said phase-equalizing filter means as filter coefficients thereof, whereby said phase-equalizing filter means outputs the phase-equalized prediction residual waveform ep(n) as the output of said system representing the input speech waveform.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22.
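The signal path of claim 11 (inverse filter first, then a phase-equalizing filter applied to the residual) can be sketched in Python as follows. Several points are assumptions rather than statements of the claimed method: the inverse filter is taken to be a linear-prediction (LPC) filter obtained by the autocorrelation method and the Levinson-Durbin recursion, the phase-equalizing filter is taken to be an FIR filter of order M, each coefficient set is held from its pitch position until the next one, and a pure delay of M/2 samples is used before any set is supplied. All function names are hypothetical.

```python
def autocorrelation(x, max_lag):
    """r(k) = sum over i of x(i) * x(i + k), for k = 0..max_lag."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns a with a[0] = 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err == 0.0:
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def inverse_filter(s, a):
    """e(n) = sum over k of a(k) * s(n - k); samples before n = 0 are treated as zero."""
    return [sum(a[k] * s[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(s))]


def phase_equalize(e, coeff_sets, M):
    """ep(n) = sum over m of h(m, n) * e(n - m).

    coeff_sets maps a pitch position to its list of M+1 coefficients. Holding a set
    until the next pitch position, and starting from a pure M/2 delay, are assumptions.
    """
    h = [0.0] * (M + 1)
    h[M // 2] = 1.0
    ep = []
    for n in range(len(e)):
        if n in coeff_sets:
            h = coeff_sets[n]
        ep.append(sum(h[m] * e[n - m] for m in range(M + 1) if n - m >= 0))
    return ep


# Example: a decaying exponential is the impulse response of a one-pole filter,
# so a first-order inverse filter should reduce it to (nearly) a single impulse.
s = [0.9 ** n for n in range(200)]
a = levinson_durbin(autocorrelation(s, 1), order=1)
e = inverse_filter(s, a)
ep = phase_equalize(e, coeff_sets={}, M=4)
print(a[1], e[0], ep[2])
```

In the arrangement of claim 1 the same kind of filter would instead be applied to the speech waveform S(n) itself, with the residual used only to determine the coefficients.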
Specification