Filter for speech modification or enhancement, and various apparatus, systems and method using same
Abstract
A speech modification or enhancement filter, and an apparatus, system and method using the same. Synthesized speech signals are filtered to generate modified synthesized speech signals. Filter coefficients are determined from spectral information represented as a multi-dimensional vector, so that the formant characteristics of the modified synthesized speech signals are enhanced in comparison with those of the synthesized speech signals and in accordance with the spectral information. The spectral information can be any one of LSP information, PARCOR information and LAR information. This increases the design freedom of the speech modification filter used for the aural suppression of quantizing noise contained in the synthesized speech signals, which improves the intelligibility of those signals. A good formant enhancement effect can be obtained without any perceptible level of distortion, within a range of permissible spectral gradients.
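The abstract describes filtering synthesized speech with coefficients derived from spectral information so that formants are enhanced. The patent's own coefficient derivation is not given in this section, so the Python sketch below substitutes a widely used pole-zero formant postfilter of the form H(z) = A(z/beta) / A(z/alpha), driven here by PARCOR coefficients that are converted to direct-form predictor coefficients with the step-up recursion. The function names, the sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p, and the weights `alpha` and `beta` are illustrative assumptions, not values taken from the patent.

```python
def parcor_to_lpc(parcor):
    """Step-up recursion: PARCOR (reflection) coefficients -> [1, a1, ..., ap].

    Assumed convention: A(z) = 1 + a1*z^-1 + ... + ap*z^-p and k_m = a_m(m).
    """
    a = [1.0]
    for k in parcor:
        ext = a + [0.0]                      # grow the polynomial by one order
        last = len(ext) - 1
        a = [ext[i] + k * ext[last - i] for i in range(len(ext))]
    return a


def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): scale the i-th coefficient by gamma**i."""
    return [c * gamma ** i for i, c in enumerate(a)]


def pole_zero_filter(b, a, x):
    """Direct-form filter y[n] = sum_i b[i]*x[n-i] - sum_{i>=1} a[i]*y[n-i]."""
    y = []
    for n in range(len(x)):
        acc = sum(b[i] * x[n - i] for i in range(len(b)) if n - i >= 0)
        acc -= sum(a[i] * y[n - i] for i in range(1, len(a)) if n - i >= 0)
        y.append(acc)
    return y


def formant_postfilter(parcor, x, alpha=0.8, beta=0.5):
    """Apply H(z) = A(z/beta) / A(z/alpha) to the samples x.

    alpha and beta are hypothetical defaults; with 0 <= beta < alpha < 1 the
    poles sit closer to the unit circle than the zeros, which sharpens the
    spectral peaks (formants) of the synthesis filter 1/A(z).
    """
    a = parcor_to_lpc(parcor)
    return pole_zero_filter(bandwidth_expand(a, beta),
                            bandwidth_expand(a, alpha), x)
```

With `alpha` equal to `beta` the numerator and denominator cancel and the signal passes through unchanged, which is a convenient sanity check for the sketch.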
33 Citations
29 Claims
1. A filter comprising:
filtering means for filtering synthesized speech signals through a transfer function defined by filter coefficients to generate modified synthesized speech signals; and filter coefficient generation means for generating said filter coefficients on the basis of spectral information represented in the form of a multi-dimensional vector and belonging to a predetermined domain and pertaining to input speech signals, in such a manner that formant characteristics of said modified synthesized speech signals are enhanced in accordance with said spectral information and in comparison with formant characteristics of said synthesized speech signals; said spectral information being any one of line spectrum pairs (LSP) information, partial autocorrelation coefficients (PARCOR) information and log area ratio (LAR) information.
Dependent claims: 2-22
23. A speech synthesizing apparatus comprising:
means for generating synthesized speech signals on the basis of spectral information represented in the form of a multidimensional vector and belonging to a predetermined domain and pertaining to input speech signals; means for filtering synthesized speech signals through a transfer function defined by filter coefficients to generate modified synthesized speech signals; and means for generating said filter coefficients on the basis of said spectral information in such a manner that formant characteristics of said modified synthesized speech signals are enhanced in accordance with said spectral information and in comparison with formant characteristics of said synthesized speech signals; said spectral information being any one of line spectrum pairs (LSP) information, partial autocorrelation coefficients (PARCOR) information and log area ratio (LAR) information.
24. A speech synthesizing apparatus comprising:
means for generating a synthesized speech signal on the basis of first spectral information represented in the form of a multi-dimensional vector and belonging to a predetermined domain and pertaining to input speech signals; means for transforming said first spectral information into second spectral information belonging to a different domain from said predetermined domain; means for filtering synthesized speech signals through a transfer function defined by filter coefficients to generate modified synthesized speech signals; and means for generating said filter coefficients on the basis of said second spectral information so as to ensure that formant characteristics of said modified synthesized speech signals are enhanced in accordance with said second spectral information and in comparison with formant characteristics of said synthesized speech signals; said spectral information being any one of line spectrum pairs (LSP) information, partial autocorrelation coefficients (PARCOR) information and log area ratio (LAR) information.
25. A speech synthesizing apparatus comprising:
means for generating synthesized speech signals on the basis of first spectral information represented in the form of a multi-dimensional vector and belonging to a predetermined domain and pertaining to input speech signals; means for analyzing said synthesized speech signals to generate second spectral information; means for filtering synthesized speech signals through a transfer function defined by filter coefficients to generate modified synthesized speech signals; and means for generating said filter coefficients on the basis of said second spectral information so as to ensure that formant characteristics of said modified synthesized speech signals are enhanced in accordance with said second spectral information and in comparison with formant characteristics of said synthesized speech signals; said spectral information being any one of line spectrum pairs (LSP) information, partial autocorrelation coefficients (PARCOR) information and log area ratio (LAR) information.
26. A speech storage/transmission system comprising:
means for analyzing input speech signals to generate spectral information represented in the form of a multi-dimensional vector and belonging to a predetermined domain and pertaining to said input speech signals; means for storing or transmitting said spectral information; means for generating synthesized speech signals on the basis of said spectral information which has been stored or transmitted; means for filtering said synthesized speech signals through a transfer function defined by filter coefficients to generate modified synthesized speech signals; and means for generating said filter coefficients on the basis of said spectral information so as to ensure that formant characteristics of said modified synthesized speech signals are enhanced in accordance with said spectral information and in comparison with formant characteristics of said synthesized speech signals; said spectral information being any one of line spectrum pairs (LSP) information, partial autocorrelation coefficients (PARCOR) information and log area ratio (LAR) information.
27. A speech storage/transmission system comprising:
means for analyzing input speech signals to generate first spectral information represented in the form of a multi-dimensional vector and belonging to a predetermined domain and pertaining to said input speech signals; means for storing or transmitting said first spectral information; means for generating a synthesized speech signal on the basis of said first spectral information which has been stored or transmitted; means for transforming said first spectral information into second spectral information belonging to a different domain from said predetermined domain; means for filtering said synthesized speech signals through a transfer function defined by filter coefficients to generate modified synthesized speech signals; and means for generating said filter coefficients on the basis of said second spectral information so as to ensure that formant characteristics of said modified synthesized speech signals are enhanced in accordance with said second spectral information and in comparison with formant characteristics of said synthesized speech signals; said spectral information being any one of line spectrum pairs (LSP) information, partial autocorrelation coefficients (PARCOR) information and log area ratio (LAR) information.
28. A speech storage/transmission system comprising:
means for analyzing input speech signals to generate first spectral information represented in the form of a multi-dimensional vector and belonging to a predetermined domain and pertaining to said input speech signals; means for storing or transmitting said first spectral information; means for generating synthesized speech signals on the basis of said first spectral information which has been stored or transmitted; means for analyzing said synthesized speech signals to generate second spectral information; means for filtering said synthesized speech signals through a transfer function defined by filter coefficients to generate modified synthesized speech signals; and means for generating said filter coefficients on the basis of said second spectral information so as to ensure that formant characteristics of said modified synthesized speech signals are enhanced in accordance with said second spectral information and in comparison with formant characteristics of said synthesized speech signals; said spectral information being any one of line spectrum pairs (LSP) information, partial autocorrelation coefficients (PARCOR) information and log area ratio (LAR) information.
29. A speech modification method comprising:
first step of filtering synthesized speech signals through a transfer function defined by filter coefficients to generate modified synthesized speech signals; and second step of generating said filter coefficients on the basis of spectral information represented by a multi-dimensional vector and belonging to a predetermined domain and pertaining to said synthesized speech signals, so as to ensure that formant characteristics of said modified synthesized speech signals are enhanced in accordance with said spectral information and in comparison with formant characteristics of said synthesized speech signals;
said second step preceding the execution of said first step; said spectral information being any one of line spectrum pairs (LSP) information, partial autocorrelation coefficients (PARCOR) information and log area ratio (LAR) information.
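Claims 24 and 27 recite transforming the first spectral information into a different domain, and claims 25 and 28 recite analyzing the synthesized speech to obtain second spectral information. The claims do not spell out either operation. The Python sketch below shows one common way each could be done, assuming Levinson-Durbin analysis to obtain PARCOR coefficients and the log((1 + k) / (1 - k)) convention for LAR. The function names, the sign conventions, and the choice of PARCOR and LAR as the two domains are assumptions made for illustration only.

```python
import math


def autocorrelation(x, order):
    """Autocorrelation r[0..order] of the sample sequence x."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(order + 1)]


def levinson_parcor(r, order):
    """Levinson-Durbin recursion on autocorrelation values r[0..order].

    Returns (parcor, a) where a = [1, a1, ..., ap] describes
    A(z) = 1 + a1*z^-1 + ... + ap*z^-p and parcor[m-1] = a_m(m).
    This sign convention is an assumption; texts differ on it.
    """
    a = [1.0]
    err = r[0]
    parcor = []
    for m in range(1, order + 1):
        if err <= 0.0:
            # Degenerate (for example silent) frame: pad with zeros.
            parcor.append(0.0)
            a = a + [0.0]
            continue
        acc = sum(a[i] * r[m - i] for i in range(m))
        k = -acc / err
        parcor.append(k)
        ext = a + [0.0]
        a = [ext[i] + k * ext[m - i] for i in range(m + 1)]
        err *= 1.0 - k * k                   # remaining prediction error
    return parcor, a


def parcor_to_lar(parcor):
    """PARCOR -> log area ratios; requires |k| < 1 for every coefficient."""
    return [math.log((1.0 + k) / (1.0 - k)) for k in parcor]


def lar_to_parcor(lar):
    """Inverse of parcor_to_lar: k = tanh(LAR / 2)."""
    return [math.tanh(g / 2.0) for g in lar]
```

In this sketch, `levinson_parcor(autocorrelation(frame, p), p)` plays the role of the analysis step for a hypothetical frame of synthesized samples, and `parcor_to_lar` plays the role of the domain transformation. Other pairs of domains, such as LSP, would need different conversions.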
Specification