Method and apparatus for speech data
Abstract
A speech processing device is disclosed that finds prediction values of speech of high sound quality, that is, speech higher in sound quality than the synthesized sound. Prediction taps are extracted from the synthesized sound obtained by supplying linear prediction coefficients and residual signals, generated from a preset code, to a speech synthesis filter, and the prediction taps are used along with preset tap coefficients in preset predictive calculations to find the prediction values of the speech of high sound quality. The device includes a prediction tap extracting unit (45) for extracting, from the synthesized sound, the prediction taps used for predicting the speech of high sound quality, treated as the target speech whose prediction values are to be found, and a class tap extraction unit (46) for extracting, from the code, class taps used for classifying the target speech into one of a plurality of classes. The device also includes a classification unit (47) for finding the class of the target speech based on the class taps, an acquisition unit for acquiring the tap coefficients associated with the class of the target speech from among the tap coefficients found by learning for each class, and a prediction unit (49) for finding the prediction values of the target speech using the prediction taps and the tap coefficients associated with the class of the target speech.
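The flow described in the abstract (extract prediction taps, classify from class taps, look up per-class tap coefficients, predict) can be illustrated with a minimal Python sketch. The abstract only speaks of "preset predictive calculations", so the linear weighted sum used here is an assumption, as are the bit-packing classification scheme, the function names `classify` and `predict_sample`, and every numeric value in the coefficient table.

```python
from typing import Dict, List, Sequence


def classify(class_taps: Sequence[int], bits_per_tap: int = 1) -> int:
    """Pack the class taps into a single class index.

    Hypothetical scheme: each class tap contributes `bits_per_tap` bits.
    """
    mask = (1 << bits_per_tap) - 1
    index = 0
    for tap in class_taps:
        index = (index << bits_per_tap) | (tap & mask)
    return index


def predict_sample(
    prediction_taps: Sequence[float],
    class_taps: Sequence[int],
    coefficient_table: Dict[int, List[float]],
) -> float:
    """Predict one target-speech value from taps of the synthesized sound.

    Assumes the predictive calculation is a linear weighted sum of the
    prediction taps, using the tap coefficients of the selected class.
    """
    coefficients = coefficient_table[classify(class_taps)]
    if len(coefficients) != len(prediction_taps):
        raise ValueError("tap count does not match coefficient count")
    return sum(w * x for w, x in zip(coefficients, prediction_taps))


# Illustrative per-class tap coefficients (made-up values, not learned).
coefficient_table = {
    0b00: [0.25, 0.5, 0.25],
    0b01: [0.0, 1.0, 0.0],
    0b10: [0.5, 0.0, 0.5],
    0b11: [-0.25, 1.5, -0.25],
}

synthesized_taps = [1.0, 2.0, 4.0]  # samples taken from the synthesized sound
class_taps = [1, 0]                 # bits taken from the code
estimate = predict_sample(synthesized_taps, class_taps, coefficient_table)
```

Keeping a separate coefficient set per class lets the predictor adapt to the local character of the signal while each individual prediction stays a cheap weighted sum.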
34 Claims
1. A data processing device for generating, from a preset code, filter data to be afforded to a speech synthesis filter adapted for synthesizing the speech based on linear prediction coefficients and a preset input signal, comprising:
code decoding means for decoding said code produced by encoding original filter data, to output decoded filter data;
acquisition means for acquiring preset tap coefficients as found by carrying out learning, wherein said tap coefficients are used to predict the original filter data from said decoded filter data; and
prediction means for carrying out preset predictive calculations, using said tap coefficients and the decoded filter data, to find prediction values of said filter data, to send the so found prediction values to said speech synthesis filter for use as linear prediction coefficients in said speech synthesis filter.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
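As a rough illustration of the three elements of claim 1, the following Python sketch decodes a code into filter data, refines the decoded data with tap coefficients, and passes the result to an all-pole synthesis filter. It is a sketch under stated assumptions, not the patented implementation: the codebook lookup, the linear form of the predictive calculation, the sign convention of the filter, and all names and numeric values are hypothetical.

```python
from typing import Dict, List, Sequence


def decode_filter_data(code: int, codebook: Dict[int, List[float]]) -> List[float]:
    """Code decoding: map a code to decoded filter data (assumed codebook lookup)."""
    return list(codebook[code])


def predict_filter_data(
    decoded: Sequence[float], tap_coefficients: Sequence[Sequence[float]]
) -> List[float]:
    """Prediction: each output coefficient is a weighted sum of the decoded ones.

    The linear form is an assumption; the claim says only "preset predictive
    calculations".
    """
    return [sum(w * x for w, x in zip(row, decoded)) for row in tap_coefficients]


def synthesize(lpc: Sequence[float], excitation: Sequence[float]) -> List[float]:
    """All-pole synthesis filter: s[n] = e[n] - sum_p a_p * s[n - p].

    The sign convention is one common choice and is an assumption here.
    """
    output: List[float] = []
    for n, e in enumerate(excitation):
        acc = e
        for p, a in enumerate(lpc, start=1):
            if n - p >= 0:
                acc -= a * output[n - p]
        output.append(acc)
    return output


# Made-up values for illustration only.
codebook = {0: [-0.5, 0.25]}
tap_coefficients = [[1.0, 0.0], [0.0, 0.5]]  # stands in for learned coefficients

decoded = decode_filter_data(0, codebook)
predicted = predict_filter_data(decoded, tap_coefficients)
speech = synthesize(predicted, excitation=[1.0, 0.0, 0.0])
```

The point of the middle step is that the synthesis filter receives the predicted coefficients rather than the raw decoded ones, so quantization error introduced by encoding can be partly undone before synthesis.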
12. A data processing method for generating, from a preset code, filter data to be afforded to a speech synthesis filter adapted for synthesizing the speech based on linear prediction coefficients and on a preset input signal, comprising:
a code decoding step of decoding said code to output decoded filter data;
an acquisition step of acquiring preset tap coefficients as found by carrying out learning, wherein said preset tap coefficients are used to predict the original filter data from said decoded filter data; and
a prediction step of carrying out preset predictive calculations, using said tap coefficients and the decoded filter data, to find prediction values of said filter data, to send the so found prediction values to said speech synthesis filter for use as linear prediction coefficients in said speech synthesis filter.
13. A non-transitory computer-readable record medium storing a program that when executed on a computer causes controlling a processor to implement a method for generating, from a preset code, filter data to be afforded to a speech synthesis filter adapted for synthesizing the speech based on linear prediction coefficients and a preset input signal, said program comprising:
a code decoding step of decoding said code to output decoded filter data;
an acquisition step of acquiring preset tap coefficients as found by carrying out learning, wherein said preset tap coefficients are used to predict the original filter data from said decoded filter data; and
a prediction step of carrying out preset predictive calculations, using said tap coefficients and the decoded filter data, to find prediction values of said filter data, to send the so found prediction values to said speech synthesis filter for use as linear prediction coefficients in said speech synthesis filter.
14. A learning device for learning preset tap coefficients usable for finding, by predictive calculations from a code associated with filter data to be applied to a speech synthesis filter which synthesizes the speech based on linear prediction coefficients and a preset input signal, prediction values of said filter data, comprising:
code decoding means for decoding the code corresponding to filter data to output decoded filter data; and
learning means for carrying out learning so that prediction errors of prediction values of said filter data obtained on carrying out predictive calculations using said tap coefficients and decoded filter data will be statistically smallest to find said tap coefficients, wherein said tap coefficients are used to predict the original filter data from said decoded filter data.
Dependent claims: 15, 16, 17, 18, 19, 20, 21
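One way to read "statistically smallest" prediction errors in claim 14 is as a least-squares fit, and the Python sketch below takes that reading as an assumption. It solves the normal equations for the tap coefficients that best map decoded filter data onto one coefficient of the original filter data. The names `students` and `teachers`, the helper `solve_linear_system`, and the training values are hypothetical.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular; more training data needed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def learn_tap_coefficients(
    students: Sequence[Sequence[float]], teachers: Sequence[float]
) -> List[float]:
    """Least-squares tap coefficients for one target filter coefficient.

    students: decoded filter data vectors (inputs to the prediction).
    teachers: matching original values the prediction should reproduce.
    Minimizes the sum of squared prediction errors via the normal equations.
    """
    n = len(students[0])
    xtx = [[sum(s[i] * s[j] for s in students) for j in range(n)] for i in range(n)]
    xty = [sum(s[i] * t for s, t in zip(students, teachers)) for i in range(n)]
    return solve_linear_system(xtx, xty)


# Made-up training pairs generated from the weights [2.0, -1.0].
students = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
teachers = [2.0, -1.0, 1.0]
learned = learn_tap_coefficients(students, teachers)
```

In this toy case the training pairs are noise-free, so the fit recovers the generating weights up to rounding. With real decoded data the fit would instead return the coefficients that minimize the squared error on average.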
22. A learning method for learning preset tap coefficients usable for finding, by predictive calculations from a code associated with filter data to be applied to a speech synthesis filter which synthesizes the speech based on linear prediction coefficients and a preset input signal, prediction values of said filter data, comprising:
a code decoding step of decoding the code corresponding to filter data to output decoded filter data; and
a learning step of carrying out learning so that prediction errors of prediction values of said filter data obtained on carrying out predictive calculations using said tap coefficients and decoded filter data will be statistically smallest to find said tap coefficients, wherein said tap coefficients are used to predict the original filter data from said decoded filter data.
23. A non-transitory computer-readable record medium storing a program that when executed on a computer causes controlling a processor to implement a method for having a computer execute learning processing of learning preset tap coefficients usable for finding, by predictive calculations from a code associated with filter data to be applied to a speech synthesis filter which synthesizes the speech based on linear prediction coefficients and a preset input signal, prediction values of said filter data, said program comprising:
a code decoding step of decoding the code corresponding to filter data to output decoded filter data; and
a learning step of carrying out learning so that prediction errors of prediction values of said filter data obtained on carrying out predictive calculations using said tap coefficients and decoded filter data will be statistically smallest to find said tap coefficients, wherein said tap coefficients are used to predict the original filter data from said decoded filter data.
24. A data processing device for generating, from a preset code, filter data to be afforded to a speech synthesis filter adapted for synthesizing the speech based on linear prediction coefficients and a preset input signal, comprising:
a decoder configured to decode said code produced by encoding original filter data, to output decoded filter data;
an acquisition unit configured to acquire preset tap coefficients as found by carrying out learning, wherein said preset tap coefficients are used to predict the original filter data from said decoded filter data; and
a predictor configured to carry out preset predictive calculations, using said tap coefficients and the decoded filter data, to find prediction values of said filter data, to send the so found prediction values to said speech synthesis filter for use as linear prediction coefficients in said speech synthesis filter.
Dependent claims: 25, 26, 27, 28, 29, 30, 31, 32, 33, 34
Specification