Method and apparatus for speech data

US 7,912,711 B2
Filed: 09/21/2007
Issued: 03/22/2011
Est. Priority Date: 08/09/2000
Status: Expired due to Fees

Abstract

There is disclosed a speech processing device that extracts prediction taps from synthesized sound and uses them, along with preset tap coefficients, in preset predictive calculations to find prediction values of speech of high sound quality, that is, speech higher in sound quality than the synthesized sound. The synthesized sound is obtained by supplying linear prediction coefficients and residual signals, generated from a preset code, to a speech synthesis filter. The device includes a prediction tap extracting unit (45) for extracting, from the synthesized sound, the prediction taps used for predicting the target speech, namely the speech of high sound quality whose prediction values are to be found, and a class tap extraction unit (46) for extracting, from the code, class taps used for classifying the target speech into one of a plurality of classes. The device also includes a classification unit (47) for finding the class of the target speech based on the class taps, an acquisition unit for acquiring, from among the tap coefficients found by learning for each class, the tap coefficients associated with the class of the target speech, and a prediction unit (49) for finding the prediction values of the target speech using the prediction taps and the tap coefficients associated with that class.
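
The flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the tap window width, the way code indices are packed into a class number, and all function names (`extract_prediction_taps`, `classify`, `enhance`) are assumptions made for the example. The predictive calculation is assumed to be a plain weighted sum of the prediction taps.

```python
from typing import Dict, List, Sequence


def extract_prediction_taps(synth: Sequence[float], n: int, width: int = 2) -> List[float]:
    """Take samples of the synthesized sound around index n (edges are clamped)."""
    last = len(synth) - 1
    return [synth[min(max(n + k, 0), last)] for k in range(-width, width + 1)]


def classify(class_taps: Sequence[int], levels: int) -> int:
    """Hypothetical classifier: pack the code indices into a single class number."""
    cls = 0
    for tap in class_taps:
        cls = cls * levels + tap
    return cls


def enhance(
    synth: Sequence[float],
    codes_per_sample: Sequence[Sequence[int]],
    table: Dict[int, Sequence[float]],
    levels: int,
) -> List[float]:
    """Predict each target sample from its prediction taps and class coefficients."""
    out = []
    for n in range(len(synth)):
        cls = classify(codes_per_sample[n], levels)   # class taps come from the code
        coeffs = table[cls]                           # acquire coefficients for the class
        taps = extract_prediction_taps(synth, n)      # prediction taps from the synthesized sound
        out.append(sum(w * x for w, x in zip(coeffs, taps)))
    return out


# Toy data: class 0 passes the sample through, class 1 averages its neighbours.
table = {0: [0.0, 0.0, 1.0, 0.0, 0.0], 1: [0.0, 0.5, 0.0, 0.5, 0.0]}
result = enhance([1.0, 2.0, 3.0, 4.0], [[0], [1], [0], [1]], table, levels=2)
```

In a real system the coefficient table would come from the learning procedure rather than being written by hand.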

34 Claims

1. A data processing device for generating, from a preset code, filter data to be afforded to a speech synthesis filter adapted for synthesizing the speech based on linear prediction coefficients and a preset input signal, comprising:
- code decoding means for decoding said code produced by encoding original filter data, to output decoded filter data;
  
  acquisition means for acquiring preset tap coefficients as found by carrying out learning, wherein said tap coefficients are used to predict the original filter data from said decoded filter data; and
  
  prediction means for carrying out preset predictive calculations, using said tap coefficients and the decoded filter data, to find prediction values of said filter data, to send the so found prediction values to said speech synthesis filter for use as linear prediction coefficients in said speech synthesis filter.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
- - 2. The data processing device according to claim 1 wherein said prediction means carries out one-dimensional linear predictive calculations to find prediction values of said filter data.
  - 3. The data processing device according to claim 1 wherein said acquisition means acquires said tap coefficients from storage means holding said tap coefficients.
  - 4. The data processing device according to claim 1 further comprising:
    - prediction tap extraction means for extracting prediction taps from said decoded filter data, said prediction taps being usable along with said tap coefficients for predicting said filter data, the prediction values of which are to be found, said prediction means carrying out predictive calculations using said prediction taps and tap coefficients.
  - 5. The data processing device according to claim 4 further comprising:
    - class tap extraction means for extracting class taps from said decoded filter data, said class taps being used for sorting said decoded filter data to one of a plurality of classes, by way of classification, and classification means for finding the class for said decoded filter data, based on said class taps;
      
      said prediction means carrying out predictive calculations using said prediction taps and said tap coefficients associated with the class of said filter data.
  - 6. The data processing device according to claim 4 further comprising:
    - class tap extraction means for extracting class taps from said code, said class taps being used for sorting said decoded filter data to one of a plurality of classes, by way of classification, and classification means for finding the class for said decoded filter data, based on said class taps;
      
      said prediction means carrying out predictive calculations using said prediction taps and said tap coefficients associated with the class of said decoded filter data.
  - 7. The data processing device according to claim 6 wherein said class tap extraction means extracts said class taps from both said code and said decoded filter data.
  - 8. The data processing device according to claim 1 wherein said tap coefficients have been obtained on carrying out learning so that prediction errors of predicted values of said filter data obtained on carrying out preset predictive calculations employing said tap coefficients and said decoded filter data will be statistically minimum.
  - 9. The data processing device according to claim 1 wherein said filter data is at least one or both of said preset input signal and said linear prediction coefficients.
  - 10. The data processing device according to claim 1 further comprising:
    - said speech synthesis filter.
  - 11. The data processing device according to claim 1 wherein said code is obtained on encoding speech in accordance with a CELP (Code Excited Linear Prediction Coding) system.

12. A data processing method for generating, from a preset code, filter data to be afforded to a speech synthesis filter adapted for synthesizing the speech based on linear prediction coefficients and on a preset input signal, comprising:
- a code decoding step of decoding said code to output decoded filter data;
  
  an acquisition step of acquiring preset tap coefficients as found by carrying out learning, wherein said preset tap coefficients are used to predict the original filter data from said decoded filter data; and
  
  a prediction step of carrying out preset predictive calculations, using said tap coefficients and the decoded filter data, to find prediction values of said filter data, to send the so found prediction values to said speech synthesis filter for use as linear prediction coefficients in said speech synthesis filter.
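
The three steps of the method (decode, acquire, predict) and the hand-off to the speech synthesis filter might be sketched as below. Everything named here is hypothetical: `CODEBOOK` stands in for whatever decoder the code actually requires, the tap coefficients are given as one row per output coefficient, and the filter uses the sign convention s[n] = e[n] + Σ a_k·s[n−k], which is an assumption rather than something the claims specify.

```python
from typing import List, Sequence

# Hypothetical two-entry codebook standing in for the real code decoder.
CODEBOOK = {0: [0.5, -0.25], 1: [0.9, -0.4]}


def decode(code: int) -> List[float]:
    """Code decoding step: map a code to decoded filter data."""
    return list(CODEBOOK[code])


def predict_filter_data(
    decoded: Sequence[float], tap_coeffs: Sequence[Sequence[float]]
) -> List[float]:
    """Prediction step: each output is a weighted sum of the decoded filter data."""
    return [sum(w * x for w, x in zip(row, decoded)) for row in tap_coeffs]


def synthesize(lpc: Sequence[float], residual: Sequence[float]) -> List[float]:
    """All-pole synthesis filter driven by a residual (input) signal."""
    out: List[float] = []
    for n, e in enumerate(residual):
        acc = e
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:          # only use samples already synthesized
                acc += a * out[n - 1 - k]
        out.append(acc)
    return out


# Acquisition step: here the "learned" coefficients are an identity mapping.
tap_coeffs = [[1.0, 0.0], [0.0, 1.0]]
lpc = predict_filter_data(decode(0), tap_coeffs)
speech = synthesize(lpc, [1.0, 0.0, 0.0, 0.0])
```

With identity tap coefficients the predicted values equal the decoded ones; learned coefficients would instead pull the decoded values toward the original filter data.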

13. A non-transitory computer-readable record medium storing a program that when executed on a computer causes controlling a processor to implement a method for generating, from a preset code, filter data to be afforded to a speech synthesis filter adapted for synthesizing the speech based on linear prediction coefficients and a preset input signal, said program comprising:
- a code decoding step of decoding said code to output decoded filter data;
  
  an acquisition step of acquiring preset tap coefficients as found by carrying out learning, wherein said preset tap coefficients are used to predict the original filter data from said decoded filter data; and
  
  a prediction step of carrying out preset predictive calculations, using said tap coefficients and the decoded filter data, to find prediction values of said filter data, to send the so found prediction values to said speech synthesis filter for use as linear prediction coefficients in said speech synthesis filter.

14. A learning device for learning preset tap coefficients usable for finding, by predictive calculations from a code associated with filter data to be applied to a speech synthesis filter which synthesizes the speech based on linear prediction coefficients and a preset input signal, prediction values of said filter data, comprising:
- code decoding means for decoding the code corresponding to filter data to output decoded filter data; and
  
  learning means for carrying out learning so that prediction errors of prediction values of said filter data obtained on carrying out predictive calculations using said tap coefficients and decoded filter data will be statistically smallest to find said tap coefficients, wherein said tap coefficients are used to predict the original filter data from said decoded filter data.
- Dependent Claims (15, 16, 17, 18, 19, 20, 21)
- - 15. The learning device according to claim 14 wherein said learning means performs the learning so that the prediction errors of the prediction values of said filter data obtained on carrying out one-dimensional linear predictive calculations using said tap coefficients and the decoded filter data will be statistically smallest.
  - 16. The learning device according to claim 14 further comprising:
    - predictive tap extraction means for extracting from said decoded filter data prediction taps used along with said tap coefficients for predicting said filter data;
      
      said learning means effecting learning so that the prediction errors of prediction values of said filter data obtained on carrying out predictive calculations using said prediction taps and tap coefficients will be statistically smallest.
  - 17. The learning device according to claim 16 further comprising:
    - class tap extraction means for extracting a class tap from said decoded filter data, said class tap being used for sorting said filter data to one of a plurality of classes, by way of classification, and classification means for finding the class for said filter data based on said class tap;
      
      said learning means performing learning so that the prediction errors of prediction values of said filter data obtained on carrying out predictive calculations using said prediction taps and said tap coefficients associated with the class of said filter data will be statistically smallest.
  - 18. The learning device according to claim 16 further comprising:
    - class tap extraction means for extracting a class tap from said code, said class tap being used for sorting said filter data to one of a plurality of classes, by way of classification, and classification means for finding the class for said filter data based on said class tap;
      
      said learning means performing learning so that the prediction errors of prediction values of said filter data obtained on carrying out predictive calculations using said prediction taps and tap coefficients will be statistically smallest.
  - 19. The learning device according to claim 18 wherein said class tap extraction means extracts said class tap from both said code and said decoded filter data.
  - 20. The learning device according to claim 14 wherein said filter data is at least one or both of said preset input signal and said linear prediction coefficients.
  - 21. The learning device according to claim 14 wherein said code is obtained on encoding speech in accordance with a CELP (Code Excited Linear Prediction Coding) system.
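
One way to read the learning described in claims 14 to 21 is as a per-class least-squares fit. The sketch below assumes that reading, with a linear predictor; the sample layout `(class, prediction taps, original value)` and the names `solve` and `learn` are illustrative assumptions, not taken from the patent.

```python
from typing import Dict, List, Sequence, Tuple


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; more training data is needed")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def learn(samples: Sequence[Tuple[int, Sequence[float], float]]) -> Dict[int, List[float]]:
    """Accumulate normal equations per class, then solve for the tap coefficients."""
    xtx: Dict[int, List[List[float]]] = {}
    xty: Dict[int, List[float]] = {}
    for cls, taps, target in samples:
        n = len(taps)
        if cls not in xtx:
            xtx[cls] = [[0.0] * n for _ in range(n)]
            xty[cls] = [0.0] * n
        for i in range(n):
            xty[cls][i] += taps[i] * target          # taps (decoded) times teacher value
            for j in range(n):
                xtx[cls][i][j] += taps[i] * taps[j]  # tap cross-products
    return {cls: solve(xtx[cls], xty[cls]) for cls in xtx}


# Toy training set: class 0 follows 2*x0 + 3*x1, class 1 follows -x0.
samples = [
    (0, [1.0, 0.0], 2.0), (0, [0.0, 1.0], 3.0), (0, [1.0, 1.0], 5.0),
    (1, [1.0, 0.0], -1.0), (1, [0.0, 1.0], 0.0),
]
coeffs = learn(samples)
```

Each class needs at least as many linearly independent training samples as it has taps; otherwise `solve` raises an error.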

22. A learning method for learning preset tap coefficients usable for finding, by predictive calculations from a code associated with filter data to be applied to a speech synthesis filter which synthesizes the speech based on linear prediction coefficients and a preset input signal, prediction values of said filter data, comprising:
- a code decoding step of decoding the code corresponding to filter data to output decoded filter data; and
  
  a learning step of carrying out learning so that prediction errors of prediction values of said filter data obtained on carrying out predictive calculations using said tap coefficients and decoded filter data will be statistically smallest to find said tap coefficients, wherein said tap coefficients are used to predict the original filter data from said decoded filter data.

23. A non-transitory computer-readable record medium storing a program that when executed on a computer causes controlling a processor to implement a method for having a computer execute learning processing of learning preset tap coefficients usable for finding, by predictive calculations from a code associated with filter data to be applied to a speech synthesis filter which synthesizes the speech based on linear prediction coefficients and a preset input signal, prediction values of said filter data, said program comprising:
- a code decoding step of decoding the code corresponding to filter data to output decoded filter data; and
  
  a learning step of carrying out learning so that prediction errors of prediction values of said filter data obtained on carrying out predictive calculations using said tap coefficients and decoded filter data will be statistically smallest to find said tap coefficients, wherein said tap coefficients are used to predict the original filter data from said decoded filter data.
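
If "statistically smallest" is taken to mean minimum squared error and the predictive calculation is taken to be a weighted sum of N taps (both are assumptions; the claims do not fix the criterion or the symbols used here), the learning step reduces to solving a set of normal equations. With $x_{k,i}$ the $i$-th tap taken from the decoded filter data for training sample $k$, $y_k$ the corresponding original filter data, and $w_i$ the tap coefficients:

```latex
E = \sum_{k=1}^{K} \Bigl( y_k - \sum_{i=1}^{N} w_i \, x_{k,i} \Bigr)^{2},
\qquad
\frac{\partial E}{\partial w_j} = 0
\;\Longrightarrow\;
\sum_{i=1}^{N} \Bigl( \sum_{k=1}^{K} x_{k,j} \, x_{k,i} \Bigr) w_i
= \sum_{k=1}^{K} x_{k,j} \, y_k ,
\qquad j = 1, \dots, N .
```

Solving this linear system, once per class when classification is used, yields the coefficients for which the summed squared prediction error is smallest.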

24. A data processing device for generating, from a preset code, filter data to be afforded to a speech synthesis filter adapted for synthesizing the speech based on linear prediction coefficients and a preset input signal, comprising:
- a decoder configured to decode said code produced by encoding original filter data, to output decoded filter data;
  
  an acquisition unit configured to acquire preset tap coefficients as found by carrying out learning, wherein said preset tap coefficients are used to predict the original filter data from said decoded filter data; and
  
  a predictor configured to carry out preset predictive calculations, using said tap coefficients and the decoded filter data, to find prediction values of said filter data, to send the so found prediction values to said speech synthesis filter for use as linear prediction coefficients in said speech synthesis filter.
- Dependent Claims (25, 26, 27, 28, 29, 30, 31, 32, 33, 34)
- - 25. The data processing device according to claim 24 wherein said predictor carries out one-dimensional linear predictive calculations to find prediction values of said filter data.
  - 26. The data processing device according to claim 24 wherein said acquisition unit acquires said tap coefficients from a store holding said tap coefficients.
  - 27. The data processing device according to claim 24 further comprising:
    - a prediction tap extractor configured to extract prediction taps from said decoded filter data, said prediction taps being usable along with said tap coefficients for predicting said filter data, the prediction values of which are to be found, said predictor carrying out predictive calculations using said prediction taps and tap coefficients.
  - 28. The data processing device according to claim 27 further comprising:
    - a class tap extractor configured to extract class taps from said decoded filter data, said class taps being used for sorting said decoded filter data to one of a plurality of classes, by way of classification, and a classifier configured to find the class for said decoded filter data, based on said class taps;
      
      said predictor carrying out predictive calculations using said prediction taps and said tap coefficients associated with the class of said filter data.
  - 29. The data processing device according to claim 27 further comprising:
    - a class tap extractor configured to extract class taps from said code, said class taps being used for sorting said decoded filter data to one of a plurality of classes, by way of classification, and a classifier for finding the class for said decoded filter data, based on said class taps;
      
      said predictor carrying out predictive calculations using said prediction taps and said tap coefficients associated with the class of said decoded filter data.
  - 30. The data processing device according to claim 29 wherein said class tap extractor extracts said class taps from both said code and said decoded filter data.
  - 31. The data processing device according to claim 24 wherein said tap coefficients have been obtained on carrying out learning so that prediction errors of predicted values of said filter data obtained on carrying out preset predictive calculations employing said tap coefficients and said decoded filter data will be statistically minimum.
  - 32. The data processing device according to claim 24 wherein said filter data is at least one or both of said preset input signal and said linear prediction coefficients.
  - 33. The data processing device according to claim 24 further comprising:
    - a speech synthesis filter.
  - 34. The data processing device according to claim 24 wherein said code is obtained on encoding speech in accordance with a CELP (Code Excited Linear Prediction Coding) system.

Current Assignee: Sony Corporation (Sony Group Corp.)
Original Assignee: Sony Corporation (Sony Group Corp.)
Inventors: Fujimori, Yasuhiro; Kimura, Hiroto; Kondo, Tetsujiro; Hattori, Masaaki; Watanabe, Tsutomu
Primary Examiner(s): Chawan, Vijay B
Application Number: US11/903,550
Publication Number: US 20080027720A1
Time in Patent Office: 1,278 Days
Field of Search: 704/219, 704/223, 704/207, 704/262, 704/263, 704/264, 704/265, 704/222, 704/229, 704/220, 704/230
US Class Current: 704/219
CPC Class Codes: G10L 19/26 Pre-filtering or post-filte...; G10L 21/038 using band spreading techni...