Speech compression coding with discrete cosine transformation of stochastic elements

US 5,943,644 A
Filed: 06/18/1997
Issued: 08/24/1999
Est. Priority Date: 06/21/1996
Status: Expired due to Fees

Abstract

A digital speech waveform is divided into frames and sub-frames. Spectrum envelope information, pitch elements and stochastic elements are extracted and coded for the frames and sub-frames. A second error signal is calculated as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements. The second error signal is coded so as to obtain the stochastic elements as a result of transforming the second error signal into a signal of a frequency domain through discrete cosine transformation and coding coefficients of the transformed domain.
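The pipeline summarized in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names (`split_frames`, `dct_ii`, `idct_ii`, `code_second_error`), the orthonormal DCT-II scaling, and the uniform quantizer with step size `step` are all hypothetical choices, and the pitch component speech is taken as an already-computed input.

```python
import math

def split_frames(samples, frame_len, subframe_len):
    """Divide samples into frames, each a list of sub-frames.
    A trailing partial frame is dropped (an assumption of this sketch)."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        frames.append([frame[i:i + subframe_len]
                       for i in range(0, frame_len, subframe_len)])
    return frames

def dct_ii(x):
    """Orthonormal DCT-II of a list of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct_ii(c):
    """Inverse of dct_ii (an orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def code_second_error(subframe, pitch_component, step=0.05):
    """Subtract the pitch component speech from the sub-frame, transform the
    resulting second error signal with the DCT, and quantize each coefficient
    to an integer index (uniform quantizer: hypothetical choice)."""
    second_error = [s - p for s, p in zip(subframe, pitch_component)]
    return [round(c / step) for c in dct_ii(second_error)]
```

A decoder would multiply the integer indices by `step` and apply `idct_ii` to recover an approximation of the second error signal.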

39 Claims

1. A speech compression coding method, comprising the steps of:
- a) dividing a digital speech waveform into frames and sub-frames; and
  
  b) extracting and coding spectrum envelope elements, pitch elements and stochastic element from the frames and sub-frames;
  
  wherein said step b) calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of transforming the second error signal into a signal of a frequency domain through a transformation and coding coefficients of the transformed domain.
- - 2. The speech compression coding method according to claim 1, wherein the transformation is a discrete cosine transformation.
  - 3. The speech compression coding method according to claim 1, wherein the transformation is a discrete Fourier transformation.
  - 4. The speech compression coding method according to claim 1, wherein the transformation is a K-L (Karhunen-Loeve) transformation.

5. A speech compression coding method, comprising the steps of:
- a) receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  b) coding the digital speech waveform in a predetermined coding method;
  
  c) storing the coded digital speech waveform;
  
  d) retrieving and decoding the stored coded digital speech waveform;
  
  e) converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said step b) comprises the steps of;
  
  b1) dividing the digital speech waveform into frames and sub-frames; and
  
  b2) extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said step d) comprises steps of;
  
  d1) decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  d2) generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  d3) generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said step b2) calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of transforming the second error signal into a signal of a frequency domain through a transformation and coding coefficients of the transformed domain.
- - 6. The speech compression coding method according to claim 5, wherein the transformation is a discrete cosine transformation.
  - 7. The speech compression coding method according to claim 5, wherein the transformation is a discrete Fourier transformation.
  - 8. The speech compression coding method according to claim 5, wherein the transformation is a K-L (Karhunen-Loeve) transformation.
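The decoding steps d2) and d3) of claim 5 can be illustrated with a short sketch. It assumes, as is common in linear-predictive coders but not stated in the claim, that the decoded spectrum envelope elements are linear prediction coefficients driving an all-pole synthesis filter. The names `build_excitation` and `synthesize` are hypothetical.

```python
def build_excitation(pitch_part, stochastic_part):
    """Step d2 (sketch): sum the decoded pitch contribution and the decoded
    stochastic contribution sample by sample to form the excitation vector."""
    return [p + s for p, s in zip(pitch_part, stochastic_part)]

def synthesize(excitation, lpc):
    """Step d3 (sketch): all-pole synthesis filter,
    s[n] = e[n] - sum_k lpc[k] * s[n - 1 - k].
    Treating the envelope elements as LPC coefficients is an assumption."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out
```

For example, `synthesize([1.0, 0.0, 0.0, 0.0], [-0.5])` produces the decaying impulse response of a one-pole filter.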

9. A speech compression coding method, comprising the steps of:
- a) receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  b) coding the digital speech waveform in a predetermined coding method;
  
  c) storing the coded digital speech waveform;
  
  d) retrieving and decoding the stored coded digital speech waveform;
  
  e) converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said step b) comprises the steps of;
  
  b1) dividing the digital speech waveform into frames and sub-frames; and
  
  b2) extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said step d) comprises steps of;
  
  d1) decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  d2) generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  d3) generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said step b2) calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of transforming the second error signal into a signal of a frequency domain, selecting a predetermined number N of frequencies, at which frequencies the signal transformed to the frequency domain has spectrum intensity levels from a maximum level through an Nth level, and codes the selected frequencies and the spectrum coefficients at the selected frequencies.
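The frequency selection recited in claim 9 can be sketched as below. The claim does not name the transform, so a DCT is assumed here; "spectrum intensity level" is interpreted as the absolute value of a coefficient, which is also an assumption. `select_top_n_frequencies` is a hypothetical name.

```python
import math

def dct_ii(x):
    """Orthonormal DCT-II (the transform choice is an assumption)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def select_top_n_frequencies(second_error, n):
    """Transform the second error signal and keep the n coefficients with the
    largest magnitude. Returns (frequency index, coefficient) pairs in
    ascending index order; these pairs are what would be coded."""
    coeffs = dct_ii(second_error)
    ranked = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    return [(k, coeffs[k]) for k in sorted(ranked[:n])]
```

Only the selected indices and their coefficients are kept, so the cost per sub-frame is fixed by N rather than by the sub-frame length.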

10. A speech compression coding method, comprising the steps of:
- a) receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  b) coding the digital speech waveform in a predetermined coding method;
  
  c) storing the coded digital speech waveform;
  
  d) retrieving and decoding the stored coded digital speech waveform;
  
  e) converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said step b) comprises the steps of;
  
  b1) dividing the digital speech waveform into frames and sub-frames; and
  
  b2) extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said step d) comprises steps of;
  
  d1) decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  d2) generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  d3) generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said step b2) calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of selecting a predetermined number N of samples, which have spectrum intensity levels from a maximum level through an Nth level, and codes the positions of the selected samples and the intensities of the samples.
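Claim 10 performs the selection on samples rather than on transform coefficients. A minimal sketch, assuming that "intensity" means absolute sample value and using the hypothetical names `select_top_n_samples` and `rebuild_sparse`:

```python
def select_top_n_samples(second_error, n):
    """Keep the n samples of the second error signal with the largest
    magnitude. Returns (position, intensity) pairs in ascending position
    order; these pairs are what would be coded."""
    ranked = sorted(range(len(second_error)),
                    key=lambda i: abs(second_error[i]), reverse=True)
    return [(i, second_error[i]) for i in sorted(ranked[:n])]

def rebuild_sparse(pairs, length):
    """Decoder side (sketch): place the coded intensities back at their
    positions and fill every other sample with zero."""
    out = [0.0] * length
    for pos, val in pairs:
        out[pos] = val
    return out
```

With `second_error = [0.1, -0.9, 0.2, 0.5, -0.05, 0.0]` and `n = 2`, the selected pairs are the samples at positions 1 and 3.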

11. A speech compression coding method, comprising the steps of:
- a) receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  b) coding the digital speech waveform in a predetermined coding method;
  
  c) storing the coded digital speech waveform;
  
  d) retrieving and decoding the stored coded digital speech waveform;
  
  e) converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said step b) comprises the steps of;
  
  b1) dividing the digital speech waveform into frames and sub-frames; and
  
  b2) extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said step d) comprises steps of;
  
  d1) decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  d2) generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  d3) generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said step b2) calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of selecting samples, which have intensity levels from a maximum level through an Nth level, and codes the positions of the selected samples and the intensities of the samples, and also, transforming the second error signal into a signal of a frequency domain, selecting N frequencies, at which frequencies the signal transformed to the frequency domain has spectrum intensity levels from a maximum level through an Nth level, and codes the selected frequencies and the spectrum coefficients at the selected frequencies.
- - 21. The speech compression coding device according to claim 11, wherein the transformation is a K-L (Karhunen-Loeve) transformation.

12. A speech compression coding method, comprising the steps of:
- a) receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  b) coding the digital speech waveform in a predetermined coding method;
  
  c) storing the coded digital speech waveform;
  
  d) retrieving and decoding the stored coded digital speech waveform;
  
  e) converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said step b) comprises the steps of;
  
  b1) dividing the digital speech waveform into frames and sub-frames; and
  
  b2) extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said step d) comprises steps of;
  
  d1) decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  d2) generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  d3) generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said step b2) calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of selecting a predetermined number of samples, which have intensity levels from a maximum level through an Nth level, and codes the positions of the selected samples and the intensities of the samples, and also, transforming the second error signal into a signal of a frequency domain, selecting a predetermined number N of frequencies, at which frequencies the signal transformed to the frequency domain has spectrum intensity levels from a maximum level through an Nth level, and codes the selected frequencies and the spectrum coefficients at the selected frequencies.

13. A speech compression coding method, comprising the steps of:
- a) receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  b) coding the digital speech waveform in a predetermined coding method;
  
  c) storing the coded digital speech waveform;
  
  d) retrieving and decoding the stored coded digital speech waveform;
  
  e) converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said step b) comprises the steps of;
  
  b1) dividing the digital speech waveform into frames and sub-frames; and
  
  b2) extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said step d) comprises steps of;
  
  d1) decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  d2) generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  d3) generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said step b2) calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of selecting samples, which have intensity levels from a maximum level through an Nth level, and codes the positions of the selected samples and the intensities of the samples, and also, transforming the second error signal into a signal of a frequency domain, selecting N frequencies, at which frequencies the signal transformed to the frequency domain has spectrum intensity levels from a maximum level through an Nth level, and codes the selected frequencies and the spectrum coefficients at the selected frequencies, and, selecting a predetermined number of sets of codes from among the obtained sets of the codes so that a resulting decoded speech has minimum distortion from the input speech.
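The final selection step of claim 13 can be sketched as a comparison between a time-domain code set and a frequency-domain code set. Several simplifications are assumed here and are not taken from the claim: the transform is a DCT, exactly one code set is selected, and distortion is measured as squared error on the second error signal itself rather than on fully decoded speech. All function names are hypothetical.

```python
import math

def dct_ii(x):
    """Orthonormal DCT-II (the transform choice is an assumption)."""
    n = len(x)
    return [(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
            * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]

def idct_ii(c):
    """Inverse of dct_ii."""
    n = len(c)
    return [c[0] * math.sqrt(1.0 / n)
            + sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                  for k in range(1, n))
            for i in range(n)]

def top_n(values, n):
    """(index, value) pairs of the n largest-magnitude entries."""
    ranked = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    return [(i, values[i]) for i in sorted(ranked[:n])]

def sparse(pairs, length):
    """Zero vector with the coded values restored at their indices."""
    out = [0.0] * length
    for i, v in pairs:
        out[i] = v
    return out

def choose_code_set(second_error, n):
    """Build both code sets, reconstruct each, and return the name and pairs
    of the one with the smaller squared error (simplified distortion)."""
    length = len(second_error)
    time_pairs = top_n(second_error, n)
    freq_pairs = top_n(dct_ii(second_error), n)
    candidates = {
        "time": (time_pairs, sparse(time_pairs, length)),
        "frequency": (freq_pairs, idct_ii(sparse(freq_pairs, length))),
    }
    def distortion(recon):
        return sum((a - b) ** 2 for a, b in zip(second_error, recon))
    best = min(candidates, key=lambda name: distortion(candidates[name][1]))
    return best, candidates[best][0]
```

An impulse-like signal is cheaper to represent by sample positions, while a tonal signal is cheaper to represent by a few frequencies, which is the motivation for keeping both candidates and choosing per sub-frame.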

14. A speech compression coding device, comprising:
- a frame dividing portion dividing a digital speech waveform into frames and sub-frames; and
  
  an extracting and coding portion extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  wherein;
  
  said extracting and coding portion calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of transforming the second error signal into a signal of a frequency domain through a transformation and coding coefficients of the transformed domain.
- - 15. The speech compression coding device according to claim 14, wherein the transformation is a discrete cosine transformation.
  - 16. The speech compression coding device according to claim 14, wherein the transformation is a discrete Fourier transformation.
  - 17. The speech compression coding device according to claim 14, wherein the transformation is a K-L (Karhunen-Loeve) transformation.

18. A speech compression coding device, comprising:
- an analog-to-digital converting portion receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  a speech coding portion coding the digital speech waveform in a predetermined coding method;
  
  a storage portion storing the coded digital speech waveform;
  
  a speech decoding portion retrieving and decoding the stored coded digital speech waveform;
  
  a digital-to-analog converting portion converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said speech coding portion comprises;
  
  a frame dividing portion dividing the digital speech waveform into frames and sub-frames; and
  
  an extracting and coding portion extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said speech decoding portion comprises;
  
  a decoding portion decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  an excitation vector signal generating portion generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  a synthetic speech generating portion generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said extracting and coding portion calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of transforming the second error signal into a signal of a frequency domain through a transformation and coding coefficients of the transformed domain.
- - 19. The speech compression coding device according to claim 18, wherein the transformation is a discrete cosine transformation.
  - 20. The speech compression coding device according to claim 18, wherein the transformation is a discrete Fourier transformation.

22. A speech compression coding device, comprising:
- an analog-to-digital converting portion receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  a speech coding portion coding the digital speech waveform in a predetermined coding method;
  
  a storage portion storing the coded digital speech waveform;
  
  a speech decoding portion retrieving and decoding the stored coded digital speech waveform;
  
  a digital-to-analog converting portion converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said speech coding portion comprises;
  
  a frame dividing portion dividing the digital speech waveform into frames and sub-frames; and
  
  an extracting and coding portion extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said speech decoding portion comprises;
  
  a decoding portion decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  an excitation vector signal generating portion generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  a synthetic speech generating portion generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said extracting and coding portion calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of transforming the second error signal into a signal of a frequency domain, selecting a predetermined number N of frequencies, at which frequencies the signal transformed to the frequency domain has spectrum intensity levels from a maximum level through an Nth level, and codes the selected frequencies and the spectrum coefficients at the selected frequencies.

23. A speech compression coding device, comprising:
- an analog-to-digital converting portion receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  a speech coding portion coding the digital speech waveform in a predetermined coding method;
  
  a storage portion storing the coded digital speech waveform;
  
  a speech decoding portion retrieving and decoding the stored coded digital speech waveform;
  
  a digital-to-analog converting portion converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said speech coding portion comprises;
  
  a frame dividing portion dividing the digital speech waveform into frames and sub-frames; and
  
  an extracting and coding portion extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said speech decoding portion comprises;
  
  a decoding portion decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  an excitation vector signal generating portion generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  a synthetic speech generating portion generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said extracting and coding portion calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of selecting a predetermined number of samples, which have intensity levels from a maximum level through an Nth level, and codes the positions of the selected samples and the intensities of the samples.

24. A speech compression coding device, comprising:
- an analog-to-digital converting portion receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  a speech coding portion coding the digital speech waveform in a predetermined coding method;
  
  a storage portion storing the coded digital speech waveform;
  
  a speech decoding portion retrieving and decoding the stored coded digital speech waveform;
  
  a digital-to-analog converting portion converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said speech coding portion comprises;
  
  a frame dividing portion dividing the digital speech waveform into frames and sub-frames; and
  
  an extracting and coding portion extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said speech decoding portion comprises;
  
  a decoding portion decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  an excitation vector signal generating portion generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  a synthetic speech generating portion generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said extracting and coding portion calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of selecting samples, which have intensity levels from a maximum level through an Nth level, and codes the positions of the selected samples and the intensities of the samples, and also, transforming the second error signal into a signal of a frequency domain, selecting N frequencies, at which frequencies the signal transformed to the frequency domain has spectrum intensity levels from a maximum level through an Nth level, and codes the selected frequencies and the spectrum coefficients at the selected frequencies.

25. A speech compression coding device, comprising:
- an analog-to-digital converting portion receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  a speech coding portion coding the digital speech waveform in a predetermined coding method;
  
  a storage portion storing the coded digital speech waveform;
  
  a speech decoding portion retrieving and decoding the stored coded digital speech waveform;
  
  a digital-to-analog converting portion converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said speech coding portion comprises;
  
  a frame dividing portion dividing the digital speech waveform into frames and sub-frames; and
  
  an extracting and coding portion extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said speech decoding portion comprises;
  
  a decoding portion decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  an excitation vector signal generating portion generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  a synthetic speech generating portion generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said extracting and coding portion calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of selecting a predetermined number of samples, which have intensity levels from a maximum level through an Nth level, and codes the positions of the selected samples and the intensities of the samples, and also, transforming the second error signal into a signal of a frequency domain, selecting a predetermined number N of frequencies, at which frequencies the signal transformed to the frequency domain has spectrum intensity levels from a maximum level through an Nth level, and codes the selected frequencies and the spectrum coefficients at the selected frequencies.

26. A speech compression coding device, comprising:
- an analog-to-digital converting portion receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  a speech coding portion coding the digital speech waveform in a predetermined coding method;
  
  a storage portion storing the coded digital speech waveform;
  
  a speech decoding portion retrieving and decoding the stored coded digital speech waveform;
  
  a digital-to-analog converting portion converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said speech coding portion comprises;
  
  a frame dividing portion dividing the digital speech waveform into frames and sub-frames; and
  
  an extracting and coding portion extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said speech decoding portion comprises;
  
  a decoding portion decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  an excitation vector signal generating portion generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  a synthetic speech generating portion generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said extracting and coding portion calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of selecting samples, which have intensity levels from a maximum level through an Nth level, and codes the positions of the selected samples and the intensities of the samples, and also, transforming the second error signal into a signal of a frequency domain, selecting N frequencies, at which frequencies the signal transformed to the frequency domain has spectrum intensity levels from a maximum level through an Nth level, and codes the selected frequencies and the spectrum coefficients at the selected frequencies, further, selecting a predetermined number of sets of codes from among the obtained sets of the codes so that a resulting decoded speech has minimum distortion from the input speech.

27. A computer program product for speech compression coding, comprising:
- program code means a) for dividing the digital speech waveform into frames and sub-frames; and
  
  program code means b) for extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  wherein;
  
  said program code means b) calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of transforming the second error signal into a signal of a frequency domain through a transformation and coding coefficients of the transformed domain.
- - 28. The computer program product for speech compression coding according to claim 27, wherein the transformation is a discrete cosine transformation.
  - 29. The computer program product for speech compression coding according to claim 27, wherein the transformation is a discrete Fourier transformation.
  - 30. The computer program product for speech compression coding according to claim 27, wherein the transformation is a K-L (Karhunen-Loeve) transformation.

31. A computer program product for speech compression coding, comprising:
- a computer usable medium having computer readable program code means embodied in said medium, said computer readable code means comprising;
  
  program code means a) for receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  program code means b) for coding the digital speech waveform in a predetermined coding method;
  
  program code means c) for storing the coded digital speech waveform;
  
  program code means d) for retrieving and decoding the stored coded digital speech waveform;
  
  program code means e) for converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said program code means b) comprises;
  
  program code means b1) for dividing the digital speech waveform into frames and sub-frames; and
  
  program code means b2) for extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said program code means d) comprises;
  
  program code means d1) for decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  program code means d2) for generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  program code means d3) for generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said program code means b2) calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of transforming the second error signal into a signal of a frequency domain through a transformation and coding coefficients of the transformed domain.
- View Dependent Claims (32, 33, 34)
- - 32. The computer program product for speech compression coding according to claim 31, wherein the transformation is a discrete cosine transformation.
  - 33. The computer program product for speech compression coding according to claim 31, wherein the transformation is a discrete Fourier transformation.
  - 34. The computer program product for speech compression coding according to claim 31, wherein the transformation is a K-L (Karhunen-Loeve) transformation.

35. A computer program product, for speech compression coding, comprising:
- a computer usable medium having computer readable program code means embodied in said medium, said computer program code means comprising;
  
  program code means a) for receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  program code means b) for coding the digital speech waveform in a predetermined coding method;
  
  program code means c) for storing the coded digital speech waveform;
  
  program code means d) for retrieving and decoding the stored coded digital speech waveform;
  
  program code means e) for converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said program code means b) comprises;
  
  program code means b1) for dividing the digital speech waveform into frames and sub-frames; and
  
  program code means b2) for extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said program code means d) comprises;
  
  program code means d1) for decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  program code means d2) for generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  program code means d3) for generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said program code means b2) calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of transforming the second error signal into a signal of a frequency domain, selecting a predetermined number N of frequencies, at which frequencies the signal transformed to the frequency domain has spectrum intensity levels from a maximum level through an Nth level, and codes the selected frequencies and the spectrum coefficients at the selected frequencies.
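
The frequency selection recited in claim 35 can be sketched as follows. This is a hedged illustration that assumes "spectrum intensity level" means the absolute value of each transformed coefficient and that the coefficients are already available as a list. The names `select_top_n` and `reconstruct_spectrum` are hypothetical.

```python
def select_top_n(coefficients, n):
    """Return (frequency index, coefficient) pairs for the n largest magnitudes.

    The pairs are returned in ascending index order; ties in magnitude keep
    the lower index first because Python's sort is stable.
    """
    if not 0 <= n <= len(coefficients):
        raise ValueError("n must be between 0 and the number of coefficients")
    ranked = sorted(range(len(coefficients)),
                    key=lambda i: abs(coefficients[i]),
                    reverse=True)
    return [(i, coefficients[i]) for i in sorted(ranked[:n])]


def reconstruct_spectrum(pairs, length):
    """Decoder side: place the kept coefficients and zero every other frequency."""
    spectrum = [0.0] * length
    for index, value in pairs:
        spectrum[index] = value
    return spectrum


# Example with N = 2 on a 5-point spectrum.
spectrum = [0.1, -3.0, 0.5, 2.0, -0.2]
kept = select_top_n(spectrum, 2)
approx = reconstruct_spectrum(kept, len(spectrum))
```

Only the N index/value pairs would need to be transmitted, so the bit cost is set by N rather than by the sub-frame length.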

36. A computer program product, for speech compression coding, comprising:
- a computer usable medium having computer readable program code means embodied in said medium, said computer program code means comprising;
  
  program code means a) for receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  program code means b) for coding the digital speech waveform in a predetermined coding method;
  
  program code means c) for storing the coded digital speech waveform;
  
  program code means d) for retrieving and decoding the stored coded digital speech waveform;
  
  program code means e) for converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said program code means b) comprises;
  
  program code means b1) for dividing the digital speech waveform into frames and sub-frames; and
  
  program code means b2) for extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said program code means d) comprises;
  
  program code means d1) for decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  program code means d2) for generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  program code means d3) for generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said program code means b2) calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of selecting a predetermined number of samples, which have spectrum intensity levels from a maximum level through an Nth level, and codes the positions of the selected samples and the intensities of the samples.
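
Claim 36 selects samples directly from the second error signal and codes their positions and intensities. The sketch below is one possible reading, under stated assumptions: intensity is taken as the absolute sample value, and intensities are coded with a uniform quantizer whose step size `step` is a hypothetical parameter not given in the claim. The names `encode_pulses` and `decode_pulses` are also hypothetical.

```python
def encode_pulses(error, n, step):
    """Keep the n largest-magnitude samples as (position, quantized level) pairs."""
    if step <= 0:
        raise ValueError("step must be positive")
    order = sorted(range(len(error)), key=lambda i: abs(error[i]), reverse=True)
    positions = sorted(order[:n])
    # Uniform quantization of each kept intensity (assumed coding scheme).
    return [(p, round(error[p] / step)) for p in positions]


def decode_pulses(pulses, length, step):
    """Rebuild a sparse error signal from positions and quantized levels."""
    out = [0.0] * length
    for position, level in pulses:
        out[position] = level * step
    return out


# Example: keep 2 samples of a 6-sample second error signal.
error = [0.1, 0.9, -0.05, -1.3, 0.2, 0.0]
pulses = encode_pulses(error, 2, 0.25)
decoded = decode_pulses(pulses, len(error), 0.25)
```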

37. A computer program product, for speech compression coding, comprising:
- a computer usable medium having computer readable program code means embodied in said medium, said computer program code means comprising;
  
  program code means a) for receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  program code means b) for coding the digital speech waveform in a predetermined coding method;
  
  program code means c) for storing the coded digital speech waveform;
  
  program code means d) for retrieving and decoding the stored coded digital speech waveform;
  
  program code means e) for converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said program code means b) comprises;
  
  program code means b1) for dividing the digital speech waveform into frames and sub-frames; and
  
  program code means b2) for extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said program code means d) comprises;
  
  program code means d1) for decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  program code means d2) for generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  program code means d3) for generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said program code means b2) calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of selecting samples, which have intensity levels from a maximum level through an Nth level, and codes the positions of the selected samples and the intensities of the samples, and also, transforming the second error signal into a signal of a frequency domain, selecting N frequencies, at which frequencies the signal transformed to the frequency domain has spectrum intensity levels from a maximum level through an Nth level, and codes the selected frequencies and the spectrum coefficients at the selected frequencies.

38. A computer program product, for speech compression coding, comprising:
- a computer usable medium having computer readable program code means embodied in said medium, said computer program code means comprising;
  
  program code means a) for receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  program code means b) for coding the digital speech waveform in a predetermined coding method;
  
  program code means c) for storing the coded digital speech waveform;
  
  program code means d) for retrieving and decoding the stored coded digital speech waveform;
  
  program code means e) for converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said program code means b) comprises;
  
  program code means b1) for dividing the digital speech waveform into frames and sub-frames; and
  
  program code means b2) for extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said program code means d) comprises;
  
  program code means d1) for decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  program code means d2) for generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  program code means d3) for generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said program code means b2) calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of selecting a predetermined number of samples, which have intensity levels from a maximum level through an Nth level, and codes the positions of the selected samples and the intensities of the samples, and also, transforming the second error signal into a signal of a frequency domain, selecting a predetermined number N of frequencies, at which frequencies the signal transformed to the frequency domain has spectrum intensity levels from a maximum level through an Nth level, and codes the selected frequencies and the spectrum coefficients at the selected frequencies.

39. A computer program product, for speech compression coding, comprising:
- a computer usable medium having computer readable program code means embodied in said medium, said computer program code means comprising;
  
  program code means a) for receiving an analog speech waveform and converting said analog speech waveform into a digital speech waveform;
  
  program code means b) for coding the digital speech waveform in a predetermined coding method;
  
  program code means c) for storing the coded digital speech waveform;
  
  program code means d) for retrieving and decoding the stored coded digital speech waveform;
  
  program code means e) for converting the decoded digital speech waveform into an analog speech waveform, wherein;
  
  said program code means b) comprises;
  
  program code means b1) for dividing the digital speech waveform into frames and sub-frames; and
  
  program code means b2) for extracting and coding spectrum envelope elements, pitch elements and stochastic elements for the frames and sub-frames;
  
  said program code means d) comprises;
  
  program code means d1) for decoding the coded spectrum envelope elements, pitch elements and stochastic elements;
  
  program code means d2) for generating an excitation vector signal from the decoded stochastic elements and pitch elements; and
  
  program code means d3) for generating synthetic speech from the excitation vector signal and the decoded spectrum envelope elements;
  
  wherein;
  
  said program code means b2) calculates a second error signal as a result of subtracting, from the sub-frames, pitch component speech generated from the pitch elements and spectrum envelope elements to result in said second error signal isolating the stochastic elements from the envelope elements and pitch elements;
  
  and codes the second error signal so as to obtain the stochastic elements as a result of selecting samples, which have intensity levels from a maximum level through an Nth level, and codes the positions of the selected samples and the intensities of the samples, and also, transforming the second error signal into a signal of a frequency domain, selecting N frequencies, at which frequencies the signal transformed to the frequency domain has spectrum intensity levels from a maximum level through an Nth level, and codes the selected frequencies and the spectrum coefficients at the selected frequencies, further, selecting a predetermined number of sets of codes from among the obtained sets of the codes so that a resulting decoded speech has minimum distortion from the input speech.
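
The final selection step of claim 39 chooses among candidate code sets according to the distortion of the decoded result. The sketch below is a simplification: it assumes each candidate has already been decoded to a signal of the same length as the reference, and it measures distortion as a plain sum of squared differences, whereas the claim speaks of distortion between decoded speech and input speech without naming a measure. The names `squared_error` and `pick_lowest_distortion` are hypothetical.

```python
def squared_error(reference, decoded):
    """Sum of squared differences between two equal-length signals."""
    if len(reference) != len(decoded):
        raise ValueError("signals must have equal length")
    return sum((r - d) ** 2 for r, d in zip(reference, decoded))


def pick_lowest_distortion(reference, candidates, keep=1):
    """Return the names of the `keep` candidates with the smallest distortion.

    `candidates` maps a label (for example a time-domain or frequency-domain
    code set) to the signal decoded from that code set.
    """
    ranked = sorted(candidates,
                    key=lambda name: squared_error(reference, candidates[name]))
    return ranked[:keep]


# Example: compare a time-domain and a frequency-domain candidate.
reference = [1.0, -2.0, 0.5]
candidates = {
    "time": [1.0, 0.0, 0.0],
    "freq": [0.9, -1.8, 0.4],
}
best = pick_lowest_distortion(reference, candidates)
```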

Current Assignee: Ricoh Company Limited
Original Assignee: Ricoh Company Limited
Inventors: Yamane, Jun; Uchiyama, Hiroki
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Chawan, Vijay B.
Application Number: US08/877,710
Time in Patent Office: 797 Days
Field of Search: 704/219, 704/233, 704/222, 704/223, 704/207, 704/229, 704/203, 704/217
US Class Current: 704/207
CPC Class Codes: G10L 19/0212 using orthogonal transforma...; G10L 25/27 characterised by the analys...
