Methods for speech quantization and error correction

US 5,226,084 A
Filed: 12/05/1990
Issued: 07/06/1993
Est. Priority Date: 12/05/1990
Status: Expired due to Term

Abstract

The redundancy contained within the spectral amplitudes is reduced, and as a result the quantization of the spectral amplitudes is improved. The prediction of the spectral amplitudes of the current segment from the spectral amplitudes of the previous segment is adjusted to account for any change in the fundamental frequency between the two segments. The spectral amplitude prediction residuals are divided into a fixed number of blocks, each containing approximately the same number of elements. A prediction residual block average (PRBA) vector is formed; each element of the PRBA vector is equal to the average of the prediction residuals within one of the blocks. The PRBA vector is either vector quantized, or transformed with a Discrete Cosine Transform (DCT) and scalar quantized. The perceived effect of bit errors is reduced by smoothing the voiced/unvoiced decisions. An estimate of the error rate is made by locally averaging the number of correctable bit errors within each segment. If the estimate of the error rate is greater than a threshold, then high energy spectral amplitudes are declared voiced.

67 Citations

25 Claims

1. A method of encoding a speech signal, the method comprising the steps:
- breaking said signal into segments, each of said segments representing one of a succession of time intervals and having a spectrum of frequencies;
  
  for each said segment, sampling said spectrum at a set of frequencies, thereby forming a set of actual spectral amplitudes, wherein the frequencies at which said spectra are sampled generally differ from one segment to the next;
  
  producing predicted spectral amplitudes for said current segment based on the spectral amplitudes for at least one previous segment, wherein said predicted spectral amplitudes for said current segment are based at least in part on interpolating the spectral amplitudes of a previous segment to estimate the spectral amplitudes in the previous segment at the frequencies of said current segment;
  
  producing prediction residuals based on a difference between the actual spectral amplitudes for said current segment and the predicted spectral amplitudes for said current segment; and
  
  producing an encoded speech signal based on said prediction residuals.
- Dependent claims (2, 17, 18, 19, 20, 21):
  - 2. The method of claim 1 wherein said interpolating of spectral amplitudes is performed using linear interpolation.
  - 17. The method of claim 1, 3, or 5 wherein said difference between the actual spectral amplitudes for said current segment and the predicted spectral amplitudes for said current segment is formed by subtracting a fraction of the predicted spectral amplitudes from the actual spectral amplitudes.
  - 18. The method of claim 1, 3 or 5 wherein the spectral amplitudes are obtained using a Multiband Excitation speech model.
  - 19. The method of claim 1, 3 or 5 wherein the spectral amplitudes are obtained using sinusoidal transform coding.
  - 20. The method of claim 1, 3 or 5 wherein only spectral amplitudes from the most recent previous segment are used in forming the predicted spectral amplitudes of said current segment.
  - 21. The method of claim 1, 3 or 5 wherein said spectrum comprises a fundamental frequency and said set of frequencies for a given segment are multiples of said fundamental frequency.
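
The prediction step of claims 1, 2, 17 and 21 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the edge handling (holding the first and last amplitude outside the range of previous harmonics) and the fraction `rho` are assumptions, since the claims do not fix them.

```python
from bisect import bisect_right


def interpolate_previous(prev_amps, prev_f0, cur_f0, num_cur):
    """Estimate the previous segment's amplitudes at the current harmonics.

    prev_amps[k] is the amplitude sampled at (k + 1) * prev_f0 (claim 21).
    Linear interpolation is used between neighbouring samples (claim 2).
    Assumption: outside the sampled range the nearest amplitude is held.
    """
    freqs = [(k + 1) * prev_f0 for k in range(len(prev_amps))]
    predicted = []
    for h in range(1, num_cur + 1):
        f = h * cur_f0
        if f <= freqs[0]:
            predicted.append(prev_amps[0])
        elif f >= freqs[-1]:
            predicted.append(prev_amps[-1])
        else:
            i = bisect_right(freqs, f) - 1  # freqs[i] <= f < freqs[i + 1]
            t = (f - freqs[i]) / (freqs[i + 1] - freqs[i])
            predicted.append(prev_amps[i] + t * (prev_amps[i + 1] - prev_amps[i]))
    return predicted


def prediction_residuals(cur_amps, predicted, rho=0.7):
    """Subtract a fraction rho of the prediction (claim 17); rho is assumed."""
    return [a - rho * p for a, p in zip(cur_amps, predicted)]
```

For example, with previous amplitudes `[1.0, 3.0, 5.0]` at a 100 Hz fundamental and a current fundamental of 150 Hz, the first current harmonic falls halfway between the first two previous samples, so its predicted amplitude is their midpoint.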

3. A method of encoding a speech signal, the method comprising the steps:
- breaking said signal into segments, each of said segments representing one of a succession of time intervals and having a spectrum of frequencies;
  
  for each said segment, sampling said spectrum at a set of frequencies, thereby forming a set of spectral amplitudes, wherein the frequencies at which said spectra are sampled generally differ from one segment to the next;
  
  producing predicted spectral amplitudes for said current segment based on the spectral amplitudes for at least one previous segment;
  
  producing prediction residuals based on a difference between the actual spectral amplitudes for said current segment and the predicted spectral amplitudes for said current segment;
  
  grouping said prediction residuals into a predetermined number of blocks, the number of blocks being independent of the number of said prediction residuals grouped into particular blocks; and
  
  producing an encoded speech signal based on said blocks.
- Dependent claims (4, 14, 15, 16):
  - 4. The method of claim 3 wherein said predicted spectral amplitudes for said current segment are based at least in part on interpolating the spectral amplitudes of a previous segment to estimate the spectral amplitudes in the previous segment at the frequencies of said current segment.
  - 14. The method of claim 3, 6 or 7 wherein said predetermined number is equal to six.
  - 15. The method of claim 14 wherein the difference between the number of sampled frequencies grouped into the highest frequency block and the number of sampled frequencies grouped into the lowest frequency block is less than or equal to one.
  - 16. The method of claim 3, 6 or 7 wherein the number of prediction residuals grouped into a lower frequency block is not larger than the number of prediction residuals grouped into a higher frequency block.
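
One way to satisfy the grouping constraints of claims 3 and 14 to 16 is sketched below. The helper names are hypothetical, and the sketch assumes there are at least as many residuals as blocks; the claims only require a fixed block count, sizes that differ by at most one between the lowest and highest block, and lower-frequency blocks that are never larger than higher-frequency ones.

```python
def block_sizes(num_residuals, num_blocks=6):
    """Split num_residuals into num_blocks sizes (six per claim 14)."""
    base, extra = divmod(num_residuals, num_blocks)
    # The `extra` leftover residuals go to the highest-frequency blocks, so
    # sizes never decrease with frequency and differ by at most one.
    return [base] * (num_blocks - extra) + [base + 1] * extra


def group_into_blocks(residuals, num_blocks=6):
    """Cut the residual list into consecutive blocks, lowest frequency first."""
    blocks, start = [], 0
    for size in block_sizes(len(residuals), num_blocks):
        blocks.append(residuals[start:start + size])
        start += size
    return blocks
```

Because the number of sampled frequencies changes with the fundamental, the block sizes change from segment to segment while the block count stays fixed, which keeps the downstream vector length constant.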

5. A method of encoding a speech signal, the method comprising the steps:
- breaking said signal into segments, each of said segments representing one of a succession of time intervals and having a spectrum of frequencies;
  
  for each said segment, sampling said spectrum at a set of frequencies, thereby forming a set of spectral amplitudes, wherein the frequencies at which the spectra are sampled generally differ from one segment to the next;
  
  producing predicted spectral amplitudes for said current segment based on the spectral amplitudes for at least one previous segment;
  
  producing prediction residuals based on a difference between the actual spectral amplitudes for said current segment and the predicted spectral amplitudes for said current segment;
  
  grouping said prediction residuals into blocks;
  
  forming a prediction residual block average (PRBA) vector, each value of said PRBA vector being an average of the prediction residuals of a corresponding block; and
  
  producing an encoded speech signal based on said PRBA.
- Dependent claims (6, 7, 8, 9, 10, 11, 12, 13):
  - 6. The method of claim 5 wherein said blocks are of a predetermined number, the number of blocks being independent of the number of said prediction residuals grouped into particular blocks.
  - 7. The method of claim 6 wherein the predicted spectral amplitudes for said current segment are based at least in part on interpolating the spectral amplitudes of a previous segment to estimate the spectral amplitudes in the previous segment at the frequencies of said current segment.
  - 8. The method of claim 5, 6 or 7 wherein said average is computed by adding the prediction residuals within the block and dividing by the number of prediction residuals grouped into that block.
  - 9. The method of claim 8 wherein said average is obtained by computing the coefficients of a Discrete Cosine Transform (DCT) of the prediction residuals within a block and using the first coefficient of the DCT as said average.
  - 10. The method of claim 5, 6 or 7 wherein encoding said PRBA vector comprises performing a linear transform on the PRBA vector producing transform coefficients and scalar quantizing said transform coefficients.
  - 11. The method of claim 10 wherein said linear transform comprises a Discrete Cosine Transform.
  - 12. The method of claim 5, 6 or 7 wherein encoding said PRBA vector comprises vector quantizing said PRBA vector.
  - 13. The method of claim 5, 6 or 7 wherein encoding said PRBA vector is performed using a method comprising the steps of:
    - determining an average of said PRBA vector;
      
      quantizing said average using scalar quantization; and
      
      vector quantizing said PRBA vector using a codebook consisting of code vectors, each of said code vectors having a mean equal to zero.
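
The PRBA steps of claims 5, 8, 9 and 13 can be illustrated with the sketch below. All names are hypothetical. The DCT scaling (dividing by the block length so that coefficient 0 equals the mean), the uniform scalar quantizer and the toy codebook are assumptions chosen for clarity; the claims do not specify them.

```python
import math


def prba_vector(blocks):
    """Average of the prediction residuals in each block (claims 5 and 8)."""
    return [sum(block) / len(block) for block in blocks]


def dct(values):
    """DCT-II, scaled (by assumption) so that coefficient 0 is the mean."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (j + 0.5) / n) for j, v in enumerate(values)) / n
        for k in range(n)
    ]


def split_mean(prba):
    """Separate the PRBA mean from the zero-mean remainder (claim 13)."""
    mean = sum(prba) / len(prba)
    return mean, [p - mean for p in prba]


def quantize_scalar(value, step):
    """Uniform scalar quantizer index; the step size is an assumed parameter."""
    return round(value / step)


def nearest_codevector(vector, codebook):
    """Index of the codebook entry with the smallest squared error."""
    def sq_err(code):
        return sum((v - c) ** 2 for v, c in zip(vector, code))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))
```

With this scaling, `dct(block)[0]` gives the same value as the direct average, which is the equivalence claim 9 relies on. Removing the mean before vector quantization lets the codebook hold only zero-mean shapes, while the overall level is carried by a single scalar index.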

22. A method of synthesizing speech from a received bit stream, said bit stream representing speech segments and having bit errors, the method comprising the steps of:
- estimating a bit error rate for each speech segment;
  
  for each said speech segment, deciding whether to decode said speech segment as a voiced or an unvoiced speech segment;
  
  smoothing the voiced/unvoiced decisions based on the estimate of the bit error rate for said speech segment; and
  
  synthesizing a speech signal using said smoothed voiced/unvoiced decisions.
- Dependent claims (24, 25):
  - 24. The method of claim 22 or 23 wherein the smoothing step comprises the steps of:
    - comparing the estimate of the bit error rate of said speech segment with a first predetermined threshold;
      
      declaring all high energy spectral amplitudes voiced when said bit error rate is above said first threshold; and
      
      leaving all other voiced/unvoiced decisions unaffected.
  - 25. The method of claim 24 wherein said high energy spectral amplitudes are determined by a method comprising the steps of:
    - for each said speech segment, computing a second threshold which depends on the estimate of the bit error rate for said segment;
      
      comparing each spectral amplitude with said second threshold; and
      
      determining the spectral amplitude to be high energy if it is greater than said second threshold.
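
The smoothing rule of claims 24 and 25 can be sketched as below. The function name, the default thresholds and in particular the formula for the second threshold are illustrative assumptions: the claims state only that the second threshold depends on the estimated bit error rate, not how.

```python
def smooth_voicing(amplitudes, voiced, error_rate,
                   rate_threshold=0.02, base_level=2.5, gain=10.0):
    """Return smoothed voiced/unvoiced decisions for one segment.

    amplitudes: decoded spectral amplitudes.
    voiced:     decoded voiced (True) / unvoiced (False) flag per amplitude.
    """
    # First threshold (claim 24): at low error rates, change nothing.
    if error_rate <= rate_threshold:
        return list(voiced)
    # Second threshold (claim 25). Assumed form: it falls as the error rate
    # rises, so more amplitudes count as high energy on a noisy channel.
    energy_threshold = base_level / (1.0 + gain * error_rate)
    # High energy amplitudes are declared voiced; all others are unaffected.
    return [v or a > energy_threshold for a, v in zip(amplitudes, voiced)]
```

Only unvoiced-to-voiced changes are possible here, matching the claim language that all other decisions are left unaffected.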

23. A method of synthesizing speech from a received bit stream, said bit stream representing frequency bands of speech segments and having bit errors, the method comprising the steps of:
- estimating a bit error rate for each speech segment;
  
  for each frequency band of each said speech segment, deciding whether to decode the frequency band of said speech segment as voiced or unvoiced;
  
  smoothing the voiced/unvoiced decisions based on the estimate of the bit error rate for said speech segment; and
  
  synthesizing a speech signal using said smoothed voiced/unvoiced decisions.
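
The abstract describes the error rate estimate used in claims 22 and 23 as a local average of the number of correctable bit errors per segment. A minimal sketch of one such estimator follows; the class name, the sliding-window form of the average and the window length are assumptions, and a recursive (exponentially weighted) average would fit the description equally well.

```python
from collections import deque


class ErrorRateEstimator:
    """Sliding-window average of corrected bit errors per received bit."""

    def __init__(self, bits_per_segment, window=4):
        self.bits_per_segment = bits_per_segment
        # Only the most recent `window` segments contribute to the estimate.
        self.recent = deque(maxlen=window)

    def update(self, corrected_errors):
        """Record one segment's corrected-error count; return the estimate."""
        self.recent.append(corrected_errors)
        return sum(self.recent) / (len(self.recent) * self.bits_per_segment)
```

The count passed to `update` would come from the channel decoder's error-correcting code, which reports how many bit errors it corrected in the segment.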

Current Assignee: Digital Voice Systems, Inc.
Original Assignee: Digital Voice Systems, Inc.
Inventors: Hardwick, John C.; Lim, Jae S.
Primary Examiner(s): Richardson, Robert L.
Assistant Examiner(s): Tung, Kee M.
Application Number: US07/624,878
Time in Patent Office: 944 Days
Field of Search: 381/29-41, 381/46-51, 371/5.1, 371/5.2
US Class Current: 704/219
CPC Class Codes:
- G10L 19/005   Correction of errors induce...
- G10L 19/0212   using orthogonal transforma...
- G10L 19/038   Vector quantisation, e.g. T...
- G10L 19/06   Determination or coding of ...
- G10L 19/087   using mixed excitation mode...
- G10L 19/10   the excitation function bei...
- G10L 21/0232   Processing in the frequency...
- G10L 21/0364   for improving intelligibility
- H03M 13/35   Unequal or adaptive error p...
