Method of encoding a speech signal

US 6,269,332 B1
Filed: 05/28/1999
Issued: 07/31/2001
Est. Priority Date: 09/30/1997
Status: Expired due to Term

Abstract

A method of coding speech is disclosed in which the speech signal is sampled and divided into a plurality of frames upon which multi-band excitation analysis is performed to derive a fundamental pitch, a plurality of voiced/unvoiced decisions and amplitudes of harmonics within the bands. The harmonic amplitudes are split into a first group of a fixed number of harmonics and a second group of the remainder of harmonics and these are separately transformed using the Discrete Cosine Transform for the first group and Non-Square Transform for the second group, the resulting transform coefficients being vector quantized to form a plurality of output indices. A decoding method and apparatus for performing both encoding and decoding methods are also disclosed.
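The split described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the group size of eight is taken from claim 5, and the DCT is an orthonormal DCT-II chosen here for convenience. The record does not define the Non-Square Transform, so `fixed_length_stand_in` is only an assumed placeholder (a truncated or zero-padded DCT) that shows the idea of mapping a variable number of harmonics to a fixed number of coefficients.

```python
import math

FIRST_GROUP_SIZE = 8  # claim 5 names the first eight harmonics


def split_harmonics(amplitudes, first_size=FIRST_GROUP_SIZE):
    """Return (first group of fixed size, remainder of the harmonics)."""
    return list(amplitudes[:first_size]), list(amplitudes[first_size:])


def dct_ii(x):
    """Orthonormal DCT-II of a sequence (returns [] for an empty input)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def fixed_length_stand_in(x, m=FIRST_GROUP_SIZE):
    """ASSUMPTION: placeholder for the Non-Square Transform, not the real one.

    Takes the DCT of the variable-length group, then truncates or
    zero-pads the result so that exactly m coefficients come out.
    """
    coeffs = dct_ii(x)
    return (coeffs + [0.0] * m)[:m]


# Example frame with 12 harmonic amplitudes (made-up values).
amplitudes = [float(i) for i in range(1, 13)]
first, rest = split_harmonics(amplitudes)
first_coeffs = dct_ii(first)                 # 8 coefficients
second_coeffs = fixed_length_stand_in(rest)  # also 8 coefficients
```

Both groups end up as vectors of the same length, which is what allows fixed-dimension codebooks to be used even though the number of harmonics varies with pitch.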

22 Claims

1. A method of encoding a speech signal comprising the steps of:
- sampling the speech signal;
- dividing the sampled speech signal into a plurality of frames;
- performing multi-band excitation analysis on the signal within each frame to derive a fundamental pitch, a plurality of voiced/unvoiced decisions for frequency bands in the signal and amplitudes of harmonics within said bands;
- transforming the harmonic amplitudes to form a plurality of transform coefficients;
- vector quantizing the coefficients to form a plurality of indices;
- characterised by dividing the harmonic amplitudes into a first group of a fixed number of harmonics and a second group of the remainder of the harmonics, the first and second groups being subject to different transforms to form respective first and second sets of transform coefficients for quantization.

2. A method as claimed in claim 1 wherein the first group is transformed using a Discrete Cosine Transform.

3. A method as claimed in claim 1 wherein the second group is transformed using a Non-Square Transform.

4. A method as claimed in claim 1 wherein the second group of harmonics is transformed into the same number of transform coefficients as the first group.

5. A method as claimed in claim 1 wherein the first group comprises the first eight harmonics of signal within each frame.

6. A method as claimed in claim 1 wherein the transform coefficients are normalised to form normalised coefficients and a gain value, the gain values being quantized separately from the sets of normalised coefficients.

7. A method of decoding a signal encoded by the method of claim 1 comprising the steps of dequantizing the indices, inverse transforming the transform coefficients to form the harmonic amplitudes and combining the harmonic amplitudes, fundamental pitch and voiced/unvoiced decisions for Multi-Band Excitation synthesis to construct a speech signal.
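The normalisation and quantization steps of claims 1 and 6 can be illustrated with a small gain-shape sketch. Several things here are assumptions rather than details from the claims: the gain is taken to be the Euclidean norm of the coefficient vector, the codebook search is a plain exhaustive nearest-neighbour search, and the codebooks and all function names are hypothetical toy values.

```python
import math


def normalise(coeffs):
    """Split coefficients into (unit-norm shape, gain).

    ASSUMPTION: gain is the Euclidean norm; claim 6 only says that a
    gain value is formed and quantized separately.
    """
    gain = math.sqrt(sum(c * c for c in coeffs))
    if gain == 0.0:
        return list(coeffs), 0.0
    return [c / gain for c in coeffs], gain


def nearest_index(vector, codebook):
    """Index of the codeword with the smallest squared error."""
    def dist2(codeword):
        return sum((v - w) ** 2 for v, w in zip(vector, codeword))
    return min(range(len(codebook)), key=lambda i: dist2(codebook[i]))


def nearest_scalar_index(value, levels):
    """Index of the closest gain level."""
    return min(range(len(levels)), key=lambda i: abs(value - levels[i]))


# Toy two-dimensional example; real codebooks would be trained.
shape_codebook = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
gain_levels = [1.0, 4.0, 8.0]

shape, gain = normalise([3.0, 4.0])
shape_index = nearest_index(shape, shape_codebook)
gain_index = nearest_scalar_index(gain, gain_levels)
```

In an encoder following claim 1, this step would run once for each of the two coefficient sets, producing a shape index and a gain index per set.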

8. A method of decoding an input data signal for speech synthesis comprising the steps of:
- vector dequantizing a plurality of indices of the data signal to form first and second sets of transform coefficients;
- inverse-transforming the first and second sets of coefficients using different transforms to derive respective first and second groups of harmonic amplitudes;
- deriving pitch and voiced/unvoiced decision information from the input data signal;
- performing multi-band excitation synthesis on the information and the harmonic amplitudes to form a synthesized speech signal; and
- constructing a speech signal from the synthesized signal.
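The dequantization and inverse-transform steps of claim 8 might look like the sketch below. It covers only the recovery of harmonic amplitudes, not the multi-band excitation synthesis. The names, the gain-shape codebook layout and the orthonormal inverse DCT are assumptions made for illustration, and the inverse for the second group is passed in as a parameter because the record does not define the inverse Non-Square Transform.

```python
import math


def idct_ii(coeffs):
    """Inverse of the orthonormal DCT-II."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out


def dequantize(shape_index, gain_index, shape_codebook, gain_levels):
    """Look up a codeword and rescale it by the decoded gain.

    ASSUMPTION: a gain-shape layout as in claim 6; claim 8 itself only
    requires that indices are dequantized into transform coefficients.
    """
    gain = gain_levels[gain_index]
    return [gain * v for v in shape_codebook[shape_index]]


def rebuild_amplitudes(first_coeffs, second_coeffs, inverse_first, inverse_second):
    """Apply a different inverse transform to each set and concatenate."""
    return inverse_first(first_coeffs) + inverse_second(second_coeffs)


# Toy usage: idct_ii stands in for BOTH inverses here purely so the
# example runs; a real decoder would supply a different second inverse.
first_coeffs = [math.sqrt(8.0)] + [0.0] * 7
second_coeffs = dequantize(0, 1, [[1.0, 0.0]], [1.0, 2.0])
amplitudes = rebuild_amplitudes(first_coeffs, second_coeffs, idct_ii, idct_ii)
```

The recovered amplitudes, together with the pitch and voiced/unvoiced decisions, would then be handed to the synthesizer.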

9. Speech coding apparatus comprising:
- means for sampling a speech signal and dividing the sampled signal into a plurality of frames;
- a multi-band excitation analyzer for deriving a fundamental pitch and a plurality of voiced/unvoiced decisions for frequency bands in each frame and amplitudes of harmonics within said bands;
- transformation means for transforming the harmonic amplitudes to form a plurality of transform coefficients;
- vector quantization means for quantizing the coefficients to form a plurality of indices;
- characterized in that the transformation means comprises first transform means for transforming a first fixed number of harmonics into a first set of transform coefficients and second transform means for transforming the remainder of the harmonic amplitudes into a second set of transform coefficients, the first and second transform means performing different transforms.

10. Apparatus as claimed in claim 9 wherein the first transform means performs a Discrete Cosine Transform.

11. Apparatus as claimed in claim 9 wherein the second transformation means performs a Non-Square Transform.

12. Apparatus as claimed in claim 9 wherein the first transform means performs the transformation on the first eight harmonics of the frame.

13. Apparatus as claimed in claim 9 wherein the second transformation means transforms the remainder of the harmonics into a second set of transform coefficients of the same number as the set of first transform coefficients.

14. Apparatus as claimed in claim 9 wherein the vector quantization means includes codebooks corresponding to each set of transform coefficients.

15. Apparatus as claimed in claim 9 further comprising means for splitting the sets of transform coefficients into sets of normalised coefficients and respective gain values.

16. Apparatus as claimed in claim 15 wherein the vector quantization means includes a separate codebook for the gain values.

17. Apparatus for storing and reproduction of speech including apparatus as claimed in claim 9.

18. A telephone answering machine including apparatus as claimed in claim 9.

19. Apparatus as claimed in claim 9 in combination with a decoding apparatus for decoding an input data signal for speech synthesis, said decoding apparatus comprising vector dequantization means for dequantizing a plurality of indices to form at least two sets of transform coefficients, first and second transform means for transforming respectively the first and second sets of coefficients using different transforms to derive first and second groups of harmonic amplitudes, a multi-band excitation synthesizer for combining the harmonics with pitch and voiced/unvoiced decision information from the input signal and means for constructing a speech signal from the output of the synthesizer.

20. Decoding apparatus for decoding an input data signal for speech synthesis comprising:
- vector dequantization means for dequantizing a plurality of indices to form at least two sets of transform coefficients;
- first and second transform means for transforming respectively the first and second sets of coefficients to derive first and second groups of harmonic amplitudes, the first and second transform means performing different transforms;
- a multi-band excitation synthesizer for combining the harmonics with pitch and voiced/unvoiced decision information from the input signal; and
- means for constructing a speech signal from the output of the synthesizer.

21. Apparatus for storing and reproduction of speech including apparatus as claimed in claim 20.

22. A telephone answering machine including apparatus as claimed in claim 20.

Current Assignee: Lantiq Beteiligungs-GmbH & Company KG (Intel Corporation)
Original Assignee: Siemens AG
Inventors: Koh, Soo Ngee; Choo, Wee Boon
Primary Examiner(s): Korzuch, William
Assistant Examiner(s): Abebe, Daniel Demelash
Application Number: US09/319,103
Time in Patent Office: 795 Days
Field of Search: 704/203, 704/204, 704/207, 704/208, 704/219, 704/220, 704/222, 704/223, 704/224, 704/229
US Class Current: 704/233
CPC Class Codes:
G10L 19/0212   using orthogonal transforma...
G10L 19/10   the excitation function bei...
G10L 25/93   Discriminating between voic...
