Joint quantization of speech subframe voicing metrics and fundamental frequencies

US 6,199,037 B1
Filed: 12/04/1997
Issued: 03/06/2001
Est. Priority Date: 12/04/1997
Status: Expired due to Term

Abstract

Speech is encoded into a frame of bits. A speech signal is digitized into a sequence of digital speech samples that are then divided into a sequence of subframes. A set of model parameters is estimated for each subframe. The model parameters include a set of voicing metrics that represent voicing information for the subframe. Two or more subframes from the sequence of subframes are designated as corresponding to a frame. The voicing metrics from the subframes within the frame are jointly quantized. The joint quantization includes forming predicted voicing information from the quantized voicing information from the previous frame, computing the residual parameters as the difference between the voicing information and the predicted voicing information, combining the residual parameters from both of the subframes within the frame, and quantizing the combined residual parameters into a set of encoded voicing information bits which are included in the frame of bits. A similar technique is used to encode fundamental frequency information.
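
The prediction-and-residual step described in the abstract can be sketched as follows. This is only a minimal illustration: the prediction gain `rho`, the function names, and the choice to predict both subframes from the previous frame's quantized values are assumptions, not details stated in the abstract.

```python
def predictive_residuals(current, prev_quantized, rho=0.5):
    """Residual = current voicing values minus a prediction from the previous frame."""
    predicted = [rho * p for p in prev_quantized]  # rho is a hypothetical gain
    return [c - p for c, p in zip(current, predicted)]


def combined_frame_residuals(subframe0, subframe1, prev_quantized, rho=0.5):
    """Concatenate the residuals of both subframes into one vector for joint quantization."""
    return (predictive_residuals(subframe0, prev_quantized, rho)
            + predictive_residuals(subframe1, prev_quantized, rho))


def reconstruct(quantized_residuals, prev_quantized, rho=0.5):
    """Decoder side: add the same prediction back to the quantized residuals."""
    return [rho * p + r for p, r in zip(prev_quantized, quantized_residuals)]
```

Because the decoder forms the same prediction from the same previously quantized values, only the residuals need to be carried in the frame of bits.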

30 Claims

1. A method of encoding speech into a frame of bits, the method comprising:
- digitizing a speech signal into a sequence of digital speech samples;
  
  dividing the digital speech samples into a sequence of subframes, each of the subframes including multiple digital speech samples;
  
  estimating a fundamental frequency parameter for each subframe;
  
  designating subframes from the sequence of subframes as corresponding to a frame;
  
  jointly quantizing fundamental frequency parameters from subframes of the frame to produce a set of encoder fundamental frequency bits; and
  
  including the encoder fundamental frequency bits in a frame of bits, wherein the joint quantization comprises:
  
  computing fundamental frequency residual parameters as a difference between a transformed average of the fundamental frequency parameters and each fundamental frequency parameter;
  
  combining the residual fundamental frequency parameters from the subframes of the frame; and
  
  quantizing the combined residual parameters.
2. The method of claim 1, wherein combining the residual parameters from the subframes of the frame includes performing a linear transformation on the residual parameters to produce a set of transformed residual coefficients for each subframe.
3. The method of claim 1, wherein fundamental frequency parameters represent log fundamental frequency estimated for a Multi-Band Excitation (MBE) speech model.
4. The method of claim 1, further comprising producing additional encoder bits by quantizing additional speech model parameters other than the fundamental frequency parameters and including the additional encoder bits in the frame of bits.
5. The method of claim 4, wherein the additional speech model parameters include parameters representative of spectral magnitudes.
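
The joint quantization recited in claim 1 can be illustrated with a short sketch for a frame of two subframes. Several details here are assumptions made for illustration: the "transformed average" is taken to be the scalar-quantized mean of the log fundamental frequencies, the step size `avg_step` is arbitrary, and `RESIDUAL_CODEBOOK` is a hypothetical codebook rather than one taken from the patent.

```python
import math

# Hypothetical 3-bit codebook of residual pairs (one value per subframe).
RESIDUAL_CODEBOOK = [
    (0.0, 0.0), (0.05, -0.05), (-0.05, 0.05), (0.1, -0.1),
    (-0.1, 0.1), (0.2, -0.2), (-0.2, 0.2), (0.05, 0.05),
]


def joint_quantize_f0(f0_hz, avg_step=0.02):
    """Return (average index, residual codebook index) for a two-subframe frame."""
    log_f0 = [math.log2(f) for f in f0_hz]
    # "Transformed average" is taken here to be the scalar-quantized mean (assumption).
    avg_index = round(sum(log_f0) / len(log_f0) / avg_step)
    avg_hat = avg_index * avg_step
    # One residual per subframe: transformed average minus that subframe's parameter.
    residuals = [avg_hat - x for x in log_f0]

    # Combine the residuals into one vector and pick the nearest codebook entry.
    def dist(entry):
        return sum((r - e) ** 2 for r, e in zip(residuals, entry))

    vq_index = min(range(len(RESIDUAL_CODEBOOK)),
                   key=lambda i: dist(RESIDUAL_CODEBOOK[i]))
    return avg_index, vq_index
```

The two returned indices together play the role of the encoder fundamental frequency bits; quantizing the residual pair as one vector is what makes the quantization joint across subframes.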

6. A method of encoding speech into a frame of bits, the method comprising:
- digitizing a speech signal into a sequence of digital speech samples;
  
  estimating a set of voicing metrics parameters for a group of digital speech samples, the set including multiple voicing metrics parameters;
  
  jointly quantizing the voicing metrics parameters to produce a set of encoder voicing metrics bits; and
  
  including the encoder voicing metrics bits in a frame of bits.
7. The method of claim 6, further comprising:
8. The method of claim 7, wherein jointly quantizing multiple voicing metrics parameters comprises jointly quantizing at least one voicing metrics parameter for each of multiple subframes.
9. The method of claim 7, wherein jointly quantizing multiple voicing metrics parameters comprises jointly quantizing multiple voicing metrics parameters for a single subframe.
10. The method of claim 6, wherein the joint quantization comprises:
- computing voicing metrics residual parameters as the transformed ratios of voicing error vectors and voicing energy vectors;
  
  combining the residual voicing metrics parameters; and
  
  quantizing the combined residual parameters.
11. The method of claim 10, wherein combining the residual parameters includes performing a linear transformation on the residual parameters to produce a set of transformed residual coefficients for each subframe.
12. The method of claim 10, wherein quantizing the combined residual parameters includes using at least one vector quantizer.
13. The method of claim 6, wherein the frame of bits includes redundant error control bits protecting at least some of the encoder voicing metrics bits.
14. The method of claim 6, wherein voicing metrics parameters represent voicing states estimated for a Multi-Band Excitation (MBE) speech model.
15. The method of claim 6, further comprising producing additional encoder bits by quantizing additional speech model parameters other than the voicing metrics parameters and including the additional encoder bits in the frame of bits.
16. The method of claim 15, wherein the additional speech model parameters include parameters representative of spectral magnitudes.
17. The method of claim 15, wherein the additional speech model parameters include parameters representative of a fundamental frequency.
18. The method of claim 17, wherein the additional speech model parameters include parameters representative of the spectral magnitudes.
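
Claims 6, 10 and 12 can be read together as: form a transformed error-to-energy ratio per band, combine the values from all subframes, and quantize the combined vector with a vector quantizer. The sketch below follows that reading under stated assumptions: `log2` is an assumed transform, two subframes with two bands each are assumed, and `CODEBOOK` with its `VOICED`/`UNVOICED` levels is hypothetical.

```python
import itertools
import math

VOICED, UNVOICED = -4.0, -0.5  # hypothetical codebook levels in the log2 domain
# Hypothetical 4-bit codebook: 2 subframes x 2 bands, every voiced/unvoiced pattern.
CODEBOOK = list(itertools.product((VOICED, UNVOICED), repeat=4))


def voicing_residuals(errors, energies, floor=1e-3):
    """Transformed ratio of voicing error to voicing energy for each band."""
    out = []
    for err, energy in zip(errors, energies):
        ratio = min(max(err / max(energy, floor), floor), 1.0)
        out.append(math.log2(ratio))  # log2 is an assumed transform
    return out


def joint_quantize_voicing(subframes):
    """subframes: list of (errors, energies) pairs; returns one codebook index."""
    combined = []
    for errors, energies in subframes:
        combined.extend(voicing_residuals(errors, energies))

    def dist(entry):
        return sum((c - e) ** 2 for c, e in zip(combined, entry))

    return min(range(len(CODEBOOK)), key=lambda i: dist(CODEBOOK[i]))
```

A single index then encodes the voicing pattern of every band in every subframe of the frame, instead of one bit per band per subframe.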

19. A method of encoding speech into a frame of bits, the method comprising:
- digitizing a speech signal into a sequence of digital speech samples;
  
  dividing the digital speech samples into a sequence of subframes, each of the subframes including multiple digital speech samples;
  
  estimating a fundamental frequency parameter for each subframe;
  
  designating subframes from the sequence of subframes as corresponding to a frame;
  
  quantizing a fundamental frequency parameter from one subframe of the frame;
  
  interpolating a fundamental frequency parameter for another subframe of the frame using the quantized fundamental frequency parameter from the one subframe of the frame;
  
  combining the quantized fundamental frequency parameter and the interpolated fundamental frequency parameter to produce a set of encoder fundamental frequency bits; and
  
  including the encoder fundamental frequency bits in a frame of bits.
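
One possible reading of claim 19 is sketched below. The claim does not say what the interpolation runs between, so this sketch assumes the endpoints are the previous frame's quantized value (`prev_q_log_f0`) and the newly quantized value for the last subframe; the `WEIGHTS` table, the step size, and the function name are likewise hypothetical.

```python
import math

# Hypothetical 2-bit set of interpolation weights.
WEIGHTS = (0.25, 0.5, 0.75, 1.0)


def encode_f0_with_interpolation(f0_first_hz, f0_last_hz, prev_q_log_f0, step=0.02):
    """Return (scalar index for the last subframe, interpolation weight index)."""
    # Scalar-quantize the log fundamental of the last subframe.
    q_index = round(math.log2(f0_last_hz) / step)
    q_log = q_index * step
    # Candidate values for the first subframe, interpolated between the previous
    # frame's quantized value and the newly quantized value.
    candidates = [prev_q_log_f0 + w * (q_log - prev_q_log_f0) for w in WEIGHTS]
    target = math.log2(f0_first_hz)
    w_index = min(range(len(WEIGHTS)), key=lambda i: abs(candidates[i] - target))
    return q_index, w_index
```

Under these assumptions the first subframe costs only the few bits needed to select a weight, since its value is derived from the quantized value of the other subframe.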

20. A speech encoder for encoding speech into a frame of bits, the encoder comprising:
- means for digitizing a speech signal into a sequence of digital speech samples;
  
  means for estimating a set of voicing metrics parameters for a group of digital speech samples, the set including multiple voicing metrics parameters;
  
  means for jointly quantizing the voicing metrics parameters to produce a set of encoder voicing metrics bits; and
  
  means for forming a frame of bits including the encoder voicing metrics bits.
21. The speech encoder of claim 20, further comprising:
22. The speech encoder of claim 21, wherein the means for jointly quantizing multiple voicing metrics parameters jointly quantizes at least one voicing metrics parameter for each of multiple subframes.
23. The speech encoder of claim 21, wherein the means for jointly quantizing multiple voicing metrics parameters jointly quantizes multiple voicing metrics parameters for a single subframe.

24. A method of decoding speech from a frame of bits that has been encoded by digitizing a speech signal into a sequence of digital speech samples, estimating a set of voicing metrics parameters for a group of digital speech samples, the set including multiple voicing metrics parameters, jointly quantizing the voicing metrics parameters to produce a set of encoder voicing metrics bits, and including the encoder voicing metrics bits in a frame of bits, the method of decoding speech comprising:
- extracting decoder voicing metrics bits from the frame of bits;
  
  jointly reconstructing voicing metrics parameters using the decoder voicing metrics bits; and
  
  synthesizing digital speech samples using speech model parameters which include some or all of the reconstructed voicing metrics parameters.
25. The method of decoding speech of claim 24, wherein the joint reconstruction comprises:
26. The method of claim 25, wherein the computing of the separate residual parameters for each subframe comprises:
- separating the voicing metrics residual parameters for the frame from the combined residual parameters for the frame; and
  
  performing an inverse transformation on the voicing metrics residual parameters for the frame to produce the separate residual parameters for each subframe of the frame.
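
The separation and inverse transformation in claim 26 can be illustrated with a round trip. The claims do not specify the transformation, so a sum/difference transform across two subframes is assumed here, and the layout of the combined vector (voicing residuals first, other residuals after) is hypothetical.

```python
def forward_transform(res0, res1):
    """Encoder side: sum/difference transform across two subframes (assumed)."""
    s = [(a + b) / 2 for a, b in zip(res0, res1)]
    d = [(a - b) / 2 for a, b in zip(res0, res1)]
    return s + d


def separate_voicing(combined, n_voicing):
    """Split the voicing residuals off the front of the combined vector (assumed layout)."""
    return combined[:n_voicing], combined[n_voicing:]


def inverse_transform(voicing):
    """Recover the per-subframe residuals from the sum/difference coefficients."""
    half = len(voicing) // 2
    s, d = voicing[:half], voicing[half:]
    res0 = [a + b for a, b in zip(s, d)]
    res1 = [a - b for a, b in zip(s, d)]
    return res0, res1
```

For example, `inverse_transform(forward_transform(r0, r1))` returns the original pair `(r0, r1)` when no quantization error is introduced in between.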

27. A decoder for decoding speech from a frame of bits that has been encoded by digitizing a speech signal into a sequence of digital speech samples, estimating a set of voicing metrics parameters for a group of digital speech samples, the set including multiple voicing metrics parameters, jointly quantizing the voicing metrics parameters to produce a set of encoder voicing metrics bits, and including the encoder voicing metrics bits in a frame of bits, the decoder comprising:
- means for extracting decoder voicing metrics bits from the frame of bits;
  
  means for jointly reconstructing voicing metrics parameters using the decoder voicing metrics bits; and
  
  means for synthesizing digital speech samples using speech model parameters which include some or all of the reconstructed voicing metrics parameters.

28. Software on a processor readable medium comprising instructions for causing a processor to perform the following operations:
- estimate a set of voicing metrics parameters for a group of digital speech samples, the set including multiple voicing metrics parameters;
  
  jointly quantize the voicing metrics parameters to produce a set of encoder voicing metrics bits; and
  
  form a frame of bits including the encoder voicing metrics bits.
29. The software of claim 28, wherein the processor readable medium comprises a memory associated with a digital signal processing chip that includes the processor.

30. A communications system comprising:
- a transmitter configured to:
  
  digitize a speech signal into a sequence of digital speech samples;
  
  estimate a set of voicing metrics parameters for a group of digital speech samples, the set including multiple voicing metrics parameters;
  
  jointly quantize the voicing metrics parameters to produce a set of encoder voicing metrics bits;
  
  form a frame of bits including the encoder voicing metrics bits; and
  
  transmit the frame of bits, and
- a receiver configured to receive and process the frame of bits to produce a speech signal.

Current Assignee: Digital Voice Systems, Inc.
Original Assignee: Digital Voice Systems, Inc.
Inventors: Hardwick, John C.
Primary Examiner(s): SMITS, TALIVALDIS IVARS
Application Number: US08/985,262
Time in Patent Office: 1,188 Days
Field of Search: 704/207, 704/208, 704/222, 704/230
US Class Current: 704/208
CPC Class Codes:
G10L 19/0212 using orthogonal transforma...
G10L 19/10 the excitation function bei...
