Method and system for encoding digital speech information

US 4,718,087 A
Filed: 05/11/1984
Issued: 01/05/1988
Est. Priority Date: 05/11/1984
Status: Expired due to Term

Abstract

Method and system for encoding digital speech information to characterize spoken human speech with an optimally reduced speech data rate while retaining speech quality in the audible reproduction of the encoded digital speech information. Markov modeling is applied to quantized speech parameters to represent their time behavior in a probabilistic manner. This is accomplished by representing the quantized speech parameters as finite state machines having predetermined matrices of transitional probabilities from which the conditional probabilities as to the quantized speech parameter values of successive speech data frames are established. The probabilistic description as so obtained is then used to represent the respective quantized values of the speech parameters by a digital code through Huffman coding in which digital codewords of variable length represent the quantized speech parameter values in accordance with their probability of occurrence such that more probable quantized values are assigned digital codewords of a shorter bit length while less probable quantized values are assigned digital codewords of a longer bit length.
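
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the add-one smoothing of unseen transitions, and the fixed-length coding of the first frame are assumptions made for illustration only. The sketch counts transitions of one quantized parameter between successive frames, builds one Huffman code per matrix row (that is, per previous-frame value), and encodes a parameter track with those codes.

```python
import heapq
from itertools import count


def transition_counts(levels, num_levels):
    """Count transitions n[k][i]: level k in one frame, level i in the next."""
    n = [[0] * num_levels for _ in range(num_levels)]
    for prev, cur in zip(levels, levels[1:]):
        n[prev][cur] += 1
    return n


def huffman_code(weights):
    """Return a prefix-free code {symbol: bitstring} for positive weights."""
    if len(weights) == 1:
        return {next(iter(weights)): "0"}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(w, next(tie), {sym: ""}) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (w0 + w1, next(tie), merged))
    return heap[0][2]


def build_row_codes(n):
    """Build one Huffman code per matrix row (per previous-frame level)."""
    # Add-one smoothing is an assumption: it keeps unseen transitions encodable.
    return [huffman_code({i: c + 1 for i, c in enumerate(row)}) for row in n]


def encode(levels, row_codes, fixed_bits):
    """Send the first frame at fixed length, later frames with the row code."""
    bits = format(levels[0], f"0{fixed_bits}b")
    for prev, cur in zip(levels, levels[1:]):
        bits += row_codes[prev][cur]
    return bits
```

For a track that mostly repeats its previous value, such as `[3, 3, 3, 4, 4, 3, 3, 2, 3, 3, 3, 3]` with 8 levels, the repeat transition from level 3 receives a one-bit codeword, and the encoded track is shorter than the fixed 3 bits per frame.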

18 Claims

1. A method of encoding digital speech information to characterize spoken human speech with an optimally reduced speech data rate while retaining speech quality in the audible reproduction of the encoded digital speech information, said method comprising:
- storing digital speech information as digital speech data in the form of quantized speech parameter values comprising a plurality of speech data frames;
- determining the transition probabilities for corresponding quantized speech parameter values in the next successive speech data frame in relation to the current speech data frame;
- establishing the conditional probabilities as to the quantization values of the speech parameters of successive speech data frames based upon the determination of the transition probabilities; and
- representing the respective quantization values of the speech parameters after the conditional probabilities have been established by a digital code wherein digital codewords of variable length represent quantization values in accordance with their probability of occurrence such that more probable quantization values are assigned digital codewords of a shorter bit length while less probable quantization values are assigned digital codewords of a longer bit length.
  - 2. A method of encoding digital speech information as set forth in claim 1, further including representing the quantized speech parameter values by finite state machines having predetermined matrices of columns and rows of transitional probabilities prior to the determination of the transition probabilities for the corresponding quantized speech parameter values of the current and at least the next successive speech data frame.
  - 3. A method of encoding digital speech information as set forth in claim 2, wherein the respective quantization values of the speech parameters are represented by a digital code as applied to all of the rows of the transitional probabilities matrix corresponding thereto.
  - 4. A method of encoding digital speech information as set forth in claim 3, wherein the digital codewords representing the respective quantization values of the speech parameters are of shorter bit lengths around the diagonal of the transitional probabilities matrix corresponding thereto reflecting a higher probability of occurrence for a particular quantization value of the speech parameter.
  - 5. A method of encoding digital speech information as set forth in claim 4, wherein the representation of the respective quantization values of the speech parameters is accomplished by assigning a uniquely decodable digital codeword which is distinct from the first part of any other digital codeword for each of the quantization values included in a transitional probabilities matrix.
  - 6. A method of encoding digital speech information as set forth in claim 2, further including combining transitional probabilities matrices of the same dimensions to provide a supermatrix from which the transitional probabilities are determined, and representing the respective quantization values of the speech parameters from the combined transitional probabilities matrices by digital codewords based upon said supermatrix.
  - 7. A method of encoding digital speech information as set forth in claim 6, wherein the combining of transitional probabilities matrices is accomplished for every set of matrices [j₁, ..., j_m] having the same dimensions to provide said supermatrix in accordance with ##EQU7## where s is the supermatrix, and n(j,i/k) is the number of transitions occurring in a reference speech data base from which the transitional probabilities matrices are determined.
  - 8. A method of encoding digital speech information as set forth in claim 2, further including condensing each of the predetermined matrices of columns and rows of transitional probabilities to a single super row of transitional probabilities indicative of the respective matrix corresponding thereto; and shifting said super row of transitional probabilities in increments of one position in either direction to generate the conditional probability distribution of additional rows of transitional probabilities for the specific matrix corresponding to said super row.
  - 9. A method of encoding digital speech information as set forth in claim 8, wherein said super row is based upon the middle row 2^(b(j)-1) of the matrix of transitional probabilities of which it is indicative and has absolute frequencies of occurrence n(j,i) in accordance with ##EQU8## where n(j,i/i₁)=0 for i≦0 or i>2^(b(j)).
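
Claims 6 through 9 describe two ways of shrinking the codebook storage: pooling matrices of the same dimensions into a supermatrix, and condensing a matrix into a single super row that is shifted to stand in for the other rows. The equations referenced in claims 7 and 9 are not reproduced in this record, so the Python sketch below rests on assumptions: pooling is taken to be an element-wise sum of transition counts, and the super row is taken to be the sum of all rows after each row is slid so that its diagonal entry lines up with the diagonal of the middle row, with out-of-range entries counted as zero. The function names are hypothetical.

```python
def pool_counts(matrices):
    """Sum transition-count matrices that share the same dimensions.

    `matrices` is a list of square count matrices n[k][i]. Returns a dict
    {size: pooled matrix}. Summing counts is an assumption of this sketch.
    """
    pooled = {}
    for n in matrices:
        size = len(n)
        acc = pooled.setdefault(size, [[0] * size for _ in range(size)])
        for k in range(size):
            for i in range(size):
                acc[k][i] += n[k][i]
    return pooled


def super_row(n):
    """Condense an N x N count matrix into one row of N counts.

    Row k is slid so its diagonal entry lands on the diagonal position of
    the middle row, then the slid rows are added. Entries that fall outside
    0..N-1 contribute zero.
    """
    size = len(n)
    mid = size // 2 - 1  # 0-based index of the 1-based middle row N/2
    out = [0] * size
    for k, row in enumerate(n):
        for i in range(size):
            src = i + k - mid  # column of row k that lands on position i
            if 0 <= src < size:
                out[i] += row[src]
    return out


def shifted_row(sup, k):
    """Approximate row k by shifting the super row by (k - mid) positions."""
    size = len(sup)
    mid = size // 2 - 1
    row = []
    for i in range(size):
        src = i - (k - mid)
        row.append(sup[src] if 0 <= src < size else 0)
    return row
```

Under these assumptions, only one row of counts (and hence one codeword assignment) has to be stored per matrix, at the cost of assuming that every row has roughly the same shape around its diagonal.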

10. A speech encoding system for providing encoded digital speech information in a form producing an optimally reduced speech data rate while retaining speech quality in the subsequent audible reproduction of the encoded digital speech information, said system comprising:
- first memory means storing a plurality of digital codewords representative of the respective quantization values to be attributed to speech parameters as derived from finite state machines having predetermined matrices of columns and rows of transitional probabilities representative of the quantized speech parameter values wherein the digital codewords corresponding to a given predetermined matrix are of variable bit lengths in accordance with the probability of occurrence of a given quantization value such that more probable quantization values are represented by digital codewords of a shorter bit length while less probable quantization values are represented by digital codewords of a longer bit length;
- second memory means having a storage capacity sufficient to accept at least a single frame of digital speech data wherein the digital speech parameters included in said frames of speech data are in quantized form and being adapted to receive respective frames of digital speech data from a source thereof;
- coding means for encoding frames of digital speech data wherein the digital speech parameters thereof are in quantized form, said coding means being operably coupled to said first and second memory means and to a source of digital speech data in quantized form; and
- said coding means being responsive to a current frame of digital speech data as input thereto and to at least a single previous frame of digital speech data from said second memory means to access the appropriate digital codewords from said first memory means for assigning a digital codeword from said first memory means to each of the quantized speech parameters included in the current frame of digital speech data as the output therefrom.
  - 11. A speech encoding system as set forth in claim 10, wherein the source of digital speech data in quantized form comprises linear predictive coded digital speech parameters; the output of said coding means producing digital codewords corresponding to each of the linear predictive coding quantized speech parameters of the current frame of digital speech data but having a reduced bit length as compared thereto.
  - 12. A speech encoding system as set forth in claim 11, further including analyzer means for receiving an analog speech signal representative of oral speech and providing digital speech information indicative thereof in the form of one or more digital speech frames made up of individual digital speech parameters; and quantizer means for receiving said one or more digital speech frames from said analyzer means and quantizing the speech parameters thereof.
  - 13. A speech encoding system as set forth in claim 10, wherein the plurality of digital codewords stored in said first memory means are derived from combined transitional probabilities matrices of the same dimensions so as to define respective supermatrices on which the plurality of digital codewords are based.
  - 14. A speech encoding system as set forth in claim 13, wherein said plurality of digital codewords stored in said first memory means are derived for every set of matrices [j₁, ..., j_m] having the same dimensions to provide said supermatrix in accordance with ##EQU9## where s is the supermatrix, and n(j, i/k) is the number of transitions occurring in a reference speech data base from which the transitional probabilities matrices were originally determined.
  - 15. A speech encoding system as set forth in claim 10, wherein said plurality of digital codewords stored in said first memory means are derived from respective single super rows of transitional probabilities indicative of each of the predetermined matrices of columns and rows of transitional probabilities; and said coding means including means therein for shifting an accessed super row of transitional probabilities to which digital codewords are assigned in increments of one position in either direction to generate the complete series of digital codewords corresponding to the specific matrix upon which said super row is based.
  - 16. A speech encoding system as set forth in claim 15, wherein said super row is based upon the middle row 2^(b(j)-1) of the matrix of transitional probabilities of which it is indicative and has absolute frequencies of occurrence n(j, i) in accordance with ##EQU10## where n(j, i/i₁) equals 0 for i≦0 or i>2^(b(j)).
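
The arrangement in claim 10 (a codeword table, a one-frame memory, and a coder that consults both) can be sketched as a small table-lookup class. This is only an illustration under stated assumptions: the class name `FrameEncoder`, the layout `codebooks[j][prev][cur]`, and the choice to send the very first frame with fixed-length words are hypothetical and are not taken from the patent.

```python
class FrameEncoder:
    """Table-lookup encoder that remembers one previous frame."""

    def __init__(self, codebooks, bits_per_param):
        # Plays the role of the "first memory": for parameter j,
        # codebooks[j][prev][cur] is the variable-length codeword (a bit
        # string) for level `cur` given level `prev` in the previous frame.
        self.codebooks = codebooks
        # Fixed word width of each parameter, used only for the first frame
        # (an assumption of this sketch).
        self.bits_per_param = bits_per_param
        # Plays the role of the "second memory": the previous frame's
        # quantized levels.
        self.prev_frame = None

    def encode_frame(self, frame):
        """Return the bit string for one frame of quantized levels."""
        if self.prev_frame is None:
            words = [
                format(level, f"0{width}b")
                for level, width in zip(frame, self.bits_per_param)
            ]
        else:
            words = [
                self.codebooks[j][prev][cur]
                for j, (prev, cur) in enumerate(zip(self.prev_frame, frame))
            ]
        self.prev_frame = list(frame)  # update the one-frame memory
        return "".join(words)
```

Each call to `encode_frame` emits the current frame and then overwrites the stored previous frame, so the encoder needs no more state than the single frame the claim calls for.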

17. A speech synthesis system for producing audible synthesized speech at a reduced bit rate from encoded digital speech information, said speech synthesis system comprising:
- a source of digital speech information identified as one or more frames of encoded digital speech data having speech parameters defining the respective digital speech frames, wherein each of the speech parameters is represented by a respective digital codeword representative of the quantization value thereof, the digital codewords being derived from finite state machines having predetermined matrices of columns and rows of transitional probabilities representative of the quantized speech parameter values wherein the digital codewords correspond to a given predetermined matrix and are of variable bit lengths in accordance with the probability of occurrence of a given quantization value such that more probable quantization values are represented by digital codewords of a shorter bit length while less probable quantization values are represented by digital codewords of a longer bit length;
- first memory means storing a plurality of digital code words representative of speech parameters, wherein each speech parameter in successive speech frames is identified by a codeword of a constant bit length and serving as an address identifying a digital speech parameter of a fixed bit number length;
- second memory means having a storage capacity sufficient to accept at least a single frame of digital speech data wherein the digital speech parameters included in said frame of speech data are defined by digital codewords of a constant bit length for respective parameters in successive digital speech frames;
- decoding means for decoding frames of digital speech data and being operably coupled to said source of encoded speech data and said first and second memory means, said decoding means being responsive to a current frame of digital speech data as input thereto and to at least a single previous decoded frame of digital speech data from said second memory means to access the appropriate digital codewords of constant bit length for respective speech parameters from said first memory means for assigning a digital codeword from said first memory means to each of said speech parameters included in the current frame of encoded digital speech data as the output therefrom;
- parameter memory means connected to the output of said decoder means and having a plurality of digital speech parameter values stored therein identifiable by respective digital codewords from said first memory means and responsive to the output from said decoder means for providing decoded digital speech parameters of a constant bit length greater than the bit lengths of the respective digital codewords included in said first memory means as an output therefrom;
- speech synthesizer means connected to said parameter memory means for receiving the decoded digital speech parameters therefrom and providing an analog speech signal representative of synthesized human speech as an output in response thereto; and
- audio means coupled to the output of said speech synthesizer means for converting said analog speech signal representative of synthesized human speech into audible speech.
  - 18. A speech synthesis system as set forth in claim 17, wherein the digital codewords stored in said first memory means are based upon linear predictive coding, and said speech synthesizer means is a linear predictive coding speech synthesizer.
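
The decoding path of claim 17 can be sketched in the same spirit: variable-length codewords are mapped back to fixed-width level indices using the previously decoded frame, and each index then addresses a table of parameter values. The function names, the `codebooks[j][prev][cur]` layout, the fixed-length first frame, and the example value tables are all assumptions of this sketch; the synthesizer and audio stages are not modeled.

```python
def decode_frame(bits, pos, prev_frame, codebooks, bits_per_param):
    """Decode one frame starting at bit offset `pos`.

    Returns (levels, new_pos). `prev_frame` is None for the first frame,
    which this sketch assumes was sent with fixed-length words.
    """
    levels = []
    for j, width in enumerate(bits_per_param):
        if prev_frame is None:
            levels.append(int(bits[pos:pos + width], 2))
            pos += width
            continue
        # Invert the row code selected by the previously decoded level.
        inverse = {word: lvl for lvl, word in codebooks[j][prev_frame[j]].items()}
        end = pos + 1
        # Because the code is prefix-free, the first match is the codeword.
        while bits[pos:end] not in inverse:
            end += 1
            if end > len(bits):
                raise ValueError("truncated or invalid bitstream")
        levels.append(inverse[bits[pos:end]])
        pos = end
    return levels, pos


def lookup_parameters(levels, value_tables):
    """Map each decoded level index to its stored parameter value."""
    return [value_tables[j][lvl] for j, lvl in enumerate(levels)]
```

The decoder must track the same previous frame as the encoder, which is why it works from the previously decoded frame rather than from any side information in the bitstream.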

Current Assignee: Texas Instruments, Inc.
Original Assignee: Texas Instruments, Inc.
Inventors: Papamichalis, Panagiotis E.
Primary Examiner(s): KEMENY, EMANUEL
Application Number: US06/609,155
Time in Patent Office: 1,334 Days
Field of Search: 381/34, 381/51, 381/31, 358/261
US Class Current: 704/222
CPC Class Codes:
G10L 15/14 using statistical models, e...
H03M 7/42 using table look-up for the...
