Method and system for encoding digital speech information
Abstract
Method and system for encoding digital speech information to characterize spoken human speech with an optimally reduced speech data rate while retaining speech quality in the audible reproduction of the encoded digital speech information. Markov modeling is applied to quantized speech parameters to represent their time behavior in a probabilistic manner. This is accomplished by representing the quantized speech parameters as finite state machines having predetermined matrices of transitional probabilities from which the conditional probabilities as to the quantized speech parameter values of successive speech data frames are established. The probabilistic description as so obtained is then used to represent the respective quantized values of the speech parameters by a digital code through Huffman coding in which digital codewords of variable length represent the quantized speech parameter values in accordance with their probability of occurrence such that more probable quantized values are assigned digital codewords of a shorter bit length while less probable quantized values are assigned digital codewords of a longer bit length.
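The Markov modeling step described in the abstract can be illustrated with a minimal sketch that estimates a matrix of transitional probabilities from the frame-to-frame behavior of one quantized speech parameter. The function name `transition_matrix`, the three-level quantizer, the uniform fallback for unseen rows, and the sample `track` are assumptions made for illustration; none of them comes from the patent text.

```python
def transition_matrix(levels, num_levels):
    """Estimate P(next level = j | current level = i) for one quantized parameter.

    `levels` is the sequence of quantization indices the parameter takes
    in successive speech data frames.
    """
    counts = [[0] * num_levels for _ in range(num_levels)]
    # Count how often level `cur` in one frame is followed by `nxt` in the next.
    for cur, nxt in zip(levels, levels[1:]):
        counts[cur][nxt] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        if total:
            matrix.append([c / total for c in row])
        else:
            # Assumption: a level never seen as "current" gets a uniform row.
            matrix.append([1.0 / num_levels] * num_levels)
    return matrix


# Hypothetical track of one parameter over ten frames (3 quantization levels).
track = [0, 0, 1, 1, 1, 2, 1, 1, 0, 0]
P = transition_matrix(track, 3)
for i, row in enumerate(P):
    print(i, [round(p, 3) for p in row])
```

Each row of `P` is the conditional distribution of the next frame's quantization value given the current one, which is the quantity the variable-length code is then built from.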
18 Claims
1. A method of encoding digital speech information to characterize spoken human speech with an optimally reduced speech data rate while retaining speech quality in the audible reproduction of the encoded digital speech information, said method comprising:
storing digital speech information as digital speech data in the form of quantized speech parameter values comprising a plurality of speech data frames;
determining the transition probabilities for corresponding quantized speech parameter values in the next successive speech data frame in relation to the current speech data frame;
establishing the conditional probabilities as to the quantization values of the speech parameters of successive speech data frames based upon the determination of the transition probabilities; and
representing the respective quantization values of the speech parameters after the conditional probabilities have been established by a digital code wherein digital codewords of variable length represent quantization values in accordance with their probability of occurrence such that more probable quantization values are assigned digital codewords of a shorter bit length while less probable quantization values are assigned digital codewords of a longer bit length.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
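The final step of claim 1 assigns shorter codewords to more probable quantization values. The abstract names Huffman coding for this purpose, so the following is a minimal sketch of a Huffman code built for a single row of conditional probabilities. The function name `huffman_code` and the probabilities in `row` are hypothetical and chosen only to make the length ordering visible.

```python
import heapq
from itertools import count


def huffman_code(probs):
    """Return {quantization value: bit string} for one conditional distribution."""
    if len(probs) == 1:
        # Degenerate case: a single possible value still needs one bit here.
        return {next(iter(probs)): "0"}

    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(p, next(tie), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)

    while len(heap) > 1:
        # Merge the two least probable subtrees, prefixing their codewords.
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + bits for s, bits in c0.items()}
        merged.update({s: "1" + bits for s, bits in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))

    return heap[0][2]


# Hypothetical conditional probabilities of the next value given the current one.
row = {0: 0.05, 1: 0.15, 2: 0.60, 3: 0.20}
code = huffman_code(row)
for value in sorted(code, key=lambda v: -row[v]):
    print(value, row[value], code[value])
```

In a scheme like the one claimed, one such code would be built per row of the transition matrix, so the codeword table in use changes with the value observed in the previous frame.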
10. A speech encoding system for providing encoded digital speech information in a form producing an optimally reduced speech data rate while retaining speech quality in the subsequent audible reproduction of the encoded digital speech information, said system comprising:
first memory means storing a plurality of digital codewords representative of the respective quantization values to be attributed to speech parameters as derived from finite state machines having predetermined matrices of columns and rows of transitional probabilities representative of the quantized speech parameter values wherein the digital codewords corresponding to a given predetermined matrix are of variable bit lengths in accordance with the probability of occurrence of a given quantization value such that more probable quantization values are represented by digital codewords of a shorter bit length while less probable quantization values are represented by digital codewords of a longer bit length;
second memory means having a storage capacity sufficient to accept at least a single frame of digital speech data wherein the digital speech parameters included in said frames of speech data are in quantized form; and being adapted to receive respective frames of digital speech data from a source thereof;
coding means for encoding frames of digital speech data wherein the digital speech parameters thereof are in quantized form, said coding means being operably coupled to said first and second memory means and to a source of digital speech data in quantized form; and
said coding means being responsive to a current frame of digital speech data as input thereto and to at least a single previous frame of digital speech data from said second memory means to access the appropriate digital codewords from said first memory means for assigning a digital codeword from said first memory means to each of the quantized speech parameters included in the current frame of digital speech data as the output therefrom.
Dependent claims: 11, 12, 13, 14, 15, 16
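The arrangement in claim 10 can be sketched in software as a small encoder object. In this illustration `codebooks` stands in for the first memory means (one codeword table per parameter and per previous level) and `previous` stands in for the second memory means holding the last frame. The class name `FrameEncoder`, the hand-written `TABLE`, and the choice to send the very first frame with fixed-length codewords are all assumptions, since the claim does not say how the first frame is handled.

```python
# Hypothetical prefix-free codeword tables: TABLE[previous level][current level].
# Staying at the same level is treated as most probable and gets the 1-bit word.
TABLE = {
    0: {0: "0", 1: "10", 2: "11"},
    1: {0: "10", 1: "0", 2: "11"},
    2: {0: "11", 1: "10", 2: "0"},
}


class FrameEncoder:
    def __init__(self, codebooks, fixed_bits):
        self.codebooks = codebooks    # plays the role of the first memory means
        self.fixed_bits = fixed_bits  # width used for the first frame (assumption)
        self.previous = None          # plays the role of the second memory means

    def encode(self, frame):
        """Encode one frame, given as a tuple of quantization levels."""
        if self.previous is None:
            # No previous frame yet: fall back to fixed-length codewords.
            bits = "".join(format(lvl, f"0{self.fixed_bits}b") for lvl in frame)
        else:
            # Select each parameter's table by its level in the previous frame.
            bits = "".join(
                self.codebooks[k][prev][lvl]
                for k, (prev, lvl) in enumerate(zip(self.previous, frame))
            )
        self.previous = tuple(frame)
        return bits


# Two parameters per frame, three levels each (hypothetical data).
encoder = FrameEncoder(codebooks=[TABLE, TABLE], fixed_bits=2)
frames = [(1, 0), (1, 0), (1, 1), (2, 1)]
encoded = [encoder.encode(f) for f in frames]
print(encoded, sum(len(b) for b in encoded), "bits")
```

With these invented tables the four frames take 12 bits, against 16 bits if every parameter were sent with a fixed 2-bit codeword. The saving depends entirely on how well the tables match the real transition statistics.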
17. A speech synthesis system for producing audible synthesized speech at a reduced bit rate from encoded digital speech information, said speech synthesis system comprising:
a source of digital speech information identified as one or more frames of encoded digital speech data having speech parameters defining the respective digital speech frames, wherein each of the speech parameters is represented by a respective digital codeword representative of the quantization value thereof, the digital codewords being derived from finite state machines having predetermined matrices of columns and rows of transitional probabilities representative of the quantized speech parameter values wherein the digital codewords correspond to a given predetermined matrix and are of variable bit lengths in accordance with the probability of occurrence of a given quantization value such that more probable quantization values are represented by digital codewords of a shorter bit length while less probable quantization values are represented by digital codewords of a longer bit length;
first memory means storing a plurality of digital code words representative of speech parameters, wherein each speech parameter in successive speech frames is identified by a codeword of a constant bit length and serving as an address identifying a digital speech parameter of a fixed bit number length;
second memory means having a storage capacity sufficient to accept at least a single frame of digital speech data wherein the digital speech parameters included in said frame of speech data are defined by digital codewords of a constant bit length for respective parameters in successive digital speech frames;
decoding means for decoding frames of digital speech data and being operably coupled to said source of encoded speech data and said first and second memory means, said decoding means being responsive to a current frame of digital speech data as input thereto and to at least a single previous decoded frame of digital speech data from said second memory means to access the appropriate digital codewords of constant bit length for respective speech parameters from said first memory means for assigning a digital codeword from said first memory means to each of said speech parameters included in the current frame of encoded digital speech data as the output therefrom;
parameter memory means connected to the output of said decoder means and having a plurality of digital speech parameter values stored therein identifiable by respective digital codewords from said first memory means and responsive to the output from said decoder means for providing decoded digital speech parameters of a constant bit length greater than the bit lengths of the respective digital codewords included in said first memory means as an output therefrom;
speech synthesizer means connected to said parameter memory means for receiving the decoded digital speech parameters therefrom and providing an analog speech signal representative of synthesized human speech as an output in response thereto; and
audio means coupled to the output of said speech synthesizer means for converting said analog speech signal representative of synthesized human speech into audible speech.
Dependent claims: 18
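The decoding path of claim 17 can be sketched as follows, covering only the steps up to the parameter lookup (the synthesizer and audio stages are outside the scope of this illustration). The decoder uses the previously decoded frame to pick the codeword table, recovers a fixed-length level index for each parameter, and then uses that index as an address into a table of parameter values. `TABLE`, `PARAM_VALUES`, the function `decode_stream`, the sample bit string, and the fixed-length first frame are hypothetical.

```python
# Hypothetical prefix-free codeword tables: TABLE[previous level][current level].
TABLE = {
    0: {0: "0", 1: "10", 2: "11"},
    1: {0: "10", 1: "0", 2: "11"},
    2: {0: "11", 1: "10", 2: "0"},
}

# Hypothetical parameter memory: PARAM_VALUES[parameter][level] -> decoded value.
PARAM_VALUES = [
    [80.0, 120.0, 160.0],  # e.g. a pitch-like parameter
    [0.1, 0.4, 0.9],       # e.g. an energy-like parameter
]


def decode_stream(bits, num_frames, codebooks, fixed_bits):
    """Recover per-frame level indices from a concatenated bit string."""
    pos = 0
    previous = None  # last decoded frame, the analogue of the second memory means
    frames = []
    for _ in range(num_frames):
        frame = []
        for k in range(len(codebooks)):
            if previous is None:
                # Assumption: the first frame uses fixed-length codewords.
                level = int(bits[pos:pos + fixed_bits], 2)
                pos += fixed_bits
            else:
                # Invert the table selected by this parameter's previous level.
                inverse = {cw: lvl for lvl, cw in codebooks[k][previous[k]].items()}
                word = ""
                while word not in inverse:  # prefix-free, so first match is right
                    word += bits[pos]
                    pos += 1
                level = inverse[word]
            frame.append(level)
        previous = tuple(frame)
        frames.append(previous)
    return frames


bits = "010000010110"
levels = decode_stream(bits, num_frames=4, codebooks=[TABLE, TABLE], fixed_bits=2)
values = [[PARAM_VALUES[k][lvl] for k, lvl in enumerate(f)] for f in levels]
print(levels)
print(values)
```

This sketch does no error handling: a truncated or corrupted bit string raises an `IndexError`, and because each table depends on the previous decoded frame, a single bit error would propagate to later frames.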
Specification