Apparatus and method for hybrid excited linear prediction speech encoding
Abstract
A method is given of encoding a speech signal using analysis-by-synthesis to perform a flexible selection of the excitation waveforms in combination with an efficient bit allocation. This approach yields improved speech quality compared to other methods at similar bit rates.
136 Claims
1. A method of creating an excitation signal associated with a segment of input speech, the method comprising:
a. forming a spectral signal representative of the spectral parameters of the segment of input speech;
b. creating a set of excitation candidate signals, the set having at least one member, each excitation candidate signal comprised of a sequence of single waveforms, each waveform having a type, the sequence having at least one waveform, wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform;
c. forming a set of error signals, the set having at least one member, each error signal providing a measure of the accuracy with which the spectral signal and a given one of the excitation candidate signals encode the input speech segment;
d. selecting as the excitation signal an excitation candidate for which the corresponding error signal is indicative of sufficiently accurate encoding; and
e. if no excitation signal is selected, recursively creating a set of new excitation candidate signals according to step (b) wherein the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals, and repeating steps (c)-(e).
Dependent claims: 2-22
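Step (b) of claim 1 requires that every waveform after the first has its position encoded relative to a preceding waveform. The following is a minimal sketch of one way such an encoding could look, not the patent's bit format. The `Waveform` class, the integer `kind` field, and the choice of the immediately preceding waveform as the reference are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Waveform:
    kind: int      # hypothetical index into a table of waveform types
    position: int  # absolute sample position inside the speech segment


def encode_positions(sequence: List[Waveform]) -> List[Tuple[int, int]]:
    """Return (kind, offset) pairs.

    The first offset is the absolute position; every later offset is the
    distance from the immediately preceding waveform (an assumption).
    """
    codes = []
    previous = 0
    for index, waveform in enumerate(sequence):
        offset = waveform.position if index == 0 else waveform.position - previous
        codes.append((waveform.kind, offset))
        previous = waveform.position
    return codes


def decode_positions(codes: List[Tuple[int, int]]) -> List[Waveform]:
    """Invert encode_positions by accumulating the relative offsets."""
    sequence = []
    position = 0
    for index, (kind, offset) in enumerate(codes):
        position = offset if index == 0 else position + offset
        sequence.append(Waveform(kind, position))
    return sequence


example = [Waveform(0, 5), Waveform(1, 12), Waveform(0, 30)]
codes = encode_positions(example)
```

Relative offsets are typically smaller numbers than absolute positions, which is one plausible reason such a scheme helps with bit allocation; the claim itself does not state the motivation.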
23. An excitation signal generator for use in encoding segments of input speech, the generator comprising:
a. a spectral signal analyzer for forming a spectral signal representative of the spectral parameters of the segment of input speech;
b. an excitation candidate generator for creating a set of excitation candidate signals, the set having at least one member, each excitation candidate signal comprised of a sequence of single waveforms, each waveform having a type, the sequence having at least one waveform, wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform;
c. an error signal generator for forming a set of error signals, the set having at least one member, each error signal providing a measure of the accuracy with which the spectral signal and a given one of the excitation candidate signals encode the input speech segment;
d. an excitation signal selector for selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate coding; and
e. a feedback loop including the excitation candidate generator and the error signal generator configured so that the excitation candidate generator, if no excitation signal is selected, recursively creates a set of new excitation candidate signals such that the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals.
Dependent claims: 24-43
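The feedback loop recited above (generate candidates, measure errors, select, otherwise move a waveform and try again) can be sketched as a small search routine. This is an illustrative sketch under several assumptions that the claims do not specify: waveforms are unit pulses, the spectral signal is a list of linear prediction coefficients `lpc` driving an all-pole filter, the error is a plain squared error, new candidates shift one pulse by one sample, and the recursion is written as a loop. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def synthesize(excitation: Sequence[float], lpc: Sequence[float]) -> List[float]:
    """All-pole synthesis: y[n] = x[n] - sum_k lpc[k] * y[n-1-k]."""
    y: List[float] = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= a * y[n - 1 - k]
        y.append(acc)
    return y


def build_excitation(positions: Sequence[int], length: int) -> List[float]:
    """Place a unit pulse at each position (unit pulses are an assumption)."""
    excitation = [0.0] * length
    for p in positions:
        if 0 <= p < length:
            excitation[p] += 1.0
    return excitation


def squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_excitation(target: Sequence[float], lpc: Sequence[float],
                      positions: Sequence[int], threshold: float,
                      max_iters: int = 50) -> Tuple[List[int], float]:
    """Move one pulse at a time until the error is at or below the threshold."""
    n = len(target)
    best = list(positions)
    best_err = squared_error(target, synthesize(build_excitation(best, n), lpc))
    for _ in range(max_iters):
        if best_err <= threshold:
            break  # sufficiently accurate: select this candidate
        # New candidate set: each candidate shifts one pulse by one sample.
        candidates = []
        for i in range(len(best)):
            for step in (-1, 1):
                cand = list(best)
                cand[i] += step
                if 0 <= cand[i] < n:
                    candidates.append(cand)
        errors = [squared_error(target, synthesize(build_excitation(c, n), lpc))
                  for c in candidates]
        if not errors or min(errors) >= best_err:
            break  # no neighbouring candidate improves the error
        i_min = min(range(len(errors)), key=errors.__getitem__)
        best, best_err = candidates[i_min], errors[i_min]
    return best, best_err


lpc = [-0.5]  # hypothetical first-order predictor
target = synthesize(build_excitation([4, 11], 20), lpc)
best, err = search_excitation(target, lpc, [2, 13], threshold=1e-9)
```

A greedy one-sample shift is only one possible way to modify positions "in response to the set of error signals"; it can stall in a local minimum on real speech, so a practical coder would likely use a richer candidate set.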
44. A method of creating an excitation signal associated with a segment of input speech, the method comprising:
a. forming a spectral signal representative of the spectral parameters of the segment of input speech;
b. filtering the segment of input speech according to the spectral signal to form a perceptually weighted segment of input speech;
c. producing a reference signal representative of the segment of input speech by subtracting from the perceptually weighted segment of input speech a signal representative of any previous modeled excitation sequence of the current segment of input speech;
d. creating a set of excitation candidate signals, the set having at least one member, each excitation candidate signal comprised of a sequence of single waveforms, each waveform having a type, the sequence having at least one waveform, wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform;
e. combining a given one of the excitation candidate signals with the spectral signal to form a set of synthetic speech signals, the set having at least one member, each synthetic speech signal representative of the segment of input speech;
f. spectrally shaping each synthetic speech signal to form a set of perceptually weighted synthetic speech signals, the set having at least one member;
g. determining a set of error signals by comparing the reference signal representative of the segment of input speech to each member of the set of perceptually weighted synthetic speech signals;
h. selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding; and
i. if no excitation signal is selected, recursively creating a set of new excitation candidate signals according to step (d) wherein the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals, and repeating steps (e)-(i).
Dependent claims: 45-67
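Steps (b), (c), and (e)-(g) of claim 44 describe the signal path that produces each error signal. The sketch below shows one possible arrangement, assuming the weighting filter W(z) = A(z) / A(z/gamma) that is common in linear prediction coders; the claim only says the filtering is done "according to the spectral signal", so this filter form, the `gamma` value, the squared-error comparison, and all function names are assumptions.

```python
from typing import List, Sequence


def fir(x: Sequence[float], b: Sequence[float]) -> List[float]:
    """y[n] = sum_k b[k] * x[n-k]."""
    return [sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
            for n in range(len(x))]


def allpole(x: Sequence[float], a: Sequence[float]) -> List[float]:
    """y[n] = x[n] - sum_k a[k] * y[n-1-k], i.e. filtering by 1/A(z)."""
    y: List[float] = []
    for n, value in enumerate(x):
        acc = value
        for k, coeff in enumerate(a):
            if n - 1 - k >= 0:
                acc -= coeff * y[n - 1 - k]
        y.append(acc)
    return y


def perceptual_weight(x: Sequence[float], lpc: Sequence[float],
                      gamma: float = 0.8) -> List[float]:
    """Apply W(z) = A(z) / A(z/gamma), with A(z) = 1 + sum_k lpc[k] z^-(k+1)."""
    numerator = [1.0] + list(lpc)
    denominator = [a * gamma ** (k + 1) for k, a in enumerate(lpc)]
    return allpole(fir(x, numerator), denominator)


def reference_signal(speech: Sequence[float], lpc: Sequence[float],
                     previous_contribution: Sequence[float],
                     gamma: float = 0.8) -> List[float]:
    """Steps (b)-(c): weight the speech, then subtract what is already modeled."""
    weighted = perceptual_weight(speech, lpc, gamma)
    return [w - p for w, p in zip(weighted, previous_contribution)]


def weighted_error(reference: Sequence[float], excitation: Sequence[float],
                   lpc: Sequence[float], gamma: float = 0.8) -> float:
    """Steps (e)-(g): synthesize, spectrally shape, compare with the reference."""
    synthetic = allpole(excitation, lpc)
    shaped = perceptual_weight(synthetic, lpc, gamma)
    return sum((r - s) ** 2 for r, s in zip(reference, shaped))


lpc = [-0.9, 0.2]            # hypothetical, stable second-order predictor
excitation = [0.0] * 16
excitation[3] = 1.0
speech = allpole(excitation, lpc)
reference = reference_signal(speech, lpc, [0.0] * 16)
```

Here `previous_contribution` stands for the signal "representative of any previous modeled excitation sequence"; it is passed in already weighted, which is one reading of the claim and not the only one.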
68. An excitation signal generator for use in encoding segments of input speech, the generator comprising:
a. a spectral signal analyzer for forming a spectral signal representative of the spectral parameters of the segment of input speech;
b. a de-emphasis filter which filters the segment of input speech according to the spectral signal to form a perceptually weighted segment of input speech;
c. a reference signal generator which produces a reference signal representative of the segment of input speech by subtracting from the perceptually weighted segment of input speech a signal representative of any previously modeled excitation sequence of the current segment of input speech;
d. an excitation candidate generator for creating a set of excitation candidate signals, the set having at least one member, each excitation candidate signal comprised of a sequence of single waveforms, each waveform having a type, the sequence having at least one waveform, wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform;
e. a synthesis filter which combines a given one of the excitation candidate signals with the spectral signal to form a set of synthetic speech signals, the set having at least one member, each synthetic speech signal representative of the segment of input speech;
f. a spectral shaping filter which shapes each synthetic speech signal to form a set of perceptually weighted synthetic speech signals, the set having at least one member;
g. a signal comparator which determines a set of error signals by comparing the reference signal representative of the segment of input speech to each member of the set of perceptually weighted synthetic speech signals;
h. an excitation signal selector for selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding; and
i. a feedback loop including the excitation candidate generator and the error signal generator configured so that the excitation candidate generator, if no excitation signal is selected, recursively creates a set of new excitation candidate signals such that the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals.
Dependent claims: 69-89
90. A method of creating an excitation signal associated with a segment of input speech, the method comprising:
a. forming a spectral signal representative of the spectral parameters of the segment of input speech;
b. creating a set of excitation candidate signals, the set having at least one member, each excitation candidate signal composed of members from a plurality of sets of excitation sequences, wherein each excitation sequence is comprised of a sequence of single waveforms, each waveform having a type, the sequence having at least one waveform, wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform;
c. forming a set of error signals, the set having at least one member, each error signal providing a measure of the accuracy with which the spectral signal and a given one of the excitation candidate signals encode the input speech segment;
d. selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding; and
e. if no excitation signal is selected, recursively creating a set of new excitation candidate signals according to step (b) wherein the position of at least one single waveform in at least one of the excitation sequences is modified in response to the error signal, and repeating steps (c)-(e).
Dependent claims: 91-113
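Claim 90 differs from claim 1 in that each excitation candidate is composed of members drawn from a plurality of sets of excitation sequences. One way to picture this is sketched below, assuming that a candidate takes exactly one sequence from each set and that the chosen sequences are combined by sample-wise addition. Both assumptions, along with the `shapes` table of waveform types and the function names, are illustrative and not taken from the claims.

```python
from itertools import product
from typing import Dict, Iterator, List, Sequence, Tuple

# An excitation sequence is a list of (waveform type, absolute position) pairs.
ExcitationSequence = List[Tuple[int, int]]


def render(sequence: ExcitationSequence, length: int,
           shapes: Dict[int, Sequence[float]]) -> List[float]:
    """Place the sample shape of each waveform type at its position."""
    out = [0.0] * length
    for kind, position in sequence:
        for k, value in enumerate(shapes[kind]):
            if 0 <= position + k < length:
                out[position + k] += value
    return out


def compose_candidates(sequence_sets: Sequence[Sequence[ExcitationSequence]],
                       length: int, shapes: Dict[int, Sequence[float]]
                       ) -> Iterator[Tuple[tuple, List[float]]]:
    """Yield one candidate per combination of one member from each set."""
    for combo in product(*sequence_sets):
        signal = [0.0] * length
        for sequence in combo:
            for n, value in enumerate(render(sequence, length, shapes)):
                signal[n] += value  # superposition is an assumption
        yield combo, signal


shapes = {0: [1.0], 1: [1.0, -1.0]}          # hypothetical waveform types
sequence_sets = [
    [[(0, 0)], [(0, 2)]],                    # first set: two sequences
    [[(1, 1)], [(1, 3)], [(1, 4)]],          # second set: three sequences
]
candidates = list(compose_candidates(sequence_sets, 6, shapes))
```

With two sets of two and three sequences, the sketch enumerates six candidates; an exhaustive product grows quickly, so a real encoder would presumably prune or search the combinations instead.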
114. An excitation signal generator for use in encoding segments of input speech, the generator comprising:
a. a spectral signal analyzer for forming a spectral signal representative of the spectral parameters of the segment of input speech;
b. an excitation candidate generator for creating a set of excitation candidate signals, the set having at least one member, each excitation candidate signal composed of members from a plurality of sets of excitation sequences, wherein each excitation sequence is comprised of a sequence of single waveforms, each waveform having a type, the sequence having at least one waveform, wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform;
c. an error signal generator for forming a set of error signals, the set having at least one member, each error signal providing a measure of the accuracy with which the spectral signal and a given one of the excitation candidate signals encode the input speech segment;
d. an excitation signal selector for selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding; and
e. a feedback loop including the excitation candidate generator and the error signal generator configured so that the excitation candidate generator, if no excitation signal is selected, recursively creates a set of new excitation candidate signals such that the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals.
Dependent claims: 115-136
Specification