Apparatus and method for hybrid excited linear prediction speech encoding

US 5,963,897 A
Filed: 02/27/1998
Issued: 10/05/1999
Est. Priority Date: 02/27/1998
Status: Expired due to Term

8 Assignments

0 Petitions

Abstract

A method is provided for encoding a speech signal that uses analysis-by-synthesis to select excitation waveforms flexibly, in combination with an efficient bit allocation. This approach yields improved speech quality compared to other methods at similar bit rates.

28 Citations

136 Claims

1. A method of creating an excitation signal associated with a segment of input speech, the method comprising:
- a. forming a spectral signal representative of the spectral parameters of the segment of input speech;
  
  b. creating a set of excitation candidate signals, the set having at least one member, each excitation candidate signal comprised of a sequence of single waveforms, each waveform having a type, the sequence having at least one waveform, wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform;
  
  c. forming a set of error signals, the set having at least one member, each error signal providing a measure of the accuracy with which the spectral signal and a given one of the excitation candidate signals encode the input speech segment;
  
  d. selecting as the excitation signal an excitation candidate for which the corresponding error signal is indicative of sufficiently accurate encoding; and
  
  e. if no excitation signal is selected, recursively creating a set of new excitation candidate signals according to step (b) wherein the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals, and repeating steps (c)-(e).
- - 2. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein step (a) further includes composing the spectral signal of linear predictive coefficients.
  - 3. A method of creating an excitation signal associated with a segment of input speech according to claim 1, further including extracting from the segment of input speech selected parameters indicative of redundant information present in the segment of input speech.
  - 4. A method of creating an excitation signal associated with a segment of input speech according to claim 3, wherein in step (b), at least one excitation candidate is further responsive to the selected parameters indicative of redundant information present in the segment of input speech.
  - 5. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein in step (b), the first single waveform in a given one of the excitation candidate signals is positioned with respect to the beginning of the segment of input speech.
  - 6. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein in step (b), the relative positions of subsequent single waveforms are determined dynamically.
  - 7. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein in step (b), the relative positions of subsequent single waveforms are determined by use of a table of allowable positions.
  - 8. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein in step (b), the single waveforms include at least one of:
    - glottal pulse waveforms, sinusoidal period waveforms, and single pulses.
  - 9. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein in step (b), the single waveforms include at least one of:
    - quasi-stationary signal waveforms and non-stationary signal waveforms.
  - 10. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein in step (b), the single waveforms include at least one of:
    - substantially periodic waveforms, speech transition sound waveforms, flat spectra waveforms and non-periodic waveforms.
  - 11. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein in step (b), the types of single waveforms are pre-selected.
  - 12. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein in step (b), the types of single waveforms are dynamically selected.
  - 13. A method of creating an excitation signal associated with a segment of input speech as in claim 12, wherein the dynamic selection of the types of single waveforms is a function of the set of error signals.
  - 14. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein in step (b), the single waveforms are variable in length.
  - 15. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein in step (b), the single waveforms are fixed in length.
  - 16. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein in step (b), the number of single waveforms in the sequence is variable.
  - 17. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein in step (b), the number of single waveforms in the sequence is fixed.
  - 18. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein step (b) further includes applying any portion of a single waveform extending beyond the end of the current segment of input speech to the beginning of the current segment of input speech.
  - 19. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein step (b) further includes applying any portion of a single waveform extending beyond the end of the current segment of input speech to the beginning of the next segment of input speech.
  - 20. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein step (b) further includes ignoring any portion of a single waveform extending beyond the end of the current segment of input speech.
  - 21. A method of creating an excitation signal associated with a segment of input speech according to claim 1, wherein in step (b) at least one single waveform is modulated in accordance with a gain factor.
  - 22. A method of creating an excitation signal associated with a segment of input speech as in claim 1, wherein step (c) employs a synthesis filter.
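
Step (b) of claim 1 encodes each waveform position after the first relative to the preceding waveform, claim 5 positions the first waveform with respect to the beginning of the segment, and claims 18 to 20 list three ways of handling a waveform that runs past the end of the segment. The following is a minimal Python sketch of those ideas, not the patented implementation. The function names, the integer sample-index representation, and the `overflow` mode names are assumptions made for illustration.

```python
def encode_positions(positions):
    """Keep the first position as an offset from the segment start and store
    every later position as its distance from the preceding position."""
    deltas = []
    previous = 0
    for position in positions:
        deltas.append(position - previous)
        previous = position
    return deltas


def decode_positions(deltas):
    """Invert encode_positions by accumulating the relative offsets."""
    positions = []
    position = 0
    for delta in deltas:
        position += delta
        positions.append(position)
    return positions


def place_waveform(segment_len, position, waveform, overflow="ignore"):
    """Add one waveform into an empty segment starting at `position`.

    Hypothetical overflow modes, loosely mirroring claims 18 to 20:
      "wrap"   - the overhang is applied to the start of the current segment
      "carry"  - the overhang is returned for the start of the next segment
      "ignore" - the overhang is dropped
    Returns (segment, carry).
    """
    if overflow not in ("wrap", "carry", "ignore"):
        raise ValueError("unknown overflow mode: " + overflow)
    segment = [0.0] * segment_len
    carry = []
    for i, sample in enumerate(waveform):
        index = position + i
        if index < segment_len:
            segment[index] += sample
        elif overflow == "wrap":
            segment[index % segment_len] += sample
        elif overflow == "carry":
            carry.append(sample)
        # overflow == "ignore": the sample is discarded
    return segment, carry
```

For example, `encode_positions([3, 8, 15])` returns `[3, 5, 7]`, and `decode_positions([3, 5, 7])` recovers `[3, 8, 15]`. Relative offsets tend to be smaller numbers than absolute positions, which is one plausible reason such a scheme can save bits, although the claims themselves do not state the bit cost.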

23. An excitation signal generator for use in encoding segments of input speech, the generator comprising:
- a. a spectral signal analyzer for forming a spectral signal representative of the spectral parameters of the segment of input speech;
  
  b. an excitation candidate generator for creating a set of excitation candidate signals, the set having at least one member, each excitation candidate signal comprised of a sequence of single waveforms, each waveform having a type, the sequence having at least one waveform, wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform;
  
  c. an error signal generator for forming a set of error signals, the set having at least one member, each error signal providing a measure of the accuracy with which the spectral signal and a given one of the excitation candidate signals encode the input speech segment;
  
  d. an excitation signal selector for selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate coding; and
  
  e. a feedback loop including the excitation candidate generator and the error signal generator configured so that the excitation candidate generator, if no excitation signal is selected, recursively creates a set of new excitation candidate signals such that the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals.
- - 24. An excitation signal generator as in claim 23, wherein the spectral signal analyzer forms the spectral signal with linear predictive coefficients.
  - 25. An excitation signal generator as in claim 23 further including an extractor for extracting from the segment of input speech selected parameters indicative of redundant information present in the segment of input speech.
  - 26. An excitation signal generator as in claim 25, wherein the excitation candidate generator is responsive to the selected parameters indicative of redundant information present in the segment of input speech.
  - 27. An excitation signal generator as in claim 23, wherein the excitation candidate generator positions the first single waveform in at least one excitation candidate signal with respect to the beginning of the segment of input speech.
  - 28. An excitation signal generator as in claim 23, wherein the excitation candidate generator determines the relative positions of subsequent single waveforms dynamically.
  - 29. An excitation signal generator as in claim 23, wherein the excitation candidate generator determines the relative positions of subsequent single waveforms by use of a table of allowable positions.
  - 30. An excitation signal generator as in claim 23, wherein the excitation candidate generator uses single waveforms including at least one of:
    - glottal pulse waveforms, sinusoidal period waveforms, and single pulses.
  - 31. An excitation signal generator as in claim 23, wherein the excitation candidate generator uses single waveforms including at least one of:
    - quasi-stationary signal waveforms and non-stationary signal waveforms.
  - 32. An excitation signal generator as in claim 23, wherein the excitation candidate generator uses single waveforms including at least one of:
    - substantially periodic waveforms, speech transition sound waveforms, flat spectra waveforms and non-periodic waveforms.
  - 33. An excitation signal generator as in claim 23, wherein the excitation candidate generator preselects the types of single waveforms.
  - 34. An excitation signal generator as in claim 23, wherein the excitation candidate generator dynamically selects the types of single waveforms.
  - 35. An excitation signal generator as in claim 34, wherein the dynamic selection of the types of single waveforms is a function of the set of error signals.
  - 36. An excitation signal generator as in claim 23, wherein the excitation candidate generator uses variable length single waveforms.
  - 37. An excitation signal generator as in claim 23, wherein the excitation candidate generator uses fixed length single waveforms.
  - 38. An excitation signal generator as in claim 23, wherein the excitation candidate generator uses a variable number of single waveforms.
  - 39. An excitation signal generator as in claim 23, wherein the excitation candidate generator uses a fixed number of single waveforms.
  - 40. An excitation signal generator as in claim 23, wherein the excitation candidate generator applies any portion of a single waveform extending beyond the end of the current segment of input speech to the beginning of the current segment of input speech.
  - 41. An excitation signal generator as in claim 23, wherein the excitation candidate generator applies any portion of a single waveform extending beyond the end of the current segment of input speech to the beginning of the next segment of input speech.
  - 42. An excitation signal generator as in claim 23, wherein the excitation candidate generator ignores any portion of a single waveform extending beyond the end of the current segment of input speech.
  - 43. An excitation signal generator as in claim 23, wherein the excitation candidate generator modulates at least one single waveform in accordance with a gain factor.
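
The feedback loop in element (e) of claim 23 can be pictured as an analysis-by-synthesis search: candidates are synthesized, scored against the target, and, if none is accurate enough, regenerated with a waveform position moved in response to the errors. The sketch below is one simple way to realize such a loop and is not taken from the patent. It assumes unit single pulses as the only waveform type, an all-pole synthesis filter with coefficients `lpc`, a squared-error measure, and a greedy one-sample position adjustment. All function names and the `threshold` and `max_rounds` parameters are hypothetical.

```python
def synthesize(excitation, lpc):
    """All-pole synthesis 1/A(z): y[n] = x[n] - sum_k lpc[k] * y[n-1-k]."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out


def build_excitation(length, first_pos, deltas):
    """Unit pulses: the first at first_pos, each later one at an offset
    from its predecessor. Pulses outside the segment are dropped."""
    excitation = [0.0] * length
    position = first_pos
    for delta in (0,) + tuple(deltas):
        position += delta
        if 0 <= position < length:
            excitation[position] += 1.0
    return excitation


def squared_error(target, candidate):
    return sum((t - c) ** 2 for t, c in zip(target, candidate))


def search_excitation(target, lpc, first_pos, deltas, threshold, max_rounds=20):
    """Greedy search: stop once the error is at or below threshold; otherwise
    try moving one position by one sample and keep the best improvement."""
    length = len(target)

    def error_of(candidate):
        first, offsets = candidate
        synthetic = synthesize(build_excitation(length, first, offsets), lpc)
        return squared_error(target, synthetic)

    best = (first_pos, tuple(deltas))
    best_err = error_of(best)
    for _ in range(max_rounds):
        if best_err <= threshold:
            break  # sufficiently accurate: select this candidate
        first, offsets = best
        candidates = []
        for step in (-1, 1):
            if first + step >= 0:
                candidates.append((first + step, offsets))
            for i, offset in enumerate(offsets):
                if offset + step >= 1:  # keep pulses strictly ordered
                    moved = offsets[:i] + (offset + step,) + offsets[i + 1:]
                    candidates.append((first, moved))
        if not candidates:
            break
        err, candidate = min((error_of(c), c) for c in candidates)
        if err >= best_err:
            break  # no neighbouring candidate improves the match
        best, best_err = candidate, err
    return best, best_err
```

The function returns the best candidate found even when the threshold was never reached, so a caller would need to check the returned error. A practical coder would likely search waveform types and gains as well, which this sketch leaves out.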

44. A method of creating an excitation signal associated with a segment of input speech, the method comprising:
- a. forming a spectral signal representative of the spectral parameters of the segment of input speech;
  
  b. filtering the segment of input speech according to the spectral signal to form a perceptually weighted segment of input speech;
  
  c. producing a reference signal representative of the segment of input speech by subtracting from the perceptually weighted segment of input speech a signal representative of any previously modeled excitation sequence of the current segment of input speech;
  
  d. creating a set of excitation candidate signals, the set having at least one member, each excitation candidate signal comprised of a sequence of single waveforms, each waveform having a type, the sequence having at least one waveform, wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform;
  
  e. combining a given one of the excitation candidate signals with the spectral signal to form a set of synthetic speech signals, the set having at least one member, each synthetic speech signal representative of the segment of input speech;
  
  f. spectrally shaping each synthetic speech signal to form a set of perceptually weighted synthetic speech signals, the set having at least one member;
  
  g. determining a set of error signals by comparing the reference signal representative of the segment of input speech to each member of the set of perceptually weighted synthetic speech signals;
  
  h. selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding; and
  
  i. if no excitation signal is selected, recursively creating a set of new excitation candidate signals according to step (d) wherein the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals, and repeating steps (e)-(i).
- - 45. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein step (a) further includes composing the spectral signal of linear predictive coefficients.
  - 46. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein step (c) further includes subtracting a contribution due to previously modeled excitation in the current segment of input speech.
  - 47. A method of creating an excitation signal associated with a segment of input speech according to claim 44, further including extracting from the segment of input speech selected parameters indicative of redundant information present in the segment of input speech.
  - 48. A method of creating an excitation signal associated with a segment of input speech according to claim 47, wherein in step (d), the set of excitation candidate signals is further responsive to the selected parameters indicative of redundant information present in the segment of input speech.
  - 49. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein in step (d), the first single waveform in a given one of the excitation candidate signals is positioned with respect to the beginning of the segment of input speech.
  - 50. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein in step (d), the relative positions of subsequent single waveforms are determined dynamically.
  - 51. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein in step (d), the relative positions of subsequent single waveforms are determined by use of a table of allowable positions.
  - 52. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein in step (d), the single waveforms include at least one of:
    - glottal pulse waveforms, sinusoidal period waveforms, and single pulses.
  - 53. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein in step (d), the single waveforms include at least one of:
    - quasi-stationary signal waveforms and non-stationary signal waveforms.
  - 54. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein in step (d), the single waveforms include at least one of:
    - substantially periodic waveforms, speech transition sound waveforms, flat spectra waveforms and non-periodic waveforms.
  - 55. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein in step (d), the types of single waveforms are pre-selected.
  - 56. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein in step (d), the types of single waveforms are dynamically selected.
  - 57. A method of creating an excitation signal associated with a segment of input speech as in claim 56, wherein the dynamic selection of the types of single waveforms is a function of the set of error signals.
  - 58. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein in step (d), the single waveforms are variable in length.
  - 59. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein in step (d), the single waveforms are fixed in length.
  - 60. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein in step (d), the number of single waveforms in the sequence is variable.
  - 61. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein in step (d), the number of single waveforms in the sequence is fixed.
  - 62. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein step (d) further includes applying any portion of a single waveform extending beyond the end of the current segment of input speech to the beginning of the current segment of input speech.
  - 63. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein step (d) further includes applying any portion of a single waveform extending beyond the end of the current segment of input speech to the beginning of the next segment of input speech.
  - 64. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein step (d) further includes ignoring any portion of a single waveform extending beyond the end of the current segment of input speech.
  - 65. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein in step (d) at least one single waveform is modulated in accordance with a gain factor.
  - 66. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein step (e) employs a synthesis filter.
  - 67. A method of creating an excitation signal associated with a segment of input speech as in claim 44, wherein step (f) employs a de-emphasis filter.
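
Steps (b), (c), (f) and (g) of claim 44 compare a perceptually weighted reference with perceptually weighted synthetic speech. The claims do not give the weighting filter itself. The sketch below assumes the form commonly used in linear-prediction coders, W(z) = A(z) / A(z/γ) with A(z) = 1 + Σ a_k z^-k, and applies the same filter to both signals. The filter form, the value of `gamma`, and all function names are assumptions for illustration only.

```python
def bandwidth_expand(lpc, gamma):
    """Coefficients of A(z/gamma): a_k is scaled by gamma**k, k = 1..p."""
    return [a * gamma ** (k + 1) for k, a in enumerate(lpc)]


def fir_filter(signal, lpc):
    """A(z): y[n] = x[n] + sum_k lpc[k] * x[n-1-k]."""
    out = []
    for n, x in enumerate(signal):
        acc = x
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc += a * signal[n - 1 - k]
        out.append(acc)
    return out


def iir_filter(signal, lpc):
    """1/A(z): y[n] = x[n] - sum_k lpc[k] * y[n-1-k]."""
    out = []
    for n, x in enumerate(signal):
        acc = x
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out


def perceptual_weight(signal, lpc, gamma=0.8):
    """Assumed weighting W(z) = A(z) / A(z/gamma)."""
    return iir_filter(fir_filter(signal, lpc), bandwidth_expand(lpc, gamma))


def reference_signal(weighted_speech, previous_contribution):
    """Step (c): remove the contribution of previously modeled excitation."""
    return [w - p for w, p in zip(weighted_speech, previous_contribution)]


def weighted_error(reference, synthetic, lpc, gamma=0.8):
    """Steps (f) and (g): weight the synthetic speech, then compare."""
    weighted = perceptual_weight(synthetic, lpc, gamma)
    return sum((r - s) ** 2 for r, s in zip(reference, weighted))
```

With `gamma = 1.0` the numerator and denominator cancel and the filter passes the signal through unchanged, which gives a quick sanity check on the implementation.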

68. An excitation signal generator for use in encoding segments of input speech, the generator comprising:
- a. a spectral signal analyzer for forming a spectral signal representative of the spectral parameters of the segment of input speech;
  
  b. a de-emphasis filter which filters the segment of input speech according to the spectral signal to form a perceptually weighted segment of input speech;
  
  c. a reference signal generator which produces a reference signal representative of the segment of input speech by subtracting from the perceptually weighted segment of input speech a signal representative of any previously modeled excitation sequence of the current segment of input speech;
  
  d. an excitation candidate generator for creating a set of excitation candidate signals, the set having at least one member, each excitation candidate signal comprised of a sequence of single waveforms, each waveform having a type, the sequence having at least one waveform, wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform;
  
  e. a synthesis filter which combines a given one of the excitation candidate signals with the spectral signal to form a set of synthetic speech signals, the set having at least one member, each synthetic speech signal representative of the segment of input speech;
  
  f. a spectral shaping filter which shapes each synthetic speech signal to form a set of perceptually weighted synthetic speech signals, the set having at least one member;
  
  g. a signal comparator which determines a set of error signals by comparing the reference signal representative of the segment of input speech to each member of the set of perceptually weighted synthetic speech signals;
  
  h. an excitation signal selector for selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding; and
  
  i. a feedback loop including the excitation candidate generator and the error signal generator configured so that the excitation candidate generator, if no excitation signal is selected, recursively creates a set of new excitation candidate signals such that the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals.
- - 69. An excitation signal generator as in claim 68, wherein the spectral signal analyzer forms the spectral signal with linear predictive coefficients.
  - 70. An excitation signal generator as in claim 68, wherein the reference signal generator further includes means for subtracting a contribution due to previously modeled excitation in the current segment of input speech.
  - 71. An excitation signal generator as in claim 68 further including an extractor for extracting from the segment of input speech selected parameters indicative of redundant information present in the segment of input speech.
  - 72. An excitation signal generator as in claim 71, wherein the excitation candidate generator is responsive to the selected parameters indicative of redundant information present in the segment of input speech.
  - 73. An excitation signal generator as in claim 68, wherein the excitation candidate generator positions the first single waveform in a given one of the excitation candidate signals with respect to the beginning of the segment of input speech.
  - 74. An excitation signal generator as in claim 68, wherein the excitation candidate generator determines the relative positions of subsequent single waveforms dynamically.
  - 75. An excitation signal generator as in claim 68, wherein the excitation candidate generator determines the relative positions of subsequent single waveforms by use of a table of allowable positions.
  - 76. An excitation signal generator as in claim 68, wherein the excitation candidate generator uses single waveforms including at least one of:
    - glottal pulse waveforms, sinusoidal period waveforms, and single pulses.
  - 77. An excitation signal generator as in claim 68, wherein the excitation candidate generator uses single waveforms including at least one of:
    - quasi-stationary signal waveforms and non-stationary signal waveforms.
  - 78. An excitation signal generator as in claim 68, wherein the excitation candidate generator uses single waveforms including at least one of:
    - substantially periodic waveforms, speech transition sound waveforms, flat spectra waveforms and non-periodic waveforms.
  - 79. An excitation signal generator as in claim 68, wherein the excitation candidate generator pre-selects the types of single waveforms.
  - 80. An excitation signal generator as in claim 68, wherein the excitation candidate generator dynamically selects the types of single waveforms.
  - 81. An excitation signal generator as in claim 80, wherein the dynamic selection of the types of single waveforms is a function of the set of error signals.
  - 82. An excitation signal generator as in claim 68, wherein the excitation candidate generator uses variable length single waveforms.
  - 83. An excitation signal generator as in claim 68, wherein the excitation candidate generator uses fixed length single waveforms.
  - 84. An excitation signal generator as in claim 68, wherein the excitation candidate generator uses a variable number of single waveforms.
  - 85. An excitation signal generator as in claim 68, wherein the excitation candidate generator uses a fixed number of single waveforms.
  - 86. An excitation signal generator as in claim 68, wherein the excitation candidate generator applies any portion of a single waveform extending beyond the end of the current segment of input speech to the beginning of the current segment of input speech.
  - 87. An excitation signal generator as in claim 68, wherein the excitation candidate generator applies any portion of a single waveform extending beyond the end of the current segment of input speech to the beginning of the next segment of input speech.
  - 88. An excitation signal generator as in claim 68, wherein the excitation candidate generator ignores any portion of a single waveform extending beyond the end of the current segment of input speech.
  - 89. An excitation signal generator as in claim 68, wherein the excitation candidate generator modulates at least one single waveform in accordance with a gain factor.
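
Claim 89 modulates a single waveform by a gain factor but does not say how that factor is chosen. One standard choice in analysis-by-synthesis coding, shown here purely as an assumption, is the least-squares gain that minimizes the squared error between a target signal and the scaled synthesized waveform. The function name and its arguments are hypothetical.

```python
def least_squares_gain(target, shape):
    """Return g minimizing sum((target[n] - g * shape[n])**2).

    Setting the derivative with respect to g to zero gives
    g = <target, shape> / <shape, shape>.
    """
    energy = sum(s * s for s in shape)
    if energy == 0.0:
        return 0.0  # an all-zero waveform cannot be scaled to match anything
    return sum(t * s for t, s in zip(target, shape)) / energy
```

For instance, `least_squares_gain([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])` returns `2.0`, since the target is exactly twice the waveform shape.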

90. A method of creating an excitation signal associated with a segment of input speech, the method comprising:
- a. forming a spectral signal representative of the spectral parameters of the segment of input speech;
  
  b. creating a set of excitation candidate signals, the set having at least one member, each excitation candidate signal composed of members from a plurality of sets of excitation sequences, wherein each excitation sequence is comprised of a sequence of single waveforms, each waveform having a type, the sequence having at least one waveform, wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform;
  
  c. forming a set of error signals, the set having at least one member, each error signal providing a measure of the accuracy with which the spectral signal and a given one of the excitation candidate signals encode the input speech segment;
  
  d. selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding; and
  
  e. if no excitation signal is selected, recursively creating a set of new excitation candidate signals according to step (b) wherein the position of at least one single waveform in at least one of the excitation sequences is modified in response to the error signal, and repeating steps (c)-(e).
- - 91. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein step (a) further includes composing the spectral signal of linear predictive coefficients.
  - 92. A method of creating an excitation signal associated with a segment of input speech according to claim 90, further including extracting from the segment of input speech selected parameters indicative of redundant information present in the segment of input speech.
  - 93. A method of creating an excitation signal associated with a segment of input speech according to claim 92, wherein in step (b), at least one of the excitation sequences is further responsive to the selected parameters indicative of redundant information present in the segment of input speech.
  - 94. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein step (b) further includes positioning the first single waveform in each excitation sequence with respect to the beginning of the segment of input speech.
  - 95. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein in step (b), in at least one excitation sequence the relative positions of subsequent single waveforms are determined dynamically.
  - 96. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein in step (b), in at least one excitation sequence the relative positions of subsequent single waveforms are determined by use of a table of allowable positions.
  - 97. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein in step (b), the single waveforms include at least one of:
    - glottal pulse waveforms, sinusoidal period waveforms, and single pulses.
  - 98. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein in step (b), the single waveforms include at least one of:
    - quasi-stationary signal waveforms and non-stationary signal waveforms.
  - 99. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein in step (b), the single waveforms include at least one of:
    - substantially periodic waveforms, speech transition sound waveforms, flat spectra waveforms and non-periodic waveforms.
  - 100. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein in step (b), the types of single waveforms are pre-selected for at least one of the excitation sequences.
  - 101. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein in step (b), the types of single waveforms are dynamically selected for at least one of the excitation sequences.
  - 102. A method of creating an excitation signal associated with a segment of input speech as in claim 101, wherein the dynamic selection of the types of single waveforms is a function of the set of error signals.
  - 103. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein in step (b), the single waveforms are variable in length.
  - 104. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein in step (b), the single waveforms are fixed in length.
  - 105. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein in step (b), the number of single waveforms in at least one of the excitation sequences is variable.
  - 106. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein in step (b), the number of single waveforms in at least one of the excitation sequences is fixed.
  - 107. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein, for at least one of the excitation sequences, step (b) further includes applying any portion of a single waveform extending beyond the end of the current segment of input speech to the beginning of the current segment of input speech.
  - 108. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein, for at least one of the excitation sequences, step (b) further includes applying any portion of a single waveform extending beyond the end of the current segment of input speech to the beginning of the next segment of input speech.
  - 109. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein, for at least one of the excitation sequences, step (b) further includes ignoring any portion of a single waveform extending beyond the end of the current segment of input speech.
  - 110. A method of creating an excitation signal associated with a segment of input speech according to claim 90, wherein in step (b) at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information.
  - 111. A method of creating an excitation signal associated with a segment of input speech according to claim 110, wherein the preselected redundancy information is pitch related information.
  - 112. A method of creating an excitation signal associated with a segment of input speech according to claim 90, wherein in step (b) at least one single waveform is modulated in accordance with a gain factor.
  - 113. A method of creating an excitation signal associated with a segment of input speech as in claim 90, wherein step (c) employs a synthesis filter.
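
The position and boundary handling recited in claims 94 and 107 to 109 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the function names `decode_positions` and `place_waveforms`, the policy labels `"wrap"`, `"carry"` and `"ignore"`, and the list-of-floats representation are all hypothetical. The first waveform is positioned relative to the segment start, and each later waveform is positioned relative to the one before it.

```python
def decode_positions(first_pos, deltas):
    """Turn a first position plus relative offsets into absolute positions."""
    positions = [first_pos]
    for delta in deltas:
        # Each later waveform is located relative to the preceding one.
        positions.append(positions[-1] + delta)
    return positions


def place_waveforms(seg_len, positions, waveform, gain=1.0, overflow="ignore"):
    """Build one segment of excitation; return (segment, carry).

    overflow policies (hypothetical labels):
      "wrap"   - overflow is applied to the start of the current segment
      "carry"  - overflow is returned for the start of the next segment
      "ignore" - overflow is dropped
    """
    if overflow not in ("wrap", "carry", "ignore"):
        raise ValueError("unknown overflow policy: %r" % overflow)
    segment = [0.0] * seg_len
    carry = [0.0] * len(waveform)
    for pos in positions:
        if not 0 <= pos < seg_len:
            raise ValueError("waveform must start inside the segment")
        for k, sample in enumerate(waveform):
            idx = pos + k
            value = gain * sample  # gain factor applied to the waveform
            if idx < seg_len:
                segment[idx] += value
            elif overflow == "wrap":
                segment[idx - seg_len] += value
            elif overflow == "carry":
                carry[idx - seg_len] += value
            # "ignore": the overflowing sample is discarded
    return segment, carry


positions = decode_positions(2, [3, 4])
segment, carry = place_waveforms(10, positions, [1.0, 0.5, 0.25], overflow="carry")
```

With the `"carry"` policy the caller would add `carry` to the start of the next segment's excitation. With `"wrap"` or `"ignore"` the returned `carry` stays all zeros.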

114. An excitation signal generator for use in encoding segments of input speech, the generator comprising:
- a. a spectral signal analyzer for forming a spectral signal representative of the spectral parameters of the segment of input speech;
  
  b. an excitation candidate generator for creating a set of excitation candidate signals, the set having at least one member, each excitation candidate signal composed of members from a plurality of sets of excitation sequences, wherein each excitation sequence is comprised of a sequence of single waveforms, each waveform having a type, the sequence having at least one waveform, wherein the position of any single waveform subsequent to the first single waveform is encoded relative to the position of a preceding single waveform;
  
  c. an error signal generator for forming a set of error signals, the set having at least one member, each error signal providing a measure of the accuracy with which the spectral signal and a given one of the excitation candidate signals encode the input speech segment;
  
  d. an excitation signal selector for selecting as the excitation signal an excitation candidate signal for which the corresponding error signal is indicative of sufficiently accurate encoding; and
  
  e. a feedback loop including the excitation candidate generator and the error signal generator configured so that the excitation candidate generator, if no excitation signal is selected, recursively creates a set of new excitation candidate signals such that the position of at least one single waveform in the sequence of at least one excitation candidate signal is modified in response to the set of error signals.
- View Dependent Claims (115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136)
  - 115. An excitation signal generator as in claim 114, wherein the spectral signal analyzer forms the spectral signal with linear predictive coefficients.
  - 116. An excitation signal generator as in claim 114 further including an extractor for extracting from the segment of input speech selected parameters indicative of redundant information present in the segment of input speech.
  - 117. An excitation signal generator as in claim 114, wherein the excitation candidate generator is responsive in at least one of the excitation sequences to the selected parameters indicative of redundant information present in the segment of input speech.
  - 118. An excitation signal generator as in claim 114, wherein the excitation candidate generator positions the first single waveform in each excitation sequence with respect to the beginning of the segment of input speech.
  - 119. An excitation signal generator as in claim 114, wherein the excitation candidate generator determines the relative positions of subsequent single waveforms in at least one of the excitation sequences dynamically.
  - 120. An excitation signal generator as in claim 114, wherein the excitation candidate generator determines the relative positions of subsequent single waveforms in at least one of the excitation sequences by use of a table of allowable positions.
  - 121. An excitation signal generator as in claim 114, wherein the excitation candidate generator uses single waveforms including at least one of:
    - glottal pulse waveforms, sinusoidal period waveforms, and single pulses.
  - 122. An excitation signal generator as in claim 114, wherein the excitation candidate generator uses single waveforms including at least one of:
    - quasi-stationary signal waveforms and non-stationary signal waveforms.
  - 123. An excitation signal generator as in claim 114, wherein the excitation candidate generator uses single waveforms including at least one of:
    - substantially periodic waveforms, speech transition sound waveforms, flat spectra waveforms and non-periodic waveforms.
  - 124. An excitation signal generator as in claim 114, wherein the excitation candidate generator pre-selects the types of single waveforms for at least one of the excitation sequences.
  - 125. An excitation signal generator as in claim 114, wherein the excitation candidate generator dynamically selects the types of single waveforms for at least one of the excitation sequences.
  - 126. An excitation signal generator as in claim 125, wherein the dynamic selection of the types of single waveforms is a function of the set of error signals.
  - 127. An excitation signal generator as in claim 114, wherein the excitation candidate generator uses variable length single waveforms.
  - 128. An excitation signal generator as in claim 114, wherein the excitation candidate generator uses fixed length single waveforms.
  - 129. An excitation signal generator as in claim 114, wherein the excitation candidate generator uses a variable number of single waveforms in at least one of the excitation sequences.
  - 130. An excitation signal generator as in claim 114, wherein the excitation candidate generator uses a fixed number of single waveforms in at least one of the excitation sequences.
  - 131. An excitation signal generator as in claim 114, wherein the excitation candidate generator in at least one of the excitation sequences applies any portion of a single waveform extending beyond the end of the current segment of input speech to the beginning of the current segment of input speech.
  - 132. An excitation signal generator as in claim 114, wherein the excitation candidate generator in at least one of the excitation sequences applies any portion of a single waveform extending beyond the end of the current segment of input speech to the beginning of the next segment of input speech.
  - 133. An excitation signal generator as in claim 114, wherein the excitation candidate generator in at least one of the excitation sequences ignores any portion of a single waveform extending beyond the end of the current segment of input speech.
  - 134. An excitation signal generator as in claim 114, wherein in the excitation candidate generator at least one of the plurality of sets of excitation sequences is associated with preselected redundancy information.
  - 135. An excitation signal generator as in claim 134, wherein the preselected redundancy information is pitch related information.
  - 136. An excitation signal generator as in claim 132, wherein the excitation candidate generator modulates at least one single waveform in accordance with a gain factor.
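
The feedback loop of claim 114 (candidate generator, error signal generator, selector) can be sketched as a small analysis-by-synthesis search in Python. This is a minimal sketch under stated assumptions, not the method of the specification: the names `synthesize`, `build_excitation`, `squared_error` and `search_excitation` are hypothetical, the error measure is assumed to be a plain squared error, the synthesis filter is assumed to be an all-pole filter 1/A(z) with A(z) = 1 + sum of a[k] z^-(k+1), and the position update is a simple greedy one-sample adjustment that can stall in a local minimum.

```python
def synthesize(excitation, lpc):
    """All-pole synthesis: s[n] = e[n] - sum_k lpc[k] * s[n-1-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out


def build_excitation(seg_len, first_pos, deltas, waveform):
    """Place one waveform type at relatively encoded positions."""
    segment = [0.0] * seg_len
    pos = first_pos
    for delta in [0] + list(deltas):
        pos += delta
        for k, sample in enumerate(waveform):
            if pos + k < seg_len:  # samples past the segment end are ignored
                segment[pos + k] += sample
    return segment


def squared_error(target, candidate):
    return sum((t - c) ** 2 for t, c in zip(target, candidate))


def search_excitation(target, lpc, waveform, first_pos, deltas,
                      threshold=1e-9, max_iters=50):
    """Adjust relative positions until the error is small enough."""
    def error_for(ds):
        excitation = build_excitation(len(target), first_pos, ds, waveform)
        return squared_error(target, synthesize(excitation, lpc))

    deltas = list(deltas)
    best = error_for(deltas)
    for _ in range(max_iters):
        if best <= threshold:  # selector: encoding is sufficiently accurate
            break
        improved = False
        for i in range(len(deltas)):
            for step in (-1, 1):
                trial = deltas[:]
                trial[i] += step  # modify one relative position
                if trial[i] < 1:
                    continue
                err = error_for(trial)
                if err < best:  # error signal drives the next candidate
                    best, deltas, improved = err, trial, True
        if not improved:
            break  # greedy search stalled
    return first_pos, deltas, best
```

A caller would pass the target segment, the filter coefficients, and an initial guess for the relative positions. The function returns the adjusted offsets together with the remaining error, so the caller can check whether the threshold was actually met.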

Current Assignee
Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee
Lernout & Hauspie Speech Products NV (Intel Corporation)
Inventors
Alpuente, Manel Guberna; Rasaminjanahary, Jean-Francois; Ferhaoui, Mohand; Van Compernolle, Dirk
Primary Examiner(s)
Dorvil, Richemond

Application Number
US09/031,522
Time in Patent Office
585 Days
Field of Search
704/219, 704/232, 704/229, 704/222, 704/220, 704/200, 704/262, 704/266, 704/223, 704/225
US Class Current
704/219
CPC Class Codes
G10L 19/13 Residual excited linear pre...
G10L 19/18 Vocoders using multiple modes