Speech encoding apparatus and speech encoding and decoding apparatus

US 6,052,661 A
Filed: 12/31/1996
Issued: 04/18/2000
Est. Priority Date: 05/29/1996
Status: Expired due to Fees

Abstract

A speech encoding apparatus capable of averting the deterioration of synthesis speech quality in encoding the input speech and of generating a high-quality synthesis output speech through small quantities of computation. The apparatus includes a target speech generation part for generating from the input speech a target speech vector of a vector length corresponding to a delay parameter; an adaptive codebook for generating from previously generated excitation signals an adaptive vector of the vector length corresponding to the delay parameter; an adaptive code search part for evaluating the distortion of a synthesis vector obtained from the adaptive vector with respect to the target speech vector so as to search for the adaptive vector conducive to the least distortion; and a frame code generation part for generating an excitation signal of a frame length from the adaptive vector conducive to the least distortion.
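The search described in the abstract can be illustrated with a minimal sketch. It assumes a zero-state synthesis filter given by its impulse response `h`, an optimal-gain squared-error distortion measure, and a per-sample normalization so that candidates of different vector lengths can be compared. None of these choices, nor the function names, come from the patent text; they are hypothetical stand-ins.

```python
from typing import Dict, List, Sequence, Tuple


def synthesize(vector: Sequence[float], h: Sequence[float]) -> List[float]:
    """Zero-state synthesis: convolve `vector` with the impulse response `h`,
    truncated to len(vector)."""
    return [
        sum(h[k] * vector[i - k] for k in range(min(i + 1, len(h))))
        for i in range(len(vector))
    ]


def distortion(target: Sequence[float], synth: Sequence[float]) -> Tuple[float, float]:
    """Return (distortion, gain) for the gain minimizing |target - gain*synth|^2."""
    target_energy = sum(t * t for t in target)
    energy = sum(s * s for s in synth)
    if energy == 0.0:
        return target_energy, 0.0
    corr = sum(t * s for t, s in zip(target, synth))
    return target_energy - corr * corr / energy, corr / energy


def search_adaptive_code(
    targets: Dict[int, Sequence[float]],
    past_excitation: Sequence[float],
    h: Sequence[float],
) -> Tuple[int, float, List[float]]:
    """Try each candidate delay; return (best_delay, gain, adaptive_vector).

    `targets` maps a candidate delay to a target speech vector of that length.
    Each delay is assumed to be no longer than the stored past excitation.
    """
    best = None
    for delay, target in targets.items():
        # The adaptive vector has the same length as the delay, not the frame.
        adaptive = list(past_excitation[-delay:])
        d, gain = distortion(target, synthesize(adaptive, h))
        per_sample = d / delay  # assumption: normalize to compare across lengths
        if best is None or per_sample < best[0]:
            best = (per_sample, delay, gain, adaptive)
    if best is None:
        raise ValueError("no candidate delays supplied")
    return best[1], best[2], best[3]
```

Because every vector in the search is only one delay long, each distortion evaluation touches fewer samples than a full-frame search would, which is consistent with the reduced computation the abstract claims.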

47 Claims

1. A speech encoding apparatus for dividing an input speech into spectrum envelope information and excitation signal information and for encoding said excitation signal information by the frame, said speech encoding apparatus comprising:
- target speech generation means for generating from said input speech a target speech vector of a vector length corresponding to a delay parameter;
  
  an adaptive codebook for generating from previously generated excitation signals an adaptive vector of said vector length corresponding to said delay parameter;
  
  adaptive code search means for evaluating the distortion of a synthesis vector obtained from said adaptive vector with respect to said target speech vector so as to search for an adaptive vector conducive to the least distortion; and
  
  frame excitation generation means for generating an excitation signal of a frame length from said adaptive vector conducive to the least distortion, wherein said vector length of said target speech vector and said vector length of said adaptive vector are less than said frame length.
- View Dependent Claims (2, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
- - 2. A speech encoding apparatus according to claim 1, further comprising:
    - second target speech generation means for generating a second target speech vector from said target speech vector and said adaptive vector conducive to the least distortion;
      
      a random codebook for generating a random vector of said vector length corresponding to said delay parameter;
      
      random code search means for evaluating the distortion of a second synthesis vector obtained from said random vector with respect to said second target speech vector so as to search for the random vector conducive to the least distortion; and
      
      second frame excitation generation means for generating a second excitation signal of the frame length from said random vector conducive to the least distortion.
  - 5. A speech encoding apparatus according to claim 1, wherein said vector length corresponding to said delay parameter is a rational number.
  - 6. A speech encoding apparatus according to claim 1, wherein said target speech generation means divides an input speech in a frame into portions each having said vector length corresponding to said delay parameter, and computes a weighted mean of the input speech portions each having said vector length so as to generate said target speech vector.
  - 7. A speech encoding apparatus according to claim 1, wherein said target speech generation means divides an input speech having the length of an integer multiple of said vector length corresponding to said delay parameter, into portions each having said vector length, and computes a weighted mean of the input speech portions so as to generate said target speech vector.
  - 8. A speech encoding apparatus according to claim 7, wherein said length of the integer multiple of said vector length corresponding to said delay parameter is equal to or greater than said frame length.
  - 9. A speech encoding apparatus according to claim 6, wherein said target speech generation means computes a weighted mean of said input speech by said vector length in accordance with the characteristic quantity of said input speech portions each having said vector length corresponding to said delay parameter, thereby determining the weight for generating said target speech vector.
  - 10. A speech encoding apparatus according to claim 9, wherein said characteristic quantity of said input speech portions each having said vector length corresponding to said delay parameter includes at least power information about said input speech.
  - 11. A speech encoding apparatus according to claim 9, wherein said characteristic quantity of said input speech portions each having said vector length corresponding to said delay parameter includes at least correlative information about said input speech.
  - 12. A speech encoding apparatus according to claim 6, wherein said target speech generation means computes a weighted mean of said input speech by said vector length in accordance with the temporal relationship of said input speech portions each having said vector length corresponding to said delay parameter, thereby determining the weight for generating said target speech vector.
  - 13. A speech encoding apparatus according to claim 6, wherein said target speech generation means fine-adjusts the temporal relationship of said input speech by said vector length when computing a weighted mean of said input speech portions each having said vector length corresponding to said delay parameter.
  - 14. A speech encoding apparatus according to claim 1, wherein said frame excitation generation means repeats at intervals of said vector length the excitation vector of said vector length corresponding to said delay parameter in order to acquire a periodical excitation vector, thereby generating said excitation signal of said frame length.
  - 15. A speech encoding apparatus according to claim 1, wherein said frame excitation generation means interpolates between frames the excitation vector of said vector length corresponding to said delay parameter, thereby generating said excitation signal.
  - 16. A speech encoding apparatus according to claim 1, wherein said adaptive code search means includes a synthesis filter and uses an impulse response from said synthesis filter to compute repeatedly the distortion of said synthesis vector obtained from said adaptive vector with respect to said target speech vector.
  - 17. A speech encoding apparatus according to claim 5, further comprising input speech up-sampling means for up-sampling said input speech, wherein said target speech generation means generates said target speech vector from the up-sampled input speech.
  - 18. A speech encoding apparatus according to claim 5, further comprising excitation signal up-sampling means for up-sampling previously generated excitation signals, wherein said adaptive codebook generates said adaptive vector from the up-sampled previously generated excitation signals.
  - 19. A speech encoding apparatus according to claim 17, wherein said input speech up-sampling means changes the up-sampling rate of the up-sampling operation in accordance with said delay parameter.
  - 20. A speech encoding apparatus according to claim 17, wherein said input speech up-sampling means changes the up-sampling rate of the up-sampling operation on either the input speech or the excitation signal only within a range based on said vector length corresponding to said delay parameter.
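The target vector generation recited in claims 6, 9 and 10 can be sketched as below. The claims state only that the weights depend on a characteristic quantity that includes power information; making each weight proportional to the power of its portion is an assumption made for this sketch, and the function names are hypothetical.

```python
from typing import List, Sequence


def split_into_portions(speech: Sequence[float], vector_length: int) -> List[List[float]]:
    """Split `speech` into consecutive portions of `vector_length` samples,
    dropping any incomplete tail."""
    if vector_length <= 0:
        raise ValueError("vector_length must be positive")
    count = len(speech) // vector_length
    return [list(speech[i * vector_length:(i + 1) * vector_length]) for i in range(count)]


def power_weights(portions: Sequence[Sequence[float]]) -> List[float]:
    """Assumed weighting rule: each weight is the portion's share of total power."""
    powers = [sum(x * x for x in portion) for portion in portions]
    total = sum(powers)
    if total == 0.0:
        # Silent input: fall back to a plain mean.
        return [1.0 / len(portions)] * len(portions)
    return [p / total for p in powers]


def target_vector(speech: Sequence[float], vector_length: int) -> List[float]:
    """Weighted mean of the delay-length portions of one frame of input speech."""
    portions = split_into_portions(speech, vector_length)
    if not portions:
        raise ValueError("speech is shorter than one vector length")
    weights = power_weights(portions)
    return [
        sum(w * portion[i] for w, portion in zip(weights, portions))
        for i in range(vector_length)
    ]
```

The result has the length of one delay rather than one frame, which is the property the independent claim relies on.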

3. A speech encoding apparatus for dividing an input speech into spectrum envelope information and excitation signal information and for encoding said excitation signal information by the frame, said speech encoding apparatus comprising:
- target speech generation means for generating from said input speech a target speech vector of a vector length corresponding to a delay parameter;
  
  a random codebook for generating a random vector of said vector length corresponding to said delay parameter;
  
  random code search means for evaluating the distortion of a synthesis vector obtained from said random vector with respect to said target speech vector so as to search for a random vector conducive to the least distortion; and
  
  frame excitation generation means for generating an excitation signal of a frame length from said random vector conducive to the least distortion, wherein said vector length of said target speech vector and said vector length of said random vector are less than said frame length.
- View Dependent Claims (4)
- - 4. A speech encoding apparatus according to claim 3, wherein said delay parameter is determined in accordance with the pitch period of said input speech.
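Claim 4 ties the delay parameter to the pitch period without saying how the pitch period is found. One common approach, used here purely as an assumption, is to pick the lag that maximizes the normalized autocorrelation of the input speech. The function name and the lag range parameters are hypothetical.

```python
import math
from typing import Sequence


def estimate_delay(speech: Sequence[float], min_delay: int, max_delay: int) -> int:
    """Return the lag in [min_delay, max_delay] with the highest normalized
    autocorrelation. Ties keep the smallest lag."""
    if not 0 < min_delay <= max_delay < len(speech):
        raise ValueError("need 0 < min_delay <= max_delay < len(speech)")
    best_delay, best_score = min_delay, float("-inf")
    for delay in range(min_delay, max_delay + 1):
        current = speech[delay:]
        lagged = speech[:len(speech) - delay]
        corr = sum(c * l for c, l in zip(current, lagged))
        energy = sum(c * c for c in current) * sum(l * l for l in lagged)
        score = corr / math.sqrt(energy) if energy > 0.0 else 0.0
        if score > best_score:
            best_delay, best_score = delay, score
    return best_delay
```

The returned lag would then serve as the vector length for the target speech vector and the random vector.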

21. A speech encoding and decoding apparatus for dividing an input speech into spectrum envelope information and excitation signal information, encoding said excitation signal information by the frame, and decoding the encoded excitation signal information so as to generate an output speech, the encoding side of said speech encoding and decoding apparatus comprising:
- target speech generation means for generating from said input speech a target speech vector of a vector length corresponding to a delay parameter;
  
  an adaptive codebook for generating from previously generated excitation signals an adaptive vector of said vector length corresponding to said delay parameter;
  
  adaptive code search means for evaluating the distortion of a synthesis vector obtained from said adaptive vector with respect to said target speech vector so as to search for an adaptive vector conducive to the least distortion; and
  
  frame excitation generation means for generating an excitation signal of a frame length from said adaptive vector conducive to the least distortion;
  
  the decoding side of said speech encoding and decoding apparatus comprising:
  
  an adaptive codebook for generating said adaptive vector of said vector length corresponding to said delay parameter; and
  
  frame excitation generation means for generating said excitation signal of said frame length from said adaptive vector, wherein said vector length of said target speech vector and said vector length of said adaptive vector are less than said frame length.
- View Dependent Claims (22)
- - 22. A speech encoding and decoding apparatus according to claim 21, wherein said encoding side further comprises:
    - second target speech generation means for generating a second target speech vector from said target speech vector and said adaptive vector;
      
      a random codebook for generating a random vector of said vector length corresponding to said delay parameter;
      
      random code search means for evaluating the distortion of a second synthesis vector obtained from said random vector with respect to said second target speech vector so as to search for the random vector conducive to the least distortion; and
      
      second frame excitation generation means for generating a second excitation signal of the frame length from said random vector conducive to the least distortion; and
      
      wherein said decoding side further comprises:
      
      a random codebook for generating said random vector of said vector length corresponding to said delay parameter; and
      
      second frame excitation generation means for generating said second excitation signal of said frame length from said random vector.
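The second-stage search of claim 22 can be sketched as follows. The sketch assumes that the second target is the first target minus the gain-scaled synthesis of the chosen adaptive vector, and that the random code search minimizes an optimal-gain squared error. Both are conventional choices rather than statements from the claim, and all names are hypothetical.

```python
from typing import List, Sequence, Tuple


def synthesize(vector: Sequence[float], h: Sequence[float]) -> List[float]:
    """Zero-state synthesis: convolve `vector` with the impulse response `h`,
    truncated to len(vector)."""
    return [
        sum(h[k] * vector[i - k] for k in range(min(i + 1, len(h))))
        for i in range(len(vector))
    ]


def second_target(
    target: Sequence[float], adaptive_synth: Sequence[float], gain: float
) -> List[float]:
    """Assumed rule: remove the adaptive contribution from the first target."""
    return [t - gain * s for t, s in zip(target, adaptive_synth)]


def search_random_code(
    codebook: Sequence[Sequence[float]], target2: Sequence[float], h: Sequence[float]
) -> Tuple[int, float]:
    """Return (index, gain) of the random vector with the least distortion.

    Minimizing |target2 - gain*synth|^2 over the gain is equivalent to
    maximizing corr^2 / energy, so only that score is compared.
    """
    best_index, best_gain, best_score = -1, 0.0, float("-inf")
    for index, vector in enumerate(codebook):
        synth = synthesize(vector, h)
        energy = sum(s * s for s in synth)
        if energy == 0.0:
            continue
        corr = sum(t * s for t, s in zip(target2, synth))
        score = corr * corr / energy
        if score > best_score:
            best_index, best_gain, best_score = index, corr / energy, score
    if best_index < 0:
        raise ValueError("codebook has no usable vectors")
    return best_index, best_gain
```

Every codebook entry is assumed to have the same length as the second target, that is, the vector length corresponding to the delay parameter.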

23. A speech encoding and decoding apparatus for dividing an input speech into spectrum envelope information and excitation signal information, encoding said excitation signal information by the frame, and decoding the encoded excitation signal information so as to generate an output speech, the encoding side of said speech encoding and decoding apparatus comprising:
- target speech generation means for generating from said input speech a target speech vector of a vector length corresponding to a delay parameter;
  
  a random codebook for generating a random vector of said vector length corresponding to said delay parameter;
  
  random code search means for evaluating the distortion of a synthesis vector obtained from said random vector with respect to said target speech vector so as to search for a random vector conducive to the least distortion; and
  
  frame excitation generation means for generating an excitation signal of a frame length from said random vector conducive to the least distortion;
  
  the decoding side of said speech encoding and decoding apparatus comprising:
  
  a random codebook for generating said random vector of said vector length corresponding to said delay parameter; and
  
  frame excitation generation means for generating said excitation signal of said frame length from said random vector, wherein said vector length of said target speech vector and said vector length of said random vector are less than said frame length.

24. A speech encoding apparatus for dividing an input speech into spectrum envelope information and excitation signal information and for encoding said excitation signal information by frame, said speech encoding apparatus comprising:
- an adaptive codebook for generating, from previously generated excitation signals of a frame length, an adaptive vector of a vector length corresponding to a delay parameter; and
  
  adaptive code search means for evaluating the distortion of a synthesis vector from said adaptive vector to determine an adaptive vector conducive to the least distortion of a vector length corresponding to a delay parameter conducive to the least distortion, wherein said vector length of said adaptive vector is less than said frame length, and said vector length of said adaptive vector conducive to the least distortion is less than said frame length.

25. A speech encoding apparatus for dividing an input speech into spectrum envelope information and excitation signal information and for encoding said excitation signal information by the frame, said speech encoding apparatus comprising:
- target speech generation means for generating from said input speech a target speech vector of a vector length corresponding to a delay parameter;
  
  an adaptive codebook for generating from previously generated excitation signals an adaptive vector of said vector length corresponding to said delay parameter;
  
  adaptive code search means for evaluating the distortion of a synthesis vector obtained from said adaptive vector with respect to said target speech vector so as to search for an adaptive vector conducive to the least distortion; and
  
  frame excitation generation means for generating an excitation signal of a frame length from said adaptive vector conducive to the least distortion, wherein said target speech generation means divides an input speech in a frame into portions each having said vector length corresponding to said delay parameter, and computes a weighted mean of the input speech portions each having said vector length so as to generate said target speech vector.
- View Dependent Claims (26, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43)
- - 26. A speech encoding apparatus according to claim 25, further comprising:
    - second target speech generation means for generating a second target speech vector from said target speech vector and said adaptive vector conducive to the least distortion;
      
      a random codebook for generating a random vector of said vector length corresponding to said delay parameter;
      
      random code search means for evaluating the distortion of a second synthesis vector obtained from said random vector with respect to said second target speech vector so as to search for the random vector conducive to the least distortion; and
      
      second frame excitation generation means for generating a second excitation signal of the frame length from said random vector conducive to the least distortion.
  - 29. A speech encoding apparatus according to claim 25, wherein said vector length corresponding to said delay parameter is a rational number.
  - 30. A speech encoding apparatus according to claim 25, wherein said target speech generation means divides an input speech having the length of an integer multiple of said vector length corresponding to said delay parameter, into portions each having said vector length, and computes a weighted mean of the input speech portions so as to generate said target speech vector.
  - 31. A speech encoding apparatus according to claim 30, wherein said length of the integer multiple of said vector length corresponding to said delay parameter is equal to or greater than said frame length.
  - 32. A speech encoding apparatus according to claim 25, wherein said target speech generation means computes a weighted mean of said input speech by said vector length in accordance with the characteristic quantity of said input speech portions each having said vector length corresponding to said delay parameter, thereby determining the weight for generating said target speech vector.
  - 33. A speech encoding apparatus according to claim 32, wherein said characteristic quantity of said input speech portions each having said vector length corresponding to said delay parameter includes at least power information about said input speech.
  - 34. A speech encoding apparatus according to claim 32, wherein said characteristic quantity of said input speech portions each having said vector length corresponding to said delay parameter includes at least correlative information about said input speech.
  - 35. A speech encoding apparatus according to claim 25, wherein said target speech generation means computes a weighted mean of said input speech by said vector length in accordance with the temporal relationship of said input speech portions each having said vector length corresponding to said delay parameter, thereby determining the weight for generating said target speech vector.
  - 36. A speech encoding apparatus according to claim 25, wherein said target speech generation means fine-adjusts the temporal relationship of said input speech by said vector length when computing a weighted mean of said input speech portions each having said vector length corresponding to said delay parameter.
  - 37. A speech encoding apparatus according to claim 25, wherein said frame excitation generation means repeats at intervals of said vector length the excitation vector of said vector length corresponding to said delay parameter in order to acquire a periodical excitation vector, thereby generating said excitation signal of said frame length.
  - 38. A speech encoding apparatus according to claim 25, wherein said frame excitation generation means interpolates between frames the excitation vector of said vector length corresponding to said delay parameter, thereby generating said excitation signal.
  - 39. A speech encoding apparatus according to claim 25, wherein said adaptive code search means includes a synthesis filter and uses an impulse response from said synthesis filter to compute repeatedly the distortion of said synthesis vector obtained from said adaptive vector with respect to said target speech vector.
  - 40. A speech encoding apparatus according to claim 29, further comprising input speech up-sampling means for up-sampling said input speech, wherein said target speech generation means generates said target speech vector from the up-sampled input speech.
  - 41. A speech encoding apparatus according to claim 29, further comprising excitation signal up-sampling means for up-sampling previously generated excitation signals, wherein said adaptive codebook generates said adaptive vector from the up-sampled previously generated excitation signals.
  - 42. A speech encoding apparatus according to claim 40, wherein said input speech up-sampling means changes the up-sampling rate of the up-sampling operation in accordance with said delay parameter.
  - 43. A speech encoding apparatus according to claim 40, wherein said input speech up-sampling means changes the up-sampling rate of the up-sampling operation on either the input speech or the excitation signal only within a range based on said vector length corresponding to said delay parameter.
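Claims 37 and 38 describe two ways of turning a delay-length excitation vector into a frame-length signal. The sketch below shows periodic repetition and, as one possible reading of "interpolates between frames", a linear cross-fade between the periodic extensions of the previous and current vectors. The cross-fade rule and the function names are assumptions, not taken from the claims.

```python
from typing import List, Sequence


def repeat_to_frame(vector: Sequence[float], frame_length: int) -> List[float]:
    """Repeat `vector` at intervals of its own length until the frame is full."""
    if not vector:
        raise ValueError("vector must not be empty")
    return [vector[i % len(vector)] for i in range(frame_length)]


def interpolate_frames(
    prev_vector: Sequence[float], curr_vector: Sequence[float], frame_length: int
) -> List[float]:
    """Assumed interpolation: fade linearly from the previous frame's periodic
    excitation to the current one across the frame."""
    prev_rep = repeat_to_frame(prev_vector, frame_length)
    curr_rep = repeat_to_frame(curr_vector, frame_length)
    out = []
    for i in range(frame_length):
        alpha = (i + 1) / frame_length  # reaches 1.0 on the last sample
        out.append((1.0 - alpha) * prev_rep[i] + alpha * curr_rep[i])
    return out
```

The two vectors may have different lengths, since the delay parameter can change from frame to frame; each is extended with its own period before the cross-fade.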

27. A speech encoding apparatus for dividing an input speech into spectrum envelope information and excitation signal information and for encoding said excitation signal information by the frame, said speech encoding apparatus comprising:
- target speech generation means for generating from said input speech a target speech vector of a vector length corresponding to a delay parameter;
  
  a random codebook for generating a random vector of said vector length corresponding to said delay parameter;
  
  random code search means for evaluating the distortion of a synthesis vector obtained from said random vector with respect to said target speech vector so as to search for the random vector conducive to the least distortion; and
  
  frame excitation generation means for generating an excitation signal of a frame length from said random vector conducive to the least distortion, wherein said target speech generation means divides an input speech in a frame into portions each having said vector length corresponding to said delay parameter, and computes a weighted mean of the input speech portions each having said vector length so as to generate said target speech vector.
- View Dependent Claims (28)
- - 28. A speech encoding apparatus according to claim 27, wherein said delay parameter is determined in accordance with the pitch period of said input speech.

44. A speech encoding and decoding apparatus for dividing an input speech into spectrum envelope information and excitation signal information, encoding said excitation signal information by the frame, and decoding the encoded excitation signal information so as to generate an output speech, the encoding side of said speech encoding and decoding apparatus comprising:
- target speech generation means for generating from said input speech a target speech vector of a vector length corresponding to a delay parameter;
  
  an adaptive codebook for generating from previously generated excitation signals an adaptive vector of said vector length corresponding to said delay parameter;
  
  adaptive code search means for evaluating the distortion of a synthesis vector obtained from said adaptive vector with respect to said target speech vector so as to search for an adaptive vector conducive to the least distortion; and
  
  frame excitation generation means for generating an excitation signal of a frame length from said adaptive vector conducive to the least distortion;
  
  wherein said target speech generation means divides an input speech in a frame into portions each having said vector length corresponding to said delay parameter, and computes a weighted mean of the input speech portions each having said vector length so as to generate said target speech vector;
  
  the decoding side of said speech encoding and decoding apparatus comprising:
  
  an adaptive codebook for generating said adaptive vector of said vector length corresponding to said delay parameter; and
  
  frame excitation generation means for generating said excitation signal of said frame length from said adaptive vector.
- View Dependent Claims (45)
- - 45. A speech encoding and decoding apparatus according to claim 44, wherein said encoding side further comprises:
    - second target speech generation means for generating a second target speech vector from said target speech vector and said adaptive vector;
      
      a random codebook for generating a random vector of said vector length corresponding to said delay parameter;
      
      random code search means for evaluating the distortion of a second synthesis vector obtained from said random vector with respect to said second target speech vector so as to search for the random vector conducive to the least distortion; and
      
      second frame excitation generation means for generating a second excitation signal of the frame length from said random vector conducive to the least distortion; and
      
      wherein said decoding side further comprises:
      
      a random codebook for generating said random vector of said vector length corresponding to said delay parameter; and
      
      second frame excitation generation means for generating said second excitation signal of said frame length from said random vector.
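The decoding side of claims 44 and 45 can be sketched as below. The sketch assumes the decoder receives a delay, two gains and a random codebook index, extends each delay-length vector to the frame length by periodic repetition, and sums the two excitation signals. The summation, the transmitted parameters and all names are assumptions borrowed from conventional CELP-style decoders, not details stated in the claims.

```python
from typing import List, Sequence


def repeat_to_frame(vector: Sequence[float], frame_length: int) -> List[float]:
    """Repeat `vector` at intervals of its own length until the frame is full."""
    if not vector:
        raise ValueError("vector must not be empty")
    return [vector[i % len(vector)] for i in range(frame_length)]


def decode_frame(
    past_excitation: Sequence[float],
    delay: int,
    adaptive_gain: float,
    random_codebook: Sequence[Sequence[float]],
    random_index: int,
    random_gain: float,
    frame_length: int,
) -> List[float]:
    """Rebuild one frame of excitation from the decoded parameters."""
    if not 0 < delay <= len(past_excitation):
        raise ValueError("delay must fit inside the past excitation")
    # Adaptive vector: the last `delay` samples of the previous excitation.
    adaptive = [adaptive_gain * x for x in past_excitation[-delay:]]
    # Random vector: assumed to be stored with the same length as the delay.
    random_vec = [random_gain * x for x in random_codebook[random_index]]
    if len(random_vec) != delay:
        raise ValueError("random vector length must equal the delay")
    first = repeat_to_frame(adaptive, frame_length)
    second = repeat_to_frame(random_vec, frame_length)
    # Assumption: the final excitation is the sum of the two signals.
    return [a + b for a, b in zip(first, second)]
```

In a full decoder the returned frame would also be appended to `past_excitation` so that the adaptive codebook stays in step with the encoder.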

46. A speech encoding and decoding apparatus for dividing an input speech into spectrum envelope information and excitation signal information, encoding said excitation signal information by the frame, and decoding the encoded excitation signal information so as to generate an output speech, the encoding side of said speech encoding and decoding apparatus comprising:
- target speech generation means for generating from said input speech a target speech vector of a vector length corresponding to a delay parameter;
  
  a random codebook for generating a random vector of said vector length corresponding to said delay parameter;
  
  random code search means for evaluating the distortion of a synthesis vector obtained from said random vector with respect to said target speech vector so as to search for a random vector conducive to the least distortion; and
  
  frame excitation generation means for generating an excitation signal of a frame length from said random vector conducive to the least distortion;
  
  wherein said target speech generation means divides an input speech in a frame into portions each having said vector length corresponding to said delay parameter, and computes a weighted mean of the input speech portions each having said vector length so as to generate said target speech vector;
  
  the decoding side of said speech encoding and decoding apparatus comprising:
  
  a random codebook for generating said random vector of said vector length corresponding to said delay parameter; and
  
  frame excitation generation means for generating said excitation signal of said frame length from said random vector.

47. A speech encoding apparatus for dividing an input speech into spectrum envelope information and excitation signal information and for encoding said excitation signal information by frame, said speech encoding apparatus comprising:
- an adaptive codebook for generating, from previously generated excitation signals of a frame length, an adaptive vector of a vector length corresponding to a delay parameter; and
  
  adaptive code search means for evaluating the distortion of a synthesis vector from said adaptive vector to determine an adaptive vector conducive to the least distortion, of a vector length corresponding to a delay parameter conducive to the least distortion, wherein said target speech generation means divides an input speech in a frame into portions each having said vector length corresponding to said delay parameter, and computes a weighted mean of the input speech portions each having said vector length so as to generate said target speech vector.

Current Assignee: Mitsubishi Denki Kabushiki Kaisha (Mitsubishi Electric Corporation)
Original Assignee: Mitsubishi Denki Kabushiki Kaisha (Mitsubishi Electric Corporation)
Inventors: Takahashi, Shinya; Yamaura, Tadashi; Tasaki, Hirohisa
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Azad, Abul K.
Application Number: US08/777,874
Time in Patent Office: 1,204 Days
Field of Search: 704/214, 704/219, 704/211, 704/220, 704/221, 704/222, 704/223, 704/206, 704/207, 704/217, 704/224, 704/225, 704/230
US Class Current: 704/222
CPC Class Codes:
G10L 19/08 Determination or coding of ...
G10L 2019/0011 Long term prediction filter...
