Speech synthesizer employing post-processing for enhancing the quality of the synthesized speech

US 5,946,651 A
Filed: 08/18/1998
Issued: 08/31/1999
Est. Priority Date: 06/16/1995
Status: Expired due to Term



Abstract

A post-processor 317 and a method for enhancing synthesised speech are disclosed. The post-processor 317 operates on a signal ex(n) derived from an excitation generator 211, which typically comprises a fixed code book 203 and an adaptive code book 204; ex(n) is formed by adding the scaled outputs of the fixed code book 203 and the adaptive code book 204. The post-processor adds to ex(n) a scaled signal pv(n) derived from the adaptive code book 204. The gain or scale factor p is determined by the speech coefficients input to the excitation generator 211. The combined signal ex(n)+pv(n) is normalised by unit 316 and input to an LPC or speech synthesis filter 208, whose output is passed to an audio processing unit 209.
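
The signal flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the constants, the choice f(b) = b for the gain rule, and energy-matching as the normalisation step are all hypothetical and are not taken from the patent, whose gain equations are not reproduced on this page.

```python
import math


def enhancement_gain(b, a_enh=0.25, th_low=0.5, th_upper=1.0):
    """Hypothetical stand-in for a thresholded rule p = a_enh * f(b).

    ASSUMPTION: f(b) = b, clamped at th_upper; all constants are illustrative.
    """
    if b < th_low:
        return 0.0  # low adaptive code book gain: no enhancement applied
    return a_enh * min(b, th_upper)


def post_process(c, v, g, b):
    """Return the normalised enhanced excitation for one frame.

    c, v: fixed and adaptive code book outputs (equal-length sequences)
    g, b: fixed and adaptive code book gains
    """
    ex = [g * ci + b * vi for ci, vi in zip(c, v)]    # ex(n)
    p = enhancement_gain(b)
    enhanced = [e + p * vi for e, vi in zip(ex, v)]   # ex(n) + p*v(n)
    # ASSUMPTION: normalisation restores the energy of ex(n).
    e_in = sum(e * e for e in ex)
    e_out = sum(e * e for e in enhanced)
    scale = math.sqrt(e_in / e_out) if e_out > 0.0 else 1.0
    return [scale * e for e in enhanced]
```

In a full decoder the returned frame would then drive the LPC synthesis filter; that stage is omitted here to keep the sketch short.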


46 Claims

1. A Linear Predictive Coding (LPC) synthesiser for speech synthesis, comprising:
- an excitation source; and
  
  a LPC decoder comprising post-processing means coupled to an output of said excitation source for operating on a first signal including speech periodicity information derived from said excitation source, wherein the post-processing means modifies the speech periodicity information content of the first signal in accordance with a second signal derivable from said excitation source in order to produce an enhanced synthesised speech signal.
  - 2. A synthesiser according to claim 1, wherein the post-processing means comprises gain control means for scaling the second signal in accordance with a first scaling factor (p) derivable from pitch information associated with the first signal.
  - 3. A synthesiser according to claim 2, wherein the excitation source comprises a fixed code book and an adaptive code book, the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books.
  - 4. A synthesiser according to claim 3, wherein the first scaling factor (p) is derivable from an adaptive code book gain factor (b).
  - 5. A synthesiser according to claim 4, wherein the first scaling factor (p) is derivable in accordance with the following relationship, ##EQU11## where TH represents threshold values, b is the adaptive code book gain factor, p is the first post-processing means scale factor, a_enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b.
  - 6. A synthesiser according to claim 4, wherein the scaling factor (p) is derivable in accordance with ##EQU12## where a_enh is a constant that controls the strength of the enhancement operation, b is the adaptive code book gain factor, TH are threshold values and p is the first post-processing means scale factor.
  - 7. A synthesiser according to claim 3, wherein the second signal originates from the adaptive code book.
  - 8. A synthesiser according to claim 7, wherein the second signal is substantially the same as the second partial excitation signal.
  - 9. A synthesiser according to claim 7, wherein the first signal is a first excitation signal suitable for inputting to a speech synthesis filter, and the second signal is a second excitation signal suitable for inputting to a speech synthesis filter.
  - 10. A synthesiser according to claim 3, wherein the second signal originates from the fixed code book.
  - 11. A synthesiser according to claim 10, wherein the second signal is substantially the same as the first partial excitation signal.
  - 12. A synthesiser according to claim 10, wherein the gain control means scales the second signal in accordance with a second scaling factor (p') where, ##EQU13## and where g is a fixed code book scaling factor, b is an adaptive code book gain factor and p is the first scaling factor.
  - 13. A synthesiser according to claim 12, wherein the first signal is a first synthesised speech signal output from a first speech synthesis filter, the second signal is the output from a second speech synthesis filter, and the gain control means operates on signals input to the second speech synthesis filter.
  - 14. A synthesiser according to claim 10, wherein the first signal is a first synthesised speech signal output from a first speech synthesis filter, the second signal is the output from a second speech synthesis filter, and the gain control means operates on signals input to the second speech synthesis filter.
  - 15. A synthesiser according to claim 2, wherein the excitation source comprises a fixed code book and an adaptive code book, the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books, the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book, the first signal being modified by combining the second signal with the first signal, and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship, ##EQU14## where TH represents threshold values, b is the adaptive code book gain factor, p is the first post-processing means scale factor, a_enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b.
  - 16. A synthesiser according to claim 2, wherein the excitation source comprises a fixed code book and an adaptive code book, the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books, the second signal being substantially the same as the first partial excitation signal and originating from the fixed code book, the first signal being modified by combining the second signal with the first signal, and the first scaling factor (p) being derivable from an adaptive code book gain factor (b) in accordance with the following relationship, ##EQU15## where TH represents threshold values, b is the adaptive code book gain factor, p is the first post-processing means scale factor, a_enh is a linear scaler and f(b) is a function of the adaptive code book gain factor b.

17. A method for use with Linear Predictive Coding (LPC) for enhancing synthesised speech, comprising steps of:
- deriving a first signal including speech periodicity information from an excitation source,
  
  deriving a second signal from the excitation source, and
  
  modifying in a LPC decoder the speech periodicity information content of the first signal in accordance with the second signal in order to produce an enhanced synthesised speech signal.
  - 18. A method according to claim 17, further comprising scaling the second signal in accordance with a first scaling factor (p) derived from pitch information associated with the first signal.
  - 19. A method according to claim 18, wherein the excitation source comprises a fixed code book and an adaptive code book, the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books.
  - 20. A method according to claim 19, wherein the first scaling factor (p) is derivable from a gain factor (b) for the pitch information of the first signal.
  - 21. A method according to claim 20, wherein the first scaling factor (p) is derivable in accordance with the following relationships, ##EQU16## where TH represents threshold values, b is the gain factor for the pitch information of the first signal, p is the first scaling factor, a_enh is a linear scaler and f(b) is a function of b.
  - 22. A method according to claim 19, wherein the second signal originates from the adaptive code book.
  - 23. A method according to claim 22, wherein the second signal is substantially the same as the second partial excitation signal.
  - 24. A method according to claim 22, wherein the first signal is a first synthesised speech signal output from a first speech synthesis filter and the second signal is the output of a second speech synthesis filter.
  - 25. A method according to claim 19, wherein the second signal originates from the fixed code book.
  - 26. A method according to claim 25, wherein the second signal is substantially the same as the first partial excitation signal.
  - 27. A method according to claim 25, wherein the second signal is scaled in accordance with a second scaling factor (p') where, ##EQU17## g is a fixed code book scaling factor, b is an adaptive code book scaling factor and p is the first scaling factor.
  - 28. A method according to claim 25, wherein the first signal is a first synthesised speech signal output from a first speech synthesis filter and the second signal is the output of a second speech synthesis filter.
  - 29. A method according to claim 17, wherein the first signal is a first excitation signal suitable for inputting to a first speech synthesis filter, and the second signal is a second excitation signal suitable for inputting to a second speech synthesis filter.

30. A method for use with Linear Predictive Coding (LPC) for enhancing synthesised speech, comprising steps of:
- deriving a first signal including speech periodicity information from an excitation source, comprising a fixed code book and an adaptive code book,
  
  the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books,
  
  deriving a second signal from the excitation source, and
  
  modifying in a LPC decoder the speech periodicity information content of the first signal in accordance with the second signal in order to produce an enhanced synthesised speech signal,
  
  the second signal being substantially the same as the second partial excitation signal and originating from the adaptive code book, the first signal being modified by combining the second signal with the first signal, and a first scaling factor (p) being derivable from an adaptive code book scaling factor (b) in accordance with the following relationship, ##EQU18## where TH represents threshold values, a_enh is a linear scaler and f(b) is a function of b.

31. A method for use with Linear Predictive Coding (LPC) for enhancing synthesised speech, comprising steps of:
- deriving a first signal including speech periodicity information from an excitation source, comprising a fixed code book and an adaptive code book,
  
  the first signal comprising a combination of first and second partial excitation signals respectively originating from the fixed and adaptive code books,
  
  deriving a second signal from the excitation source, and
  
  modifying in a LPC decoder the speech periodicity information content of the first signal in accordance with the second signal in order to produce an enhanced synthesised speech signal,
  
  the second signal being substantially the same as the first partial excitation signal and originating from the fixed code book, the first signal being modified by combining the second signal with the first signal, and a first scaling factor (p) being derivable from an adaptive code book scaling factor (b) in accordance with the following relationship, ##EQU19## where TH represents threshold values, a_enh is a linear scaler and f(b) is a function of b.

32. A Linear Predictive Coding (LPC) synthesiser for speech synthesis, comprising first and second excitation sources for respectively generating first and second excitation signals, and a LPC decoder comprising modifying means for modifying the first excitation signal in accordance with a scaling factor derivable from pitch information associated with the first excitation signal in order to produce an enhanced synthesised speech signal.
  - 33. A synthesiser according to claim 32, wherein the modifying means scales the first excitation signal in accordance with a scaling factor (a) derivable from pitch information associated with the first signal.
  - 34. A synthesiser according to claim 33, wherein the first excitation source is an adaptive code book and the second excitation source is a fixed code book.
  - 35. A synthesiser according to claim 34, wherein the scaling factor (a) is of the form a=b+p, where b is an adaptive code book gain and p is a perceptual enhancement gain factor derivable in accordance with the following relationships;
    - ##EQU20## where TH represents threshold values, a_enh is a linear scaler and f(b) is a function of gain b.
  - 36. A synthesiser according to claim 35, wherein the first and second excitation signals are combined after modification.
  - 37. A synthesiser according to claim 34, wherein the scaling factor (a) is of the form a=b+p, where b is an adaptive code book gain and p is a perceptual enhancement gain factor, and wherein the perceptual enhancement gain factor p is derivable in accordance with;
    - ##EQU21## where a_enh is a constant that controls the strength of the enhancement operation and TH are threshold values.
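
Claims 35 to 37 apply the enhancement by scaling the adaptive code book excitation with a = b + p before the two excitation signals are combined. The sketch below suggests why this is algebraically the same as adding p·v(n) to the combined excitation afterwards, assuming the excitation is the sum g·c(n) + b·v(n) described in the abstract. The function names are hypothetical, and p is passed in directly because the patent's rule for deriving it is not reproduced on this page.

```python
def post_added(c, v, g, b, p):
    """Combine first, then add the scaled adaptive contribution p*v(n)."""
    return [(g * ci + b * vi) + p * vi for ci, vi in zip(c, v)]


def gain_modified(c, v, g, b, p):
    """Scale the adaptive code book output by a = b + p, then combine."""
    a = b + p  # modified adaptive code book gain
    return [g * ci + a * vi for ci, vi in zip(c, v)]
```

Both functions give the same samples up to floating-point rounding, so the choice between them is a matter of where the gain is applied in the decoder rather than of the resulting signal.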

38. A Linear Predictive Coding (LPC) synthesiser for speech synthesis, comprising first and second excitation sources for respectively generating first and second excitation signals, and a LPC decoder comprising modifying means for modifying the second excitation signal in accordance with a scaling factor derivable from pitch information associated with the first excitation signal in order to produce an enhanced synthesised speech signal.
  - 39. A synthesiser according to claim 38, wherein the modifying means scales the second excitation signal in accordance with a scaling factor (a') derivable from pitch information associated with the first signal.
  - 40. A synthesiser according to claim 39, wherein the first excitation source is an adaptive code book and the second excitation source is a fixed code book.
  - 41. A synthesiser according to claim 40, wherein the scaling factor (a') satisfies the following relationship;
    - ##EQU22## where g is a fixed code book gain factor, b is an adaptive code gain factor and p is a perceptual enhancement gain factor derivable in accordance with;
      
      ##EQU23## where TH represents threshold values, a_enh is a linear scaler and f(b) is a function of gain b.

42. A method for use with Linear Predictive Coding (LPC) for speech synthesis, comprising steps of:
- generating first and second excitation signals,
  
  modifying in a LPC decoder the first excitation signal in accordance with a gain factor associated therewith, and
  
  further modifying in the LPC decoder the first excitation signal in accordance with a scaling factor derivable from pitch information associated with the first excitation signal in order to produce an enhanced synthesised speech signal.

43. A method for use with Linear Predictive Coding (LPC) for speech synthesis, comprising steps of:
- generating first and second excitation signals,
  
  modifying in a LPC decoder the first excitation signal in accordance with a gain factor associated therewith, and
  
  modifying in the LPC decoder the second excitation signal in accordance with a scaling factor derivable from pitch information associated with the first excitation signal in order to produce an enhanced synthesised speech signal.

44. A time domain speech synthesiser, comprising:
- an excitation source providing first and second partial excitation signals having a speech periodicity information content; and
  
  a speech quality enhancement post-processor coupled to said excitation source for operating on one of said first and second partial excitation signals, said post-processor modifying the speech periodicity information content of the operated on partial excitation signal in accordance with a signal derivable from at least one of said first and second partial excitation signals.

45. A synthesiser for speech synthesis, comprising:
- an input unit for inputting a signal and for extracting coded information from said signal, the coded information comprising fixed codebook and adaptive codebook parameters, including an adaptive codebook gain factor;
  
  an excitation source comprising a fixed codebook and an adaptive codebook and having inputs coupled to outputs of said input unit for receiving extracted coded information therefrom, said excitation source being responsive to the received extracted coded information for outputting a first partial excitation signal from said fixed codebook and a second partial excitation signal from said adaptive codebook, said excitation source further comprising means for combining said first and second partial excitation signals into a composite excitation signal; and
  
  a perceptual enhancement post-processor coupled to said excitation source for operating on said composite excitation signal by combining said composite excitation signal with a scaled version of said second partial excitation signal, wherein an amount of scaling of said second partial excitation signal is controlled by a scaling factor having a value that is function of a value of said adaptive codebook gain factor.
  - 46. A synthesiser as in claim 45, wherein said input unit inputs said signal from a radio channel.


Current Assignee
Nokia Technologies Oy (Nokia Corporation)
Original Assignee
Nokia Mobile Phones Incorporated (Nokia Corporation)
Inventors
Jarvinen, Kari; Honkanen, Tero
Primary Examiner(s)
Hudspeth, David R.
Assistant Examiner(s)
Chawan, Vijay B.

Application Number

US09/135,936
Time in Patent Office

378 Days
Field of Search

704/200, 704/207, 704/223, 704/219, 704/221, 704/222, 704/220, 704/208
US Class Current

704/223
CPC Class Codes

G10L 19/04 using predictive techniques

G10L 19/26 Pre-filtering or post-filte...
