Harmonic speech coding arrangement where a set of parameters for a continuous magnitude spectrum is determined by a speech analyzer and the parameters are used by a synthesizer to determine a spectrum which is used to determine sinusoids for synthesis

US 5,179,626 A
Filed: 04/08/1988
Issued: 01/12/1993
Est. Priority Date: 04/08/1988
Status: Expired due to Term



Abstract

A harmonic coding arrangement where the magnitude spectrum of the input speech is modeled at the analyzer by a relatively small set of parameters and, significantly, as a continuous rather than only a line magnitude spectrum. The synthesizer, rather than the analyzer, determines the magnitude, frequency, and phase of a large number of sinusoids which are summed to generate synthetic speech. Rather than receiving information explicitly defining the sinusoids from the analyzer, the synthesizer receives the small set of parameters and uses those parameters to determine a spectrum, which, in turn, is used by the synthesizer to determine the sinusoids for synthesis.

149 Citations

38 Claims

1. In a harmonic speech coding arrangement, a method of processing speech signals, said speech signals comprising frames of speech, said method comprising determining from a present one of said frames a magnitude spectrum having a plurality of spectrum points, the frequency of each of said spectrum points being independent of said speech signals, calculating a set of parameters for a continuous magnitude spectrum that models said determined magnitude spectrum at each of said spectrum points, the number of parameters of said set being less than the number of said spectrum points, said continuous magnitude spectrum comprising a sum of a plurality of functions, one of said functions being a magnitude spectrum for a previous one of said frames, encoding said set of parameters as a set of parameter signals representing said speech signals, communicating said set of parameter signals representing said speech signals for use in speech synthesis, and synthesizing speech based on said communicated set of parameter signals.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
  - 2. A method in accordance with claim 1 wherein at least one of said functions is a magnitude spectrum of a periodic pulse train.
  - 3. A method in accordance with claim 1 wherein one of said functions is a magnitude spectrum of a first periodic pulse train and another one of said functions is a magnitude spectrum of a second periodic pulse train.
  - 4. A method in accordance with claim 1 wherein one of said functions is a vector chosen from a codebook.
  - 5. A method in accordance with claim 1 further comprising determining a phase spectrum from a present one of said frames, calculating a second set of parameters modeling said determined phase spectrum by prediction of a phase spectrum for said present frame from a phase spectrum for a previous one of said frames, encoding said second set of parameters as a second set of parameter signals representing said speech signals, and communicating said second set of parameter signals representing said speech signals for use in speech synthesis.
  - 6. A method in accordance with claim 1 wherein said determining comprises determining one magnitude spectrum from a present one of said frames, and determining another magnitude spectrum from a previous one of said frames, and wherein said method further comprises determining one plurality of sinusoids from said one magnitude spectrum, determining another plurality of sinusoids from said another magnitude spectrum, matching ones of said one plurality of sinusoids with ones of said another plurality of sinusoids based on sinusoidal frequency, determining a phase spectrum from said present frame, calculating a second set of parameters modeling said determined phase spectrum by prediction of a phase spectrum for said present frame from a phase spectrum for a previous one of said frames based on said matched ones of said one and said another pluralities of sinusoids, encoding said second set of parameters as a second set of parameter signals representing said speech signals, and communicating said second set of parameter signals representing said speech signals for use in speech synthesis.
  - 7. A method in accordance with claim 1 wherein said determining comprises determining one magnitude spectrum from a present one of said frames, and determining another magnitude spectrum from a previous one of said frames, and wherein said method further comprises determining one plurality of sinusoids from said one magnitude spectrum, determining another plurality of sinusoids from said another magnitude spectrum, matching ones of said one plurality of sinusoids with ones of said another plurality of sinusoids based on sinusoidal frequency and amplitude, determining a phase spectrum from said present frame, calculating a second set of parameters modeling said determined phase spectrum by prediction of a phase spectrum for said present frame from a phase spectrum for a previous one of said frames based on said matched ones of said one and said another pluralities of sinusoids, encoding said second set of parameters as a second set of parameter signals representing said speech signals, and communicating said second set of parameter signals representing said speech signals for use in speech synthesis.
  - 8. A method in accordance with claim 1 wherein said determining comprises determining one magnitude spectrum from a present one of said frames, and determining another magnitude spectrum from a previous one of said frames, and wherein said method further comprises determining one plurality of sinusoids from said one magnitude spectrum, determining another plurality of sinusoids from said another magnitude spectrum, determining a pitch of said present frame, determining a pitch of said frame other than said present frame, determining a ratio of said pitch of said present frame and said pitch of said frame other than said present frame, matching ones of said one plurality of sinusoids with ones of said another plurality of sinusoids based on sinusoidal frequency and said determined ratio, determining a phase spectrum from said present frame, calculating a second set of parameters modeling said determined phase spectrum by prediction of a phase spectrum for said present frame from a phase spectrum for a previous one of said frames based on said matched ones of said one and said another pluralities of sinusoids, encoding said second set of parameters as a second set of parameter signals representing said speech signals, and communicating said second set of parameter signals representing said speech signals for use in speech synthesis.
  - 9. A method in accordance with claim 1 wherein said determining comprises determining one magnitude spectrum from a present one of said frames, and determining another magnitude spectrum from a previous one of said frames other than said present frame, and wherein said method further comprises determining one plurality of sinusoids from said one magnitude spectrum, determining another plurality of sinusoids from said another magnitude spectrum, determining a pitch of said present frame, determining a pitch of said frame other than said present frame, determining a ratio of said pitch of said present frame and said pitch of said frame other than said present frame, matching ones of said one plurality of sinusoids with ones of said another plurality of sinusoids based on sinusoidal frequency and amplitude and said determined ratio, determining a phase spectrum from said present frame, calculating a second set of parameters modeling said determined phase spectrum by prediction of a phase spectrum for said present frame from a phase spectrum for a previous one of said frames based on said matched ones of said one and said another pluralities of sinusoids, encoding said second set of parameters as a second set of parameter signals representing said speech signals, and communicating said second set of parameter signals representing said speech signals for use in speech synthesis.
  - 10. A method in accordance with claim 1 said method further comprising determining a phase spectrum from a present one of said frames, obtaining a first phase estimate by parametric analysis of said present frame, obtaining a second phase estimate by prediction of a phase spectrum for said present frame from a phase spectrum for a previous one of said frames, selecting one of said first and second phase estimates, determining a second set of parameters, said second parameter set being associated with said selected phase estimate and said second parameter set modeling said determined phase spectrum, encoding said second set of parameters as a second set of parameter signals representing said speech signals, and communicating said second set of parameter signals representing said speech signals for use in speech synthesis.
  - 11. A method in accordance with claim 1 said method further comprising determining a plurality of sinusoids from said determined magnitude spectrum, determining a phase spectrum from a present one of said frames, obtaining a first phase estimate by parametric analysis of said present frame, obtaining a second phase estimate by prediction of a phase spectrum for said present frame from a phase spectrum for a previous one of said frames, selecting one of said first and second phase estimates in accordance with an error criterion at the frequencies of said determined sinusoids, determining a second set of parameters, said second parameter set being associated with said selected phase estimate and said second parameter set modeling said determined phase spectrum, encoding said second set of parameters as a second set of parameter signals representing said speech signals, and communicating said second set of parameter signals representing said speech signals for use in speech synthesis.
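Claims 10 and 11 recite selecting between a parametric phase estimate and a predicted phase estimate according to an error criterion at the sinusoid frequencies. The following is a minimal sketch of one way such a selection could look. It assumes a summed squared wrapped phase error as the criterion, which the claims do not specify; the function names `wrapped_error` and `select_phase_estimate` are hypothetical.

```python
import math


def wrapped_error(a, b):
    """Phase difference a - b wrapped into [-pi, pi)."""
    return (a - b + math.pi) % (2 * math.pi) - math.pi


def select_phase_estimate(measured, parametric, predicted):
    """Pick the estimate with the smaller squared phase error.

    All three arguments hold phases (radians) sampled at the frequencies of
    the determined sinusoids. The squared-error criterion is an assumption
    made for this sketch, not something the claims prescribe.
    """
    err_param = sum(wrapped_error(m, p) ** 2 for m, p in zip(measured, parametric))
    err_pred = sum(wrapped_error(m, p) ** 2 for m, p in zip(measured, predicted))
    if err_param <= err_pred:
        return "parametric", parametric
    return "predicted", predicted
```

The returned label plays the role of the one-parameter switch described in claim 24, where the synthesizer uses either a parametric model or a prediction model depending on the value of one received parameter.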

12. In a harmonic speech coding arrangement, a method of processing speech signals comprising determining from said speech signals a magnitude spectrum having a plurality of spectrum points, the frequency of each of said spectrum points being independent of said speech signals, calculating a set of parameters for a continuous magnitude spectrum that models said determined magnitude spectrum at each of said spectrum points, the number of parameters of said set being less than the number of said spectrum points, encoding said set of parameters as a set of parameter signals representing said speech signals, communicating said set of parameter signals representing said speech signals for use in speech synthesis, and synthesizing speech based on said communicated set of parameter signals;
- wherein said calculating comprises calculating said parameter set to fit said continuous magnitude spectrum to said determined magnitude spectrum in accordance with a minimum mean squared error criterion.
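The minimum mean squared error fit in claim 12, combined with the "sum of a plurality of functions" of claim 1, can be illustrated with an ordinary least-squares fit. This sketch assumes the component functions are already fixed and sampled at the spectrum points, so that only one linear weight per function is fitted; the patent's own parameterization may differ. The name `fit_weights` is hypothetical.

```python
def fit_weights(basis, target):
    """Least-squares weights w minimizing sum_k (target[k] - sum_i w[i]*basis[i][k])**2.

    basis  : list of component functions, each sampled at the spectrum points
             (for example, the previous frame's magnitude spectrum).
    target : the determined magnitude spectrum at the same points.
    Assumes the component functions are linearly independent; otherwise the
    elimination below divides by zero.
    """
    n = len(basis)
    # Normal equations: gram * w = rhs
    gram = [[sum(a * b for a, b in zip(basis[i], basis[j])) for j in range(n)]
            for i in range(n)]
    rhs = [sum(a * t for a, t in zip(basis[i], target)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(gram[r][col]))
        gram[col], gram[pivot] = gram[pivot], gram[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, n):
            factor = gram[row][col] / gram[col][col]
            for k in range(col, n):
                gram[row][k] -= factor * gram[col][k]
            rhs[row] -= factor * rhs[col]
    # Back substitution
    weights = [0.0] * n
    for row in range(n - 1, -1, -1):
        acc = rhs[row] - sum(gram[row][k] * weights[k] for k in range(row + 1, n))
        weights[row] = acc / gram[row][row]
    return weights
```

With many spectrum points and only a few component functions, the weight vector is much shorter than the spectrum it models, which is the property the claim states as "the number of parameters of said set being less than the number of said spectrum points".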

13. In a harmonic speech coding arrangement, a method of processing speech signals comprising determining from said speech signals a magnitude spectrum having a plurality of spectrum points, the frequency of each of said spectrum points being independent of said speech signals, calculating a set of parameters for a continuous magnitude spectrum that models said determined magnitude spectrum at each of said spectrum points, the number of parameters of said set being less than the number of said spectrum points, encoding said set of parameters as a set of parameter signals representing said speech signals, communicating said set of parameter signals representing said speech signals for use in speech synthesis, determining a phase spectrum from said speech signals, calculating a second set of parameters modeling said determined phase spectrum, encoding said second set of parameters as a second set of parameter signals representing said speech signals, communicating said second set of parameter signals representing said speech signals for use in speech synthesis, and synthesizing speech based on said communicated sets of parameter signals.
- View Dependent Claims (14, 15, 16, 17, 18)
  - 14. A method in accordance with claim 13 wherein said calculating a second set of parameters comprises calculating said second parameter set modeling said determined phase spectrum as a sum of a plurality of functions.
  - 15. A method in accordance with claim 14 wherein one of said functions is a vector chosen from a codebook.
  - 16. A method in accordance with claim 13 wherein said calculating a second set of parameters comprises calculating said second parameter set using pole-zero analysis to model said determined phase spectrum.
  - 17. A method in accordance with claim 13 wherein said calculating a second set of parameters comprises calculating said second parameter set using all pole analysis to model said determined phase spectrum.
  - 18. A method in accordance with claim 13 wherein said calculating a second set of parameters comprises using pole-zero analysis to model said determined phase spectrum, using all pole analysis to model said determined phase spectrum, selecting one of said pole-zero analysis and said all pole analysis, and determining said second parameter set based on said selected analysis.
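Claims 17 and 23 refer to an all pole model of the phase spectrum. As a rough illustration, the sketch below evaluates the phase of a conventional all-pole transfer function H(e^{jω}) = 1 / (1 + Σ a_k e^{-jωk}). This textbook form is an assumption made here; the claims do not spell out the model, and the name `all_pole_phase` is hypothetical.

```python
import cmath


def all_pole_phase(coeffs, omega):
    """Phase in radians of H = 1 / (1 + sum_k coeffs[k-1] * exp(-1j*omega*k)).

    coeffs : denominator coefficients a_1..a_p (assumed textbook all-pole form).
    omega  : normalized angular frequency in radians per sample.
    """
    denom = 1 + sum(a * cmath.exp(-1j * omega * (k + 1)) for k, a in enumerate(coeffs))
    # The phase of a reciprocal is the negated phase of the denominator.
    return -cmath.phase(denom)
```

A synthesizer along the lines of claim 23 could evaluate such a function at each sinusoid frequency to obtain the phase for that sinusoid, using coefficients taken from the received parameter set.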

19. In a harmonic speech coding arrangement, a method of processing speech signals comprising determining from said speech signals a magnitude spectrum having a plurality of spectrum points, the frequency of each of said spectrum points being independent of said speech signals, calculating a set of parameters for a continuous magnitude spectrum that models said determined magnitude spectrum at each of said spectrum points, the number of parameters of said set being less than the number of said spectrum points, encoding said set of parameters as a set of parameter signals representing said speech signals, communicating said set of parameter signals representing said speech signals for use in speech synthesis, determining a plurality of sinusoids from said determined magnitude spectrum, determining a phase spectrum from said speech signals, calculating a second set of parameters modeling said determined phase spectrum at the frequencies of said determined sinusoids, and encoding said second set of parameters as a second set of parameter signals representing said speech signals, communicating said second set of parameter signals representing said speech signals for use in speech synthesis, and synthesizing speech based on said communicated sets of parameter signals.

20. In a harmonic speech coding arrangement, a method of synthesizing speech comprising receiving a set of parameters corresponding to input speech comprising frames of input speech, determining a spectrum from said parameter set, said spectrum having amplitude values for a range of frequencies, said determining a spectrum comprising determining an estimated magnitude spectrum for a present one of said frames as a sum of a plurality of functions, one of said functions being an estimated magnitude spectrum for a previous one of said frames, said method further comprising determining a plurality of sinusoids from said spectrum, the sinusoidal frequency of at least one of said sinusoids being determined based on amplitude values of said spectrum, and synthesizing speech as a sum of said sinusoids.
- View Dependent Claims (21, 22, 23, 24, 25, 26, 27, 28)
  - 21. A method in accordance with claim 20 wherein at least one of said functions is a magnitude spectrum of a periodic pulse train, the frequency of said pulse train being defined by said received parameter set.
  - 22. A method in accordance with claim 20 wherein one of said functions is a magnitude spectrum of a first periodic pulse train and another one of said functions is a magnitude spectrum of a second periodic pulse train, the frequencies of said first and second pulse trains being defined by said received parameter set.
  - 23. A method in accordance with claim 20 wherein said determining a spectrum comprises determining an estimated phase spectrum using an all pole model and said received parameter set.
  - 24. A method in accordance with claim 20 wherein said receiving step comprises receiving said parameter set for said present frame of speech, and wherein said determining a spectrum comprises in response to a first value of one parameter of said parameter set, determining an estimated phase spectrum for said present frame using a parametric model and said parameter set, and in response to a second value of said one parameter, determining an estimated phase spectrum for said present frame using a prediction model based on a previous frame of speech.
  - 25. A method in accordance with claim 20 wherein said receiving comprises receiving one set of parameters for one of said frames of input speech and another set of parameters for another of said frames of input speech after said one frame, wherein said determining a spectrum comprises determining one spectrum from said one parameter set and another spectrum from said another parameter set, wherein said determining a plurality of sinusoids comprises determining one plurality of sinusoids from said one spectrum and another plurality of sinusoids from said another spectrum, wherein said method further comprises matching ones of said one plurality of sinusoids with ones of said another plurality of sinusoids based on sinusoidal frequency, and wherein said synthesizing comprises interpolating between matched ones of said one and said another pluralities of sinusoids.
  - 26. A method in accordance with claim 20 wherein said receiving comprises receiving one set of parameters for one of said frames of input speech and another set of parameters for another of said frames of input speech after said one frame, wherein said determining a spectrum comprises determining one spectrum from said one parameter set and another spectrum from said another parameter set, wherein said determining a plurality of sinusoids comprises determining one plurality of sinusoids from said one spectrum and another plurality of sinusoids from said another spectrum, wherein said method further comprises matching ones of said one plurality of sinusoids with ones of said another plurality of sinusoids based on sinusoidal frequency and amplitude, and wherein said synthesizing comprises interpolating between matched ones of said one and said another pluralities of sinusoids.
  - 27. A method in accordance with claim 20 wherein said receiving comprises receiving one set of parameters for one of said frames of input speech and another set of parameters for another of said frames of input speech after said one frame, wherein said determining a spectrum comprises determining one spectrum from said one parameter set and another spectrum from said another parameter set, wherein said determining a plurality of sinusoids comprises determining one plurality of sinusoids from said one spectrum and another plurality of sinusoids from said another spectrum, wherein said method further comprises determining a pitch of said present frame, determining a pitch of said frame other than said present frame, determining a ratio of said pitch of said one frame and said pitch of said another frame, and matching ones of said one plurality of sinusoids with ones of said another plurality of sinusoids based on sinusoidal frequency and said determined ratio, and wherein said synthesizing comprises interpolating between matched ones of said one and said another pluralities of sinusoids.
  - 28. A method in accordance with claim 20 wherein said receiving comprises receiving one set of parameters for one of said frames of input speech and another set of parameters for another of said frames of input speech after said one frame, wherein said determining a spectrum comprises determining one spectrum from said one parameter set and another spectrum from said another parameter set, wherein said determining a plurality of sinusoids comprises determining one plurality of sinusoids from said one spectrum and another plurality of sinusoids from said another spectrum, wherein said method further comprises determining a pitch of said present frame, determining a pitch of said frame other than said present frame, determining a ratio of said pitch of said one frame and said pitch of said another frame, and matching ones of said one plurality of sinusoids with ones of said another plurality of sinusoids based on sinusoidal frequency and amplitude and said determined ratio, and wherein said synthesizing comprises interpolating between matched ones of said one and said another pluralities of sinusoids.
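Claims 25 through 28 describe matching sinusoids of one frame with those of the next based on frequency (optionally amplitude and a pitch ratio) and interpolating between matched sinusoids. The sketch below shows a simple greedy nearest-frequency match. Several things here are assumptions for illustration only: the greedy strategy, the `max_gap` tolerance, the reading of the pitch ratio as current fundamental frequency divided by previous fundamental frequency, and the linear interpolation. Amplitude-based matching is omitted.

```python
def match_by_frequency(prev_freqs, cur_freqs, pitch_ratio=1.0, max_gap=50.0):
    """Pair previous-frame sinusoids with current-frame sinusoids.

    Each previous frequency is scaled by pitch_ratio (assumed here to be
    current pitch frequency / previous pitch frequency) and paired with the
    nearest unused current frequency within max_gap Hz.
    Returns a list of (prev_index, cur_index) pairs.
    """
    pairs = []
    used = set()
    for i, f_prev in enumerate(prev_freqs):
        expected = f_prev * pitch_ratio
        best, best_gap = None, max_gap
        for j, f_cur in enumerate(cur_freqs):
            gap = abs(f_cur - expected)
            if j not in used and gap <= best_gap:
                best, best_gap = j, gap
        if best is not None:
            used.add(best)
            pairs.append((i, best))
    return pairs


def interpolate(start, end, t):
    """Linear interpolation between matched values, with t running from 0 to 1."""
    return start + (end - start) * t
```

For example, with harmonics at 100, 200 and 300 Hz moving to 110, 220 and 330 Hz, a 15 Hz tolerance pairs only the first harmonic when the ratio is 1.0, but pairs all three when the ratio 1.1 is supplied. This suggests why the pitch ratio is useful when the fundamental changes between frames.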

29. In a harmonic speech coding arrangement, a method of synthesizing speech comprising receiving a set of parameters, determining a spectrum having amplitude values for a range of frequencies from said parameter set by estimating a magnitude spectrum as a sum of a plurality of functions, wherein one of said functions is a vector from a codebook, said vector being identified by an index defined by said received parameter set, determining a plurality of sinusoids from said spectrum, the sinusoidal frequency of at least one of said sinusoids being determined based on amplitude values of said spectrum, and synthesizing speech as a sum of said sinusoids.

30. In a harmonic speech coding arrangement, a method of synthesizing speech comprising receiving a set of parameters, determining a spectrum from said parameter set, said spectrum having amplitude values for a range of frequencies, determining a plurality of sinusoids from said spectrum, the sinusoidal frequency of at least one of said sinusoids being determined based on amplitude values of said spectrum, and synthesizing speech as a sum of said sinusoids;
- wherein said determining a spectrum comprises determining an estimated phase spectrum as a sum of a plurality of functions.
- View Dependent Claims (31)
  - 31. A method in accordance with claim 30 wherein one of said functions is a vector from a codebook, said vector being identified by an index defined by said received parameter set.

32. In a harmonic speech coding arrangement, a method of synthesizing speech comprising receiving a set of parameters, determining a spectrum from said parameter set, said spectrum having amplitude values for a range of frequencies, determining a plurality of sinusoids from said spectrum, the sinusoidal frequency of at least one of said sinusoids being determined based on amplitude values of said spectrum, and synthesizing speech as a sum of said sinusoids;
- wherein said determining a spectrum comprises determining an estimated phase spectrum using a pole-zero model and said received parameter set.

33. In a harmonic speech coding arrangement, a method of synthesizing speech comprising receiving a set of parameters, determining a spectrum from said parameter set, said spectrum having amplitude values for a range of frequencies, determining a plurality of sinusoids from said spectrum, the sinusoidal frequency of at least one of said sinusoids being determined based on amplitude values of said spectrum, and synthesizing speech as a sum of said sinusoids;
- wherein said determining a spectrum comprises determining an estimated magnitude spectrum, wherein said determining a plurality of sinusoids comprises finding a peak in said estimated magnitude spectrum, subtracting from said estimated magnitude spectrum a spectral component for a sinusoid with the frequency and amplitude of said peak, and repeating said finding and said subtracting until the estimated magnitude spectrum is below a threshold for all frequencies.
- View Dependent Claims (34)
  - 34. A method in accordance with claim 33 wherein said spectral component comprises a wide magnitude spectrum window.
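The find-peak, subtract, repeat loop of claim 33 can be sketched as follows. The sketch assumes a sampled magnitude spectrum, a non-negative odd-length window shape standing in for the "spectral component" of claims 33 and 34, and a positive threshold; the function name `pick_sinusoids` and its arguments are hypothetical.

```python
def pick_sinusoids(magnitude, window, threshold):
    """Iteratively extract (bin_index, amplitude) pairs from a magnitude spectrum.

    magnitude : estimated magnitude spectrum, one value per frequency bin.
    window    : odd-length, non-negative magnitude spectrum of the analysis
                window, centered on its middle element (an assumption here).
    threshold : positive level below which the residual is considered empty.
    """
    residual = list(magnitude)
    half = len(window) // 2
    # Normalize so the window center is exactly 1.0 and the peak bin is zeroed.
    shape = [w / window[half] for w in window]
    sinusoids = []
    # Each pass zeroes one bin, so len(residual) passes always suffice.
    for _ in range(len(residual)):
        k = max(range(len(residual)), key=lambda i: residual[i])
        if residual[k] < threshold:
            break  # spectrum is below the threshold at all frequencies
        amp = residual[k]
        sinusoids.append((k, amp))
        # Subtract the spectral component of a sinusoid at this peak.
        for j, s in enumerate(shape):
            idx = k + j - half
            if 0 <= idx < len(residual):
                residual[idx] -= amp * s
    return sinusoids
```

Because the frequencies come from the peak locations of the spectrum, this is one way a sinusoidal frequency can be "determined based on amplitude values of said spectrum", as the independent synthesis claims put it.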

35. In a harmonic speech coding arrangement, a method of synthesizing speech comprising receiving a set of parameters, determining a spectrum from said parameter set, said spectrum having amplitude values for a range of frequencies, determining a plurality of sinusoids from said spectrum, the sinusoidal frequency of at least one of said sinusoids being determined based on amplitude values of said spectrum, and synthesizing speech as a sum of said sinusoids;
- wherein said determining a spectrum comprises determining an estimated magnitude spectrum, and determining an estimated phase spectrum, wherein said determining a plurality of sinusoids comprises determining sinusoidal amplitude and frequency for each of said sinusoids based on said estimated magnitude spectrum, and determining sinusoidal phase for each of said sinusoids based on said estimated phase spectrum.
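The final step shared by the synthesis claims, "synthesizing speech as a sum of said sinusoids", can be illustrated with a short sketch in which each sinusoid carries the amplitude, frequency, and phase named in claim 35. The sketch holds the parameters constant over the frame, which is a simplification; the function name `synthesize_frame` and the sample-rate argument are hypothetical.

```python
import math


def synthesize_frame(sinusoids, num_samples, sample_rate):
    """Return num_samples of synthetic speech as a sum of cosines.

    sinusoids   : list of (amplitude, frequency_hz, phase_rad) tuples.
    sample_rate : samples per second (an assumed parameter of this sketch).
    Parameters are held constant over the frame; no inter-frame
    interpolation is applied here.
    """
    return [
        sum(a * math.cos(2 * math.pi * f * n / sample_rate + p)
            for a, f, p in sinusoids)
        for n in range(num_samples)
    ]
```

A call such as `synthesize_frame([(0.5, 100.0, 0.0), (0.25, 200.0, 0.0)], 160, 8000)` would produce 160 samples containing two harmonics of a 100 Hz fundamental.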

36. In a harmonic speech coding arrangement, a method of processing speech, said speech comprising frames of speech, said method comprising determining from said speech a magnitude spectrum having a plurality of spectrum points, the frequency of each of said spectrum points being independent of said speech, said magnitude of spectrum having a plurality of points being determined from a present one of said frames, calculating a set of parameters for a continuous magnitude spectrum that models said determined magnitude spectrum at each of said spectrum points, the number of parameters of said set being less than the number of said spectrum points, said continuous magnitude spectrum comprising a sum of a plurality of functions, one of said functions being a magnitude spectrum for a previous one of said frames, communicating said parameter set, receiving said communicated parameter set, determining a spectrum from said received parameter set, determining a plurality of sinusoids from said spectrum determined from said received parameter set, and synthesizing speech as a sum of said sinusoids.

37. In a harmonic speech coding arrangement, apparatus comprising means responsive to speech signals for determining a magnitude spectrum having a plurality of spectrum points, said speech signals comprising frames of speech, said determining means determining said magnitude spectrum having a plurality of spectrum points from a present one of said frames, means responsive to said determining means for calculating a set of parameters for a continuous magnitude spectrum that models said determined magnitude spectrum at each of said spectrum points, the number of parameters of said set being less than the number of said spectrum points, said continuous magnitude spectrum comprising a sum of a plurality of functions, one of said functions being a magnitude spectrum for a previous one of said frames, means for encoding said set of parameters as a set of parameter signals representing said speech signals, means for communicating said set of parameter signals representing said speech signals for use in speech synthesis, and means for synthesizing speech based on said set of parameter signals communicated by said communicating means.

38. In a harmonic speech coding arrangement, a speech synthesizer comprising means responsive to receipt of a set of parameters corresponding to input speech comprising frames of input speech for determining a spectrum, said spectrum having amplitude values for a range of frequencies, said determining means including means for developing an estimated magnitude spectrum for a present one of said frames as a sum of a plurality of functions, one of said functions being an estimated magnitude spectrum for a previous one of said frames, means for determining a plurality of sinusoids from said spectrum, the sinusoidal frequency of at least one of said sinusoids being determined based on amplitude values of said spectrum, and means for synthesizing speech as a sum of said sinusoids.


Current Assignee
American Telephone & Telegraph Company (AT&T, Inc.), Bell Telephone Laboratories, Inc. (Nokia Corporation)
Original Assignee
AT&T, Inc.
Inventors
Thomson, David L.
Primary Examiner(s)
Fleming, Michael R.
Assistant Examiner(s)
Doerrler, Michelle

Application Number

US07/179,170
Time in Patent Office

1,740 Days
Field of Search

381/29-40, 381/51, 364/513.5
US Class Current

704/200
CPC Class Codes

G10L 19/02 using spectral analysis, e....
