Digital speech sinusoidal vocoder with transmission of only subset of harmonics
Abstract
A speech analyzer and synthesizer system using a sinusoidal encoding and decoding technique for voiced frames and noise excitation or multipulse excitation for unvoiced frames. For voiced frames, the analyzer transmits the pitch, values for a subset of offsets defining the differences between the actual harmonic frequencies and the corresponding multiples of the fundamental frequency, the total frame energy, and linear predictive coding (LPC) coefficients. The synthesizer uses that information to determine the harmonic frequencies from the offset information for a subset of the harmonics and to determine the remaining harmonics from the fundamental frequency. The synthesizer then determines the phase for the fundamental frequency and the harmonic frequencies, and determines the amplitudes of the fundamental and harmonics using the total frame energy and the LPC coefficients. Once the phases and amplitudes have been determined for the fundamental and harmonic frequencies, the synthesizer performs sinusoidal synthesis. In another embodiment, the remaining harmonic frequencies are determined by calculating the theoretical harmonic frequencies for the remaining harmonics and dividing these theoretical frequencies into groups, each containing as many frequencies as there are transmitted offsets. The offsets are then added to the corresponding theoretical harmonics of each group to generate the remaining harmonic frequencies. In a third embodiment, the offset signals are randomly permuted before being added to the groups of theoretical frequencies to generate the remaining harmonic frequencies.
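The three ways of rebuilding the harmonic frequencies described in the abstract can be sketched as follows. This is an illustrative sketch only, not the patented implementation: the function name `harmonic_frequencies`, the `mode` labels, and the assumption that the transmitted offsets belong to the lowest-numbered harmonics (harmonic k having the theoretical frequency k times the fundamental) are hypothetical choices made for the example.

```python
import random


def harmonic_frequencies(f0, offsets, n_harmonics, mode="plain", rng=None):
    """Estimate the frequencies (Hz) of harmonics 1..n_harmonics.

    Assumption (not from the source): the transmitted offsets apply to
    the first len(offsets) harmonics, and harmonic k has the theoretical
    frequency k * f0.

    mode="plain":    remaining harmonics are exact multiples of f0
    mode="grouped":  remaining harmonics are split into groups of
                     len(offsets) and the offsets are reused per group
    mode="permuted": as "grouped", but the offsets are shuffled
                     before being added to each group
    """
    if mode not in ("plain", "grouped", "permuted"):
        raise ValueError("unknown mode: %r" % (mode,))
    n_sent = len(offsets)
    if n_sent > n_harmonics:
        raise ValueError("more offsets than harmonics")

    theoretical = [k * f0 for k in range(1, n_harmonics + 1)]
    # Harmonics with a transmitted offset: theoretical value plus offset.
    freqs = [theoretical[i] + offsets[i] for i in range(n_sent)]
    remaining = theoretical[n_sent:]

    if mode == "plain" or n_sent == 0:
        freqs.extend(remaining)
        return freqs

    rng = rng or random.Random()
    for start in range(0, len(remaining), n_sent):
        group = remaining[start:start + n_sent]
        group_offsets = list(offsets)
        if mode == "permuted":
            rng.shuffle(group_offsets)
        # zip() truncates, so a short final group uses only as many
        # offsets as it has harmonics.
        freqs.extend(t + d for t, d in zip(group, group_offsets))
    return freqs


print(harmonic_frequencies(100.0, [1.0, -2.0], 6))
# [101.0, 198.0, 300.0, 400.0, 500.0, 600.0]
print(harmonic_frequencies(100.0, [1.0, -2.0], 6, mode="grouped"))
# [101.0, 198.0, 301.0, 398.0, 501.0, 598.0]
```

In the "permuted" mode the first `len(offsets)` frequencies are unchanged; only the order in which offsets are reused within each later group varies from call to call unless a seeded `random.Random` is passed in.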
24 Claims
1. A processing system for synthesizing voice from encoded information representing speech frames each having a predetermined number of evenly spaced samples of instantaneous amplitude of speech with said encoded information for each frame representing frame energy and a set of speech parameters and a fundamental frequency signal of the speech and offset signals representing the difference between the theoretical harmonic frequencies as derived from a fundamental frequency signal and a subset of the actual harmonic frequencies, said system comprising:
means responsive to the offset signals and the fundamental frequency signal of one of said frames for calculating a subset of harmonic phase signals corresponding to said offset signals;
means responsive to said fundamental frequency signal for computing the remaining harmonic phase signals for said one of said frames;
means responsive to the frame energy and the set of speech parameters of said one of said frames for determining the amplitudes of said fundamental signal and said subset of said harmonic phase signals and said remaining harmonic phase signals; and
means for generating replicated speech in response to said fundamental signal and said subset of said harmonic phase signals and said remaining harmonic phase signals and the determined amplitudes for said one of said frames.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A processing system for encoding human speech comprising:
means for segmenting the speech into a plurality of speech frames, each having a predetermined number of evenly spaced samples of instantaneous amplitudes of speech and each of which overlaps by a predefined number of samples with the previous and subsequent frames;
means for calculating a set of speech parameter signals defining a vocal tract for each frame;
means for calculating the frame energy per frame of the speech samples;
means for performing a spectral analysis of said speech samples of each frame to produce a spectrum for each frame;
means for detecting the fundamental frequency signal for each frame from the spectrum corresponding to each frame;
means for determining a subset of harmonic frequency signals for each frame from the spectrum corresponding to each frame;
means for determining offset signals representing the difference between each of said harmonic frequency signals and multiples of said fundamental frequency signal; and
means for transmitting encoded representations of said frame energy and said set of speech parameters and said fundamental frequency signal and said offset signals for subsequent speech synthesis.
Dependent claims: 15, 16, 17, 18
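The offset determination recited in claim 14 can be illustrated with a short sketch. It assumes a magnitude spectrum is already available as a list of bin magnitudes and that each actual harmonic is taken to be the strongest bin within half a fundamental of the corresponding multiple; both the peak-picking rule and the name `harmonic_offsets` are assumptions made for illustration, not details taken from the claims.

```python
import math


def harmonic_offsets(magnitudes, bin_hz, f0, n_offsets):
    """Offsets (Hz) between measured harmonic peaks and multiples of f0.

    magnitudes: magnitude spectrum of one frame, one value per bin
    bin_hz:     frequency spacing between adjacent bins, in Hz
    f0:         detected fundamental frequency, in Hz
    n_offsets:  size of the harmonic subset to encode

    Assumption: the actual harmonic near k * f0 is the strongest bin
    within +/- f0 / 2 of that multiple.
    """
    half_width = f0 / 2.0
    offsets = []
    for k in range(1, n_offsets + 1):
        target = k * f0
        lo = max(0, int(math.ceil((target - half_width) / bin_hz)))
        hi = min(len(magnitudes) - 1,
                 int(math.floor((target + half_width) / bin_hz)))
        if lo > hi:
            raise ValueError("spectrum does not cover harmonic %d" % k)
        peak_bin = max(range(lo, hi + 1), key=lambda b: magnitudes[b])
        # Offset = measured harmonic frequency minus theoretical multiple.
        offsets.append(peak_bin * bin_hz - target)
    return offsets


# Synthetic spectrum with 1 Hz bins and peaks at 101, 198 and 303 Hz.
spectrum = [0.0] * 500
for peak in (101, 198, 303):
    spectrum[peak] = 1.0
print(harmonic_offsets(spectrum, 1.0, 100.0, 3))  # [1.0, -2.0, 3.0]
```

The resolution of the offsets in this sketch is limited to one bin; a practical analyzer would likely interpolate between bins, which is outside the scope of the example.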
19. A method for synthesizing voice from encoded information representing speech frames each having a predetermined number of evenly spaced samples of instantaneous amplitude of speech with said encoded information for each frame comprising frame energy and a set of speech parameters and a fundamental frequency of speech and offset signals representing the difference between the theoretical harmonic frequencies as derived from a fundamental frequency signal and a subset of actual harmonic frequencies, comprising the steps of:
calculating a subset of harmonic phase signals corresponding to said offset signals;
computing the remaining harmonic phase signals for said one of said frames from said fundamental frequency signal;
determining the amplitudes of said fundamental signal and said subset of harmonic phase signals and said remaining harmonic phase signals from the frame energy and the set of speech parameters of said one of said frames; and
generating replicated speech in response to said fundamental signal and said subset and remaining harmonic phase signals and said determined amplitudes for said one of said frames.
Dependent claims: 20, 21, 22, 23, 24
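The amplitude and synthesis steps of claim 19 can be sketched for a single voiced frame as below. Several details are assumptions introduced for the example rather than taken from the claims: the set of speech parameters is treated as LPC coefficients of an all-pole filter 1/A(z) with A(z) = 1 - sum(a_i z^-i), the amplitudes follow that filter's magnitude response and are scaled so the frame's energy approximately matches the transmitted frame energy, and each phase simply advances by 2*pi*f/fs per sample. The function names are hypothetical.

```python
import cmath
import math


def lpc_envelope(lpc, freq_hz, fs):
    """Magnitude of the all-pole filter 1/A(z) at freq_hz.

    Assumed sign convention: A(z) = 1 - sum(a_i * z**-i), i = 1..p.
    """
    w = 2.0 * math.pi * freq_hz / fs
    a_of_z = 1.0 - sum(a * cmath.exp(-1j * w * (i + 1))
                       for i, a in enumerate(lpc))
    return 1.0 / abs(a_of_z)


def harmonic_amplitudes(lpc, freqs, frame_energy, n_samples, fs):
    """Amplitudes shaped by the LPC envelope and scaled to the frame energy."""
    shape = [lpc_envelope(lpc, f, fs) for f in freqs]
    # A sum of sinusoids with amplitudes A_k has an energy of roughly
    # n_samples * sum(A_k**2) / 2 over the frame.
    norm = math.sqrt(2.0 * frame_energy /
                     (n_samples * sum(s * s for s in shape)))
    return [norm * s for s in shape]


def synthesize_frame(freqs, amps, n_samples, fs, start_phases=None):
    """Sum of cosines for one frame; returns (samples, end_phases)."""
    if start_phases is None:
        phases = [0.0] * len(freqs)
    else:
        phases = list(start_phases)
    steps = [2.0 * math.pi * f / fs for f in freqs]
    samples = []
    for _ in range(n_samples):
        samples.append(sum(a * math.cos(p) for a, p in zip(amps, phases)))
        phases = [p + step for p, step in zip(phases, steps)]
    # End phases can seed the next frame to keep the waveform continuous.
    return samples, [p % (2.0 * math.pi) for p in phases]


fs, n = 8000, 160
freqs = [100.0, 200.0, 300.0]  # fundamental plus two harmonics
amps = harmonic_amplitudes([0.5], freqs, frame_energy=10.0,
                           n_samples=n, fs=fs)
frame, end_phases = synthesize_frame(freqs, amps, n, fs)
print(len(frame), round(sum(x * x for x in frame), 6))
```

Returning the end phases lets a caller pass them as `start_phases` for the following frame, which is one simple way to avoid discontinuities at frame boundaries.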
Specification