Method and apparatus for pitch estimation using perception based analysis by synthesis
First Claim
1. A method for estimating pitch of a speech signal comprising the steps of:
inputting a speech signal;
generating a plurality of pitch candidates corresponding to a plurality of sub-ranges within a pitch search range;
generating a first signal based on a segment of said speech signal;
generating a reference speech signal based on the first signal;
generating a synthetic speech signal for each of the plurality of pitch candidates; and
comparing the synthetic speech signal for each of the plurality of pitch candidates with the reference speech signal to determine an optimal pitch estimate.
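The candidate-generation step of claim 1 can be illustrated with a small sketch. The claim only says that the candidates correspond to sub-ranges of the pitch search range. It does not say how each candidate is chosen, so the equal-width split and the autocorrelation-peak rule below are assumptions made for illustration, and the names `pitch_candidates`, `min_lag`, `max_lag` and `n_subranges` are hypothetical.

```python
import numpy as np

def pitch_candidates(frame, min_lag, max_lag, n_subranges):
    """Return one candidate pitch lag (in samples) per sub-range.

    Assumption: each candidate is the autocorrelation peak inside its
    sub-range. The claim itself does not specify the selection rule.
    Requires max_lag < len(frame).
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - np.mean(frame)
    # Autocorrelation at non-negative lags 0 .. len(frame) - 1.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # Equal-width sub-range boundaries (an assumption).
    edges = np.linspace(min_lag, max_lag + 1, n_subranges + 1).astype(int)
    candidates = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        candidates.append(int(lo + np.argmax(acf[lo:hi])))
    return candidates
```

Taking one candidate per sub-range keeps both the true pitch period and its multiples in play, so that the later comparison against the reference signal can choose between them.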
Abstract
The present invention provides a method for pitch estimation which utilizes perception based analysis by synthesis for improved pitch estimation over a variety of input speech conditions. Initially, pitch candidates are generated corresponding to a plurality of sub-ranges within a pitch search range. Then a residual spectrum is determined for a segment of speech and a reference speech signal is generated from the residual spectrum using sinusoidal synthesis and linear predictive coding (LPC) synthesis. A synthetic speech signal is generated for each of the pitch candidates using sinusoidal and LPC synthesis. Finally, the synthetic speech signal for each pitch candidate is compared with the reference residual signal to determine an optimal pitch estimate based on a pitch period of a synthetic speech signal that provides a maximum signal to noise ratio.
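The abstract does not give a formula for the selection rule. As a hedged illustration, the conventional signal-to-noise ratio can be written as follows, where $s_{\mathrm{ref}}$ is the reference signal, $s_P$ is the synthetic speech signal for pitch candidate $P$, $N$ is the segment length, and $P_1, \dots, P_K$ are the candidates. All of these symbols are introduced here and do not appear in the source.

```latex
\mathrm{SNR}(P) = 10 \log_{10}
  \frac{\sum_{n=0}^{N-1} s_{\mathrm{ref}}[n]^{2}}
       {\sum_{n=0}^{N-1} \bigl( s_{\mathrm{ref}}[n] - s_{P}[n] \bigr)^{2}},
\qquad
\hat{P} = \arg\max_{P \in \{P_1, \dots, P_K\}} \mathrm{SNR}(P)
```

Under this definition, the candidate whose synthetic signal deviates least in energy from the reference is taken as the pitch estimate.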
8 Claims
1. Independent claim, recited under First Claim above. Claims 2, 3, 4, 5, 6 and 7 depend on it.
8. A method for estimating pitch of a speech signal comprising the steps of:
inputting a speech signal;
determining a plurality of pitch candidates each corresponding to a sub-range within a pitch search range;
analyzing a segment of a speech signal using linear predictive coding (LPC) to generate LPC filter coefficients for the acoustic signal segment;
LPC inverse filtering the speech signal segment using the LPC filter coefficients to provide a residual signal which is spectrally flat;
transforming the residual signal into the frequency domain to generate a residual spectrum;
analyzing the residual spectrum to determine peak amplitudes and corresponding frequencies and phases of the residual spectrum;
generating a reference residual signal from the peak amplitudes, frequencies and phases of the residual spectrum using sinusoidal synthesis;
generating a reference speech signal by LPC synthesis filtering the reference residual signal;
performing harmonic sampling for each of the plurality of pitch candidates to determine the harmonic components for each of the plurality of pitch candidates;
generating a synthetic residual signal for each of the plurality of pitch candidates from the harmonic components for each of the plurality of pitch candidates using sinusoidal synthesis;
LPC synthesis filtering the synthetic residual signal for each of the plurality of pitch candidates to generate a synthetic speech signal for each of the plurality of pitch candidates; and
comparing the synthetic speech signal for each of the plurality of pitch candidates with the reference residual signal to determine an optimal pitch estimate based on a synthetic speech signal for a pitch that provides a maximum signal to noise ratio.
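The steps of claim 8 can be sketched end to end in a few dozen lines. This is a minimal illustration under stated assumptions, not the patented implementation. The Hann window, the FFT size, the peak floor used to discard minor spectral maxima, the rounding of harmonic frequencies to the nearest FFT bin, and every function and parameter name are choices made here rather than details taken from the claim. The synthetic speech is scored against the LPC-filtered reference, which is one reading of the comparison step.

```python
import numpy as np

def lpc(frame, order):
    """LPC polynomial [1, a1, ..., ap] via the autocorrelation method."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + 1e-9 * r[0] * np.eye(order)  # small ridge keeps the solve well conditioned
    return np.concatenate(([1.0], -np.linalg.solve(R, r[1:])))

def lpc_synthesis(excitation, a):
    """All-pole filter 1/A(z), starting from a zero state."""
    out = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k in range(1, min(n, len(a) - 1) + 1):
            acc -= a[k] * out[n - k]
        out[n] = acc
    return out

def sinusoidal_synthesis(bins, spectrum, nfft, n, scale):
    """Sum of cosines with amplitude and phase read off the given FFT bins."""
    t = np.arange(n)
    out = np.zeros(n)
    for k in bins:
        amp = scale * np.abs(spectrum[k])
        out += amp * np.cos(2 * np.pi * k * t / nfft + np.angle(spectrum[k]))
    return out

def snr_db(reference, synthetic):
    noise = np.sum((reference - synthetic) ** 2)
    return 10.0 * np.log10(np.sum(reference ** 2) / (noise + 1e-12))

def estimate_pitch(frame, candidates, order=10, nfft=1024, peak_floor=0.1):
    """Return (best pitch period in samples, SNR in dB per candidate)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    a = lpc(frame, order)
    residual = np.convolve(frame, a)[:n]      # LPC inverse filtering by A(z)
    window = np.hanning(n)
    spectrum = np.fft.rfft(residual * window, nfft)
    mag = np.abs(spectrum)
    scale = 2.0 / np.sum(window)
    # Reference: local spectral maxima above a floor (the floor is an assumption).
    floor = peak_floor * mag.max()
    peaks = [k for k in range(1, len(mag) - 1)
             if mag[k] > mag[k - 1] and mag[k] >= mag[k + 1] and mag[k] >= floor]
    reference = lpc_synthesis(sinusoidal_synthesis(peaks, spectrum, nfft, n, scale), a)
    scores = {}
    for period in candidates:
        # Harmonic sampling: nearest FFT bin to each multiple of the fundamental.
        bins = [int(round(h * nfft / period)) for h in range(1, int(period // 2) + 1)]
        bins = [k for k in bins if 0 < k < nfft // 2]
        synthetic = lpc_synthesis(sinusoidal_synthesis(bins, spectrum, nfft, n, scale), a)
        scores[period] = snr_db(reference, synthetic)
    return max(scores, key=scores.get), scores
```

A candidate whose harmonics line up with the residual peaks reproduces the reference closely and scores a high SNR. A candidate whose harmonics fall between the peaks samples little energy and scores low.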
Specification