System for analyzing human speech
Abstract
The pitch of human speech segments is analyzed using at least two different pitch detection algorithms. Each algorithm selects a respective plurality of most likely pitch values, and these values and their respective quality figures are analyzed statistically to determine the most likely pitch. One algorithm operates in the frequency domain by analyzing an amplitude spectrum; the other operates in the time domain using an autocorrelation function. Significant peak positions of the amplitude spectrum or autocorrelation function are evaluated in respective harmonic sieves to provide respective quality figures indicating the degree to which the peak frequencies or periods of the spectrum or autocorrelation function match the apertures of the harmonic sieve. A predetermined number of values of pitch, and of period, having the highest quality figures are selected. After the period values are converted into pitch values, these values and their associated quality figures are analyzed statistically to form an estimate of the most likely pitch.
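The harmonic-sieve scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the relative aperture width `tolerance`, the number of apertures `n_harmonics`, and the overlap-style quality figure are all assumptions chosen for illustration, since the text above only requires a criterion that reflects how well the peaks and the apertures match.

```python
def sieve_quality(peaks, candidate, n_harmonics=6, tolerance=0.05):
    """Score how well peak positions fit a mask built on `candidate`.

    The mask has apertures centred on candidate, 2*candidate, ... and each
    aperture spans +/- tolerance * centre (an assumed width).
    The quality figure is an assumed overlap measure in [0, 1]:
    matched apertures / (peaks + apertures - matched apertures).
    """
    if not peaks or candidate <= 0:
        return 0.0
    hits = 0
    for k in range(1, n_harmonics + 1):
        centre = k * candidate
        half_width = tolerance * centre
        # An aperture counts as matched if at least one peak falls inside it.
        if any(abs(p - centre) <= half_width for p in peaks):
            hits += 1
    return hits / (len(peaks) + n_harmonics - hits)


def best_candidates(peaks, lowest, highest, step, keep=3,
                    n_harmonics=6, tolerance=0.05):
    """Slide the mask from `lowest` to `highest` and keep the top scores.

    Returns a list of (candidate, quality) pairs, best first.
    """
    count = int(round((highest - lowest) / step)) + 1
    scored = []
    for i in range(count):
        value = lowest + i * step
        scored.append((value, sieve_quality(peaks, value, n_harmonics, tolerance)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:keep]


# Spectral peaks (Hz) of a hypothetical voiced segment.
spectrum_peaks = [200, 400, 600, 800]
for value, quality in best_candidates(spectrum_peaks, 100, 300, 50, n_harmonics=4):
    print(value, round(quality, 3))
```

The same two functions apply unchanged to the time-domain branch if `peaks` holds autocorrelation peak lags and the candidates are periods rather than frequencies.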
8 Claims
1. A method of analyzing human speech for determining the pitch of speech segments while using more than one pitch detection algorithm, characterized by comprising the steps of:
(a) determining an amplitude spectrum of a speech segment in a first elementary pitch meter, and determining significant peak positions in said spectrum,
(b) determining an autocorrelation function and significant peak positions therein in a second elementary pitch meter,
(c) utilizing said significant peak positions of the amplitude spectrum and the autocorrelation function, respectively, as input data for selecting a value for the pitch and period, respectively, and determining a sequence of consecutive integral multiples of said value, and the determination of intervals around said value and the multiples thereof, these intervals defining apertures of a mask, said apertures corresponding to harmonic multiplication factors,
(d) computing a quality figure for each pitch and period, respectively, in accordance with a criterion indicating the degree to which the significant peak positions and mask apertures match,
(e) repeating steps (c) and (d) for consecutive higher values of the pitch and period, respectively, up to a predetermined highest value, to provide a sequence of quality figures associated with these pitch and period values, respectively,
(f) selecting a predetermined number of values of said pitch and period, respectively, having the highest quality figures,
(g) converting the values for the respective periods into values for pitch, and
(h) combining the predetermined numbers of selected values for pitch, and for pitch converted from period, with their associated quality figures to form an estimation of the most likely pitch.
2. An apparatus for analyzing human speech to determine a pitch of speech segments, using more than one pitch detection algorithm, comprising:
a first elementary pitch meter, operating in the frequency domain, for determining a first plurality of significant peak frequencies in a speech segment,
means for computing a quality figure for each of said significant peak frequencies,
a second elementary pitch meter, operating in the time domain, for determining significant peak periods of said segment,
means for computing a quality figure for each of said significant peak periods,
means for determining period-derived frequencies corresponding to said significant peak periods,
means for selecting a predetermined number of values of said significant peak frequencies and said significant peak period-derived frequencies, respectively having the highest quality figures, and
combining said selected values of frequency and period-derived frequency with the associated quality figures to form an estimate of the most likely pitch.
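The conversion and combination recited in steps (g) and (h) of claim 1 can be sketched in Python as follows. The claims do not say how the selected values are combined, so the rule used here is purely an assumption for illustration: candidates from both meters are pooled, the group of mutually close candidates with the largest summed quality is found, and its quality-weighted mean is returned. The function name `most_likely_pitch` and the parameter `rel_tol` are hypothetical.

```python
def most_likely_pitch(pitch_candidates, period_candidates, rel_tol=0.03):
    """Combine candidates from both pitch meters into one estimate.

    pitch_candidates:  list of (pitch_hz, quality) from the spectral meter.
    period_candidates: list of (period_s, quality) from the autocorrelation meter.
    Returns the estimated pitch in Hz, or None if no candidate has support.
    """
    # Step (g): convert each period into a pitch value (f = 1 / T).
    pooled = list(pitch_candidates)
    pooled += [(1.0 / period, q) for period, q in period_candidates if period > 0]

    best_support = 0.0
    best_estimate = None
    # Step (h), under the assumed rule: find the best-supported cluster.
    for centre, _ in pooled:
        cluster = [(f, q) for f, q in pooled if abs(f - centre) <= rel_tol * centre]
        support = sum(q for _, q in cluster)
        if support > best_support:
            best_support = support
            best_estimate = sum(f * q for f, q in cluster) / support
    return best_estimate


# Hypothetical outputs of the two meters for one speech segment.
from_spectrum = [(200.0, 1.0), (100.0, 0.4), (400.0, 0.3)]
from_autocorrelation = [(0.005, 0.9), (0.010, 0.5), (0.0025, 0.2)]
print(most_likely_pitch(from_spectrum, from_autocorrelation))
```

Clustering before averaging is one way to keep octave-related candidates (half or double the true pitch) from pulling a plain weighted mean away from the best-supported value; other statistical combinations would fit the claim language equally well.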
Specification