Processing speech signals
Abstract
A method of processing a speech signal in noise, comprising: determining a frequency spectrum of a frame of the speech signal; determining a value of the pitch of the frame of the speech signal; identifying peaks (12, 14, 16, 22, 28, 32) in the spectrum; and evaluating the peaks individually to determine respective scores for the peaks, the score for a peak being a measure of the likelihood that the peak is a harmonic band of the speech signal. As a consequence there is: (a) no need for high f0 accuracy as there is no need to predict long sequences of harmonic positions; and (b) no need for an assumption of harmonic integrity at all points.
31 Claims
1. A method of processing a speech signal in noise, comprising:
determining a frequency spectrum of a frame of the speech signal;
determining a value of the pitch of the frame of the speech signal;
characterised by:
identifying peaks (12, 14, 16, 22, 28, 32) in the spectrum; and
evaluating the peaks (12, 14, 16, 22, 28, 32) individually to determine respective scores for the peaks (12, 14, 16, 22, 28, 32), the score for a peak (12, 14, 16, 22, 28, 32) being a measure of the likelihood that the peak (12, 14, 16, 22, 28, 32) is a harmonic band of the speech signal.
Dependent claims: 2 to 27, 30, 31.
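Claim 1 does not fix a particular scoring function. Purely as an illustration, the sketch below assumes a strict local-maximum peak picker and a hypothetical score that rewards a peak when some other peak lies roughly one pitch value (in Hz) away from it. The names `find_peaks` and `score_peaks`, the 10 Hz bin width and the `tol_hz` tolerance are invented for this example and are not taken from the patent.

```python
import math

def find_peaks(magnitudes):
    """Return the indices of bins that are strict local maxima."""
    return [
        i for i in range(1, len(magnitudes) - 1)
        if magnitudes[i - 1] < magnitudes[i] > magnitudes[i + 1]
    ]

def score_peaks(peak_freqs, f0, tol_hz=15.0):
    """Score each peak on its own, in the range 0..1.

    Hypothetical rule: a peak looks harmonic if at least one other peak
    sits about f0 away from it. No long harmonic series is predicted.
    """
    scores = []
    for i, f in enumerate(peak_freqs):
        gaps = [abs(f - g) for j, g in enumerate(peak_freqs) if j != i]
        if not gaps:
            scores.append(0.0)
            continue
        # Smallest mismatch between any peak-to-peak gap and the pitch value.
        mismatch = min(abs(gap - f0) for gap in gaps)
        scores.append(math.exp(-(mismatch / tol_hz) ** 2))
    return scores

# Synthetic magnitude spectrum of one frame (assumed bin width: 10 Hz).
bin_hz = 10.0
magnitudes = [0.1] * 80
for b in (20, 40, 60):      # harmonics of a 200 Hz voice
    magnitudes[b] = 1.0
magnitudes[47] = 0.8        # a noise peak at 470 Hz

peak_freqs = [i * bin_hz for i in find_peaks(magnitudes)]
scores = score_peaks(peak_freqs, f0=205.0)  # pitch estimate deliberately 5 Hz off
for f, s in zip(peak_freqs, scores):
    print(f"{f:6.1f} Hz  score={s:.3f}")
```

Even with the pitch estimate 5 Hz off, the three peaks spaced 200 Hz apart score high under this rule, while the isolated 470 Hz peak scores close to zero.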
28. A method of performing automatic speech recognition on a speech signal in noise, comprising normalising the speech energy level of the signal and deriving a root-cepstrum using the normalised speech energy level.
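A root-cepstrum is generally understood as a cepstrum in which the logarithm is replaced by a root (power) compression. Unlike the logarithm, where a gain change only shifts the zeroth coefficient, a root compression lets a gain change scale every coefficient, which may be one reason to pair it with an energy normalisation. The sketch below is a minimal illustration under stated assumptions: the normalisation is a plain division by a speech-level estimate, the frame's mean band energy stands in for that estimate, and `root_cepstrum`, `dct_ii` and `gamma=0.1` are hypothetical choices, not details from the patent.

```python
import math

def dct_ii(values):
    """Unnormalised DCT-II, a transform commonly used to obtain cepstral coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n)
    ]

def root_cepstrum(band_energies, speech_level, gamma=0.1):
    """Root-cepstrum of one frame after dividing by a speech energy level."""
    # Assumption: normalisation is a simple division by the level estimate.
    normalised = [e / speech_level for e in band_energies]
    # Root compression takes the place of the logarithm of a conventional cepstrum.
    compressed = [x ** gamma for x in normalised]
    return dct_ii(compressed)

def mean(xs):
    return sum(xs) / len(xs)

# The same frame at two recording levels (band energies are made up).
quiet = [1.0, 4.0, 2.0, 0.5]
loud = [16.0 * e for e in quiet]

# Stand-in level estimate: mean band energy of the frame (an assumption).
c_quiet = root_cepstrum(quiet, mean(quiet))
c_loud = root_cepstrum(loud, mean(loud))

# Without normalisation the two levels give different coefficients.
raw_quiet = root_cepstrum(quiet, 1.0)
raw_loud = root_cepstrum(loud, 1.0)
```

With the normalisation in place, `c_quiet` and `c_loud` agree, whereas `raw_quiet` and `raw_loud` differ in every coefficient by the factor `16 ** gamma`.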
29. A method of identifying peaks (12, 14, 16) in a frequency spectrum of a frame of a speech signal, comprising:
differentiating the frequency spectrum with respect to frequency using two scales, the first scale being over a higher number of frequency bins than the second scale, and weighting the results from the two scales such that the differentiation using the first scale identifies significant speech peaks and the differentiation using the second scale improves the precision of the calculation of the frequency position of the identified peak.
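One possible reading of the two-scale differentiation in claim 29 can be sketched as follows. This is an assumption-laden illustration, not the patented procedure: the derivative is taken as a central difference, the two scales are combined by a fixed weighted sum, and a peak is reported where the combined derivative crosses from positive to non-positive. The function names, the half-widths (3 and 1 bins) and the weights (0.8 and 0.2) are all hypothetical.

```python
def central_diff(spectrum, half_width):
    """Central difference over 2 * half_width bins; edges are left at zero."""
    n = len(spectrum)
    out = [0.0] * n
    for i in range(half_width, n - half_width):
        out[i] = (spectrum[i + half_width] - spectrum[i - half_width]) / (2 * half_width)
    return out

def find_peaks_two_scale(spectrum, coarse=3, fine=1, w_coarse=0.8, w_fine=0.2):
    """Return fractional bin positions of peaks from a weighted two-scale derivative."""
    d_coarse = central_diff(spectrum, coarse)  # wide span: responds to broad peaks
    d_fine = central_diff(spectrum, fine)      # narrow span: sharper localisation
    combined = [w_coarse * a + w_fine * b for a, b in zip(d_coarse, d_fine)]
    peaks = []
    for i in range(coarse, len(spectrum) - coarse - 1):
        if combined[i] > 0 and combined[i + 1] <= 0:
            # Linear interpolation of the zero crossing between bins i and i + 1.
            frac = combined[i] / (combined[i] - combined[i + 1])
            peaks.append(i + frac)
    return peaks

# Made-up spectrum: one broad peak at bin 10 with a small ripple at bin 6.
spectrum = [0, 0, 0, 0, 1, 2, 4, 3, 5, 7, 9, 7, 5, 3, 2, 1, 0, 0, 0, 0]

naive = [
    i for i in range(1, len(spectrum) - 1)
    if spectrum[i - 1] < spectrum[i] > spectrum[i + 1]
]
two_scale = find_peaks_two_scale(spectrum)
print("naive local maxima:", naive)
print("two-scale peaks:   ", two_scale)
```

On this toy input a plain local-maximum test reports both the ripple and the main peak, while the weighted two-scale derivative reports only the main peak. Real spectra would also need a significance threshold, which is omitted here.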
Specification