Method of estimating voice pitch by rotating two dimensional time-energy region on speech acoustic signal plot
First Claim
1. A method of estimating a pitch of a speech acoustic signal in a time interval in which said speech acoustic signal is a voiced one, characterized in that the pitch corresponds to a distance between contact points of a circle and a plot of energy of said speech acoustic signal as a function of time, the plot being normalized to a limit value, said contact points being obtained by rotating said circle on said plot.
Abstract
A method of estimating the pitch of a speech acoustic signal in a time interval in which said signal is a voiced one, wherein the pitch corresponds to the distance between the contact points of a circle and a plot, normalized to a limit value, of the energy of said speech acoustic signal as a function of time; said contact points being obtained by rolling said circle on said plot.
18 Claims
-
1. A method of estimating a pitch of a speech acoustic signal in a time interval in which said speech acoustic signal is a voiced one, characterized in that
the pitch corresponds to a distance between contact points of a circle and a plot of energy of said speech acoustic signal as a function of time, the plot being normalized to a limit value, said contact points being obtained by rotating said circle on said plot.
-
2. A method of estimating a pitch of a speech acoustic signal in a first time interval in which said speech acoustic signal is a voiced one, comprising the steps of
a) sampling, according to a sampling period, the energy of the speech acoustic signal to form discrete values and digitizing the discrete values, according to a code, at least in said first time interval, thus obtaining a sequence of binary values,
b) normalizing said binary values to a limit value to provide a normalized binary value sequence,
c) determining a first relative maximum of said normalized binary value sequence,
d) computing h(z), which represents an estimate of pitch of the speech acoustic signal, using the formula:

$$h(z) = \sqrt{R^2 - n^2} + E(x) - \sqrt{R^2 - (z - n)^2},$$

where x is the position in said sequence of said first maximum, E(x) is the energy of the speech acoustic signal representing the binary value of said first relative maximum, R is a radius of the circle having a predetermined value, n is equal to an initial value, for values of z in an interval [1 . . . n+R],
e) checking if there is at least one value of a variable z such that the conditions

$$E(x+z) \ge E(x+z-1), \quad E(x+z) \ge E(x+z+1) \quad \text{and} \quad E(x+z) \ge h(z)$$

are met, and
f) repeating steps d) and e) with an increased value of n until such check has a positive outcome or n=R; whereby, if such check has a positive outcome, said pitch corresponds to the value of the variable z so determined.
Dependent claims: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
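Steps d) to f) of claim 2 can be read as a nested search: an outer loop over n (the rotation of the circle about the starting peak) and an inner loop over the lag z. The Python sketch below is one possible reading, not the patented implementation. It assumes that R and n are integers expressed in samples, that n increases by 1 per iteration, and that the smallest qualifying z is returned when several exist; the function name `estimate_pitch` and its parameters are hypothetical.

```python
import math

def estimate_pitch(E, x, R, n0=1):
    """Return the lag z (in samples) of the first peak met by the circle, or None.

    E  : sequence of normalized energy values
    x  : index of the starting relative maximum
    R  : circle radius in samples (assumed to be an integer)
    n0 : initial value of n (assumption: n then grows by 1 per iteration)
    """
    for n in range(n0, R + 1):                      # step f): increase n up to R
        base = math.sqrt(R * R - n * n) + E[x]      # part of h(z) independent of z
        for z in range(1, n + R + 1):               # z in [1 .. n+R]
            i = x + z
            if i + 1 >= len(E):                     # E[i+1] is needed for the peak test
                break
            h = base - math.sqrt(R * R - (z - n) ** 2)   # step d)
            # step e): E[i] is a relative maximum and reaches the circle
            if E[i] >= E[i - 1] and E[i] >= E[i + 1] and E[i] >= h:
                return z
    return None                                     # n reached R without a match
```

With a synthetic sequence whose peaks are 20 samples apart, for example `E = [0] * 60` with `E[10] = 200`, `E[30] = 180` and `E[50] = 160`, calling `estimate_pitch(E, 10, 40)` yields a lag of 20 samples once n has grown enough for the arc to drop to the second peak.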
-
6. A method according to claim 2, characterized in that said limit value is 255 and said step b) is realized according to the formula

$$E_n = \begin{cases} \operatorname{trunc}\left[(-E \cdot 255)/\mathrm{MAX}\right] & \text{if } E < 0 \\ 0 & \text{if } E \ge 0 \end{cases}$$

where MAX is the absolute value of the negative maximum binary value contemplated by said code.
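The normalization of claim 6 can be sketched in a few lines of Python. The sketch assumes that the scaling branch applies to negative samples, as the sign of the formula and the definition of MAX suggest, and that samples are signed integers; the name `normalize_energy` and the 16-bit value of `max_abs` used in the example are illustrative assumptions, not values taken from the claims.

```python
import math

def normalize_energy(samples, max_abs, limit=255):
    """Map negative samples onto 0..limit and set all other samples to 0.

    samples : iterable of signed integer sample values
    max_abs : absolute value of the most negative value the code allows
    limit   : limit value of the normalization (255 in claim 6)
    """
    return [math.trunc((-e * limit) / max_abs) if e < 0 else 0 for e in samples]
```

For a 16-bit two's-complement code (an assumed example), `max_abs` would be 32768, so `normalize_energy([-32768, -16384, 0, 5], 32768)` gives `[255, 127, 0, 0]`.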
-
-
7. A method according to claim 2, characterized in that said step c) is realized by first identifying all the relative maxima of said binary value sequence and then choosing the one having the maximum binary value.
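A minimal sketch of the selection described in claim 7 follows. It assumes that a relative maximum is a sample not smaller than either neighbour (the same test used in step e) of claim 2) and that ties are resolved in favour of the earliest position; the claims do not state a tie rule, and the name `highest_relative_maximum` is hypothetical.

```python
def highest_relative_maximum(E):
    """Return the index of the largest relative maximum of E, or None if there is none."""
    best = None
    for i in range(1, len(E) - 1):
        # relative maximum: not smaller than either neighbour
        if E[i] >= E[i - 1] and E[i] >= E[i + 1]:
            # keep the earliest index on ties (assumption)
            if best is None or E[i] > E[best]:
                best = i
    return best
```

For `E = [0, 3, 1, 7, 2, 7, 0]` the relative maxima sit at indices 1, 3 and 5, and the function returns 3.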
-
8. A method according to claim 2, characterized in that the method further comprises the steps of using a minimum value INF and a maximum value SUP of the pitch for the human voice, and using an interval in said step d) that corresponds to [INF . . . min(SUP, n+R)].
-
9. A method according to claim 8, wherein the minimum value INF equals 2.5 milliseconds and the maximum value SUP equals 2.5 milliseconds.
-
10. A method according to claim 2, characterized in that the step of checking whether said first time interval is a voiced one comprises the steps of:
-
a) verifying that said first time interval is of silent type if the energy of the speech acoustic signal does not exceed a first threshold in said interval, and
b) verifying that said first time interval is of unvoiced type if for each sub-interval of predetermined length of such interval, an absolute energy of said speech acoustic signal does not exceed a second threshold, and at the same time the energy of said speech acoustic signal is null in a number of time instants greater than a third threshold;
whereby said check has a positive outcome if both verifications of steps a) and b) have had a negative outcome.
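The two verifications of claim 10 can be sketched as a small classifier. Several details are assumptions made for illustration only: "absolute energy" is taken as the largest absolute sample value in each sub-interval, the count of null samples is taken over the whole interval, and the function name and all threshold parameters are hypothetical.

```python
def classify_interval(E, t_silence, t_unvoiced, t_zero_count, sub_len):
    """Classify an interval of energy samples as 'silent', 'unvoiced' or 'voiced'.

    t_silence    : first threshold (step a)
    t_unvoiced   : second threshold, applied to every sub-interval (step b)
    t_zero_count : third threshold on the number of null samples (step b)
    sub_len      : sub-interval length in samples
    """
    # step a): silent if the energy never exceeds the first threshold
    if all(abs(e) <= t_silence for e in E):
        return "silent"
    # step b): every sub-interval stays below the second threshold ...
    subs = [E[i:i + sub_len] for i in range(0, len(E), sub_len)]
    low = all(max(abs(e) for e in s) <= t_unvoiced for s in subs)
    # ... and the number of null samples exceeds the third threshold
    zeros = sum(1 for e in E if e == 0)
    if low and zeros > t_zero_count:
        return "unvoiced"
    # both verifications negative: the interval is treated as voiced
    return "voiced"
```

Only an interval classified as voiced would then be passed on to the pitch search of claim 2.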
-
-
11. A method according to claim 2, wherein the radius of the circle R has a value of about 13.25 milliseconds.
-
12. A method according to claim 2, wherein the initial value n of the variable z has a value of about 1.
-
13. A method of estimating a pitch of a voice represented by a plot of energy of a speech acoustic signal as a function of time, comprising the steps of:
-
defining a two-dimensional time-energy region having a tangent contact point (P) on the plot of energy of the speech acoustic signal;
rotating the two-dimensional time-energy region to search for peaks in the energy of the speech acoustic signal and to obtain a peak contact point (Q) on the plot of energy of the speech acoustic signal; and
corresponding a distance between the tangent contact point (P) and the peak contact point (Q) to the pitch of the voice.
Dependent claims: 14, 15, 16, 17, 18
-
Specification