Method and system for analyzing voices
Abstract
The object is to assign proper pitch marks to voice waveforms, so that synthesized voices sound smooth and the pitch of voices can be controlled accurately according to the pitch marks of recorded messages.
The fixed low-pass filters 3002-a to 3002-d are set so that at least one of them passes only the fundamental component of the voice. Each of the peak detectors 3003-a to 3003-d detects peaks in the output of its filter, and the channel selector 3004 switches between channels so that peak information for the fundamental wave is extracted continuously. The channel selector 3004 judges a channel to be the correct one if the intervals between the peaks detected in that channel by the peak detectors 3003-a to 3003-d change smoothly. The pitch of the voice is analyzed from this peak information, so that the adaptive filter 3005 passes only the fundamental component of the voice and the peak detector 3006 detects the peaks of the fundamental wave, thereby assigning pitch marks to the voice waveforms.
17 Claims
1. A method for analyzing voices by generating pitch mark information as time reference positions corresponding to a pitch cycle of voice waveforms comprising the steps of:
temporarily storing a portion of the voice waveforms using voice waveform storing means;
generating rough pitch information from said voice waveforms stored temporarily by using pitch analyzing means;
inputting said voice waveforms stored temporarily to an adaptive filter and changing a cut-off frequency or a center frequency of said adaptive filter according to said rough pitch information, and passing only a fundamental component extracted from the inputted voice waveforms; and
detecting plural maximum points at one side of said fundamental component using peak detecting means, and generating a series of pitch mark information for a whole portion of the voice waveforms.
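The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the autocorrelation pitch estimate, the biquad band-pass filter (run forward and backward), the Q value, and every function name below are assumptions chosen only to make the sequence "store, estimate rough pitch, tune the filter, detect peaks" concrete. For brevity the filter is tuned once for the stored portion, whereas the claim lets the cut-off or center frequency follow the rough pitch information.

```python
import math

def synth_voice(fs=8000, f0=100.0, seconds=0.2):
    """Hypothetical test signal: a fundamental plus two strong harmonics."""
    n = int(fs * seconds)
    return [math.sin(2 * math.pi * f0 * i / fs)
            + 0.8 * math.sin(2 * math.pi * 2 * f0 * i / fs + 0.5)
            + 0.5 * math.sin(2 * math.pi * 3 * f0 * i / fs)
            for i in range(n)]

def rough_pitch_hz(x, fs, fmin=60.0, fmax=400.0):
    """Rough pitch from the lag with the largest mean autocorrelation (assumed method)."""
    best_lag, best_val = None, -math.inf
    for lag in range(int(fs / fmax), int(fs / fmin) + 1):
        n = len(x) - lag
        val = sum(x[i] * x[i + lag] for i in range(n)) / n
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag

def bandpass(x, fs, f0, q=4.0):
    """Second-order band-pass filter whose center frequency is set to f0."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b0 * xn + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def zero_phase_bandpass(x, fs, f0, q=4.0):
    """Filter forward, then backward, so that the peaks are not shifted in time."""
    forward = bandpass(x, fs, f0, q)
    return bandpass(forward[::-1], fs, f0, q)[::-1]

def positive_peaks(y):
    """Indices of local maxima on the positive side of the waveform."""
    return [i for i in range(1, len(y) - 1)
            if y[i] > 0.0 and y[i - 1] < y[i] >= y[i + 1]]

def pitch_marks(x, fs):
    """Rough pitch -> tuned filter -> peak positions used as pitch marks."""
    f0 = rough_pitch_hz(x, fs)
    fundamental = zero_phase_bandpass(x, fs, f0)
    return f0, positive_peaks(fundamental)
```

Calling `pitch_marks(synth_voice(), 8000)` returns the rough pitch estimate together with a list of sample indices, roughly one per pitch cycle, even though the raw test waveform has several local maxima in each cycle.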
a value of voice waveform at a position represented by each pitch mark included in said pitch mark information candidates is read from said voice waveform storage; and
said read values are considered wholly, thereby to calculate a peak matching degree, so that a pitch mark candidate that takes the maximum peak matching degree is selected.
15. A method for analyzing voices according to claim 14, wherein said peak matching degree is a sum of said read values.
2. A method for analyzing voices by generating pitch mark information as time reference positions corresponding to a pitch cycle of voice waveforms comprising the steps of:
setting cut-off frequencies of plural fixed low-pass filters so that at least one of said plural fixed low-pass filters passes only a fundamental component of input voice waveforms;
outputting from each of said fixed low-pass filters waveforms of low frequency components of the inputted voice waveforms;
detecting, by using peak detecting means, plural maximum points on one side of waveforms of said low frequency components output from said fixed low-pass filters and outputting said detected plural maximum points as peak information;
selecting, by using channel selecting means, a peak detecting channel every predetermined period on basis of a specified selection reference by using the peak information output from said plural peak detecting means; and
generating a series of pitch mark information for the voice waveforms by using the selected peak information output from said selected peak detecting channel.
a peak detecting channel is selected, said selected peak detecting channel having a minimum change rate of said temporary frequency within a specified unit time.
8. A method for analyzing voices according to claim 2, wherein meaning of the selection of the peak detecting channel on a basis of the specified selection reference is that from a time interval between a specified peak and a peak adjacent to said specified peak, the time interval of which is obtained from the peak information output from each of said peak detecting means, a temporary pitch frequency is obtained, at the specified peak position and
when plural peak positions included in a specified time range and said pitch frequencies corresponding to those peak positions are represented as points on a coordinate system taking peak positions on its abscissa axis and temporary frequencies on its ordinate axis, and those points are connected in an order of peak positions, thereby to form plural lines, and the peak detecting channel is selected so that a variance of an inclination of those plural lines is minimized for said selected peak detecting channel.
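The selection reference of claim 8 can be sketched as follows. This is only one reading of the claim language: the temporary pitch frequency is taken as the reciprocal of the interval to the next peak, the population variance is used, and the function names and the dictionary of channels are hypothetical.

```python
from statistics import pvariance

def temporary_frequencies(peaks):
    """(position, temporary pitch frequency) pairs; peak positions are in seconds."""
    return [(peaks[i], 1.0 / (peaks[i + 1] - peaks[i]))
            for i in range(len(peaks) - 1)]

def slope_variance(peaks):
    """Variance of the inclinations of the lines joining consecutive points."""
    pts = temporary_frequencies(peaks)
    slopes = [(f2 - f1) / (t2 - t1)
              for (t1, f1), (t2, f2) in zip(pts, pts[1:])]
    return pvariance(slopes)

def select_channel(channels):
    """Pick the channel whose frequency contour has the smallest slope variance.

    `channels` maps a channel name to the peak positions found in the current
    time range; channels with fewer than four peaks are skipped (an assumption).
    """
    scored = {name: slope_variance(p)
              for name, p in channels.items() if len(p) >= 4}
    return min(scored, key=scored.get)
```

A channel whose filter passes only the fundamental yields evenly spaced peaks and therefore a nearly flat or smoothly changing contour, so its slope variance is small. A channel that also passes harmonics produces irregular intervals and a large variance.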
13. A method for analyzing voices according to claim 2, wherein said fixed low-pass filter takes 0 as an actual delay value for every frequency.
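One common way to obtain a low-pass filter with no delay at any frequency is a symmetric kernel centered on the output sample. The sketch below assumes a triangular kernel and a hypothetical `half_width` parameter; the claim does not say which filter design is actually used.

```python
def zero_delay_lowpass(x, half_width):
    """Smooth x with a symmetric triangular kernel centered on each sample.

    Because the kernel is symmetric about the output position, a peak in the
    input stays at the same index in the output. Near the ends of x the kernel
    is truncated, so the gain drops there.
    """
    offsets = range(-half_width, half_width + 1)
    weights = [half_width + 1 - abs(k) for k in offsets]
    total = sum(weights)
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, w in zip(offsets, weights):
            idx = n + k
            if 0 <= idx < len(x):
                acc += w * x[idx]
        y.append(acc / total)
    return y
```

Zero delay matters here because the peak positions found in the filtered waveform are used directly as time reference positions for the original voice waveform.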
16. A method for analyzing voices according to claim 2, wherein by using means for collating pitch marks plural pitch mark information candidates are generated by shifting each pitch mark forward or backward with maintaining the interval between those pitch marks at fixed, said each pitch mark being included in said series of pitch mark information which was created before once;
a value of voice waveform at a position represented by each pitch mark included in said pitch mark information candidates is read from said voice waveform storage; and
said read values are considered wholly, thereby to calculate a peak matching degree, so that a pitch mark candidate that takes the maximum peak matching degree is selected.
17. A method for analyzing voices according to claim 16, wherein said peak matching degree is a total of said read values.
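The collation described in claims 16 and 17 can be sketched as below, assuming that all pitch marks are shifted together by the same number of samples and that the peak matching degree is the plain total of the waveform values read at the shifted positions. The function name and the `max_shift` search range are hypothetical.

```python
def collate_pitch_marks(waveform, marks, max_shift):
    """Shift all marks together and keep the shift with the largest total value.

    The intervals between marks are preserved because every mark moves by the
    same amount. Candidates that would fall outside the stored waveform are
    skipped.
    """
    best_shift, best_score = 0, None
    for shift in range(-max_shift, max_shift + 1):
        candidate = [m + shift for m in marks]
        if min(candidate) < 0 or max(candidate) >= len(waveform):
            continue
        score = sum(waveform[p] for p in candidate)  # peak matching degree
        if best_score is None or score > best_score:
            best_shift, best_score = shift, score
    return [m + best_shift for m in marks], best_score
```

For example, with a waveform that has peaks at samples 10, 30 and 50 and pitch marks initially at 7, 27 and 47, a search range of 5 samples moves the marks onto the peaks.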
Specification