Method and system for analyzing voices

US 6,349,277 B1
Filed: 10/29/1999
Issued: 02/19/2002
Est. Priority Date: 04/09/1997
Status: Expired due to Term

Abstract

The aim is to assign proper pitch marks to voice waveforms, so as to obtain smoothly synthesized voices and to control voice pitch very accurately according to the pitch marks of recorded messages.

Any one of the fixed low-pass filters 3002-a to 3002-d is set so as to pass only the fundamental component of the voice. Each of the peak detectors 3003-a to 3003-d detects peaks, and the channel selector 3004 selects a channel so that peak information for the fundamental wave is extracted continuously. The channel selector 3004 judges a channel to be the correct one if the intervals between the peaks detected in that channel by the peak detectors 3003-a to 3003-d change smoothly. The pitch of the voice is analyzed from this peak information, so that the adaptive filter 3005 passes only the fundamental component of the voice and the peak detector 3006 detects the peaks of the fundamental wave, thereby assigning pitch marks to the voice waveforms.
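
The channel selection described above can be illustrated with a minimal sketch. It assumes each channel's peak positions are already available as sample indices, and it scores "smoothness" with the slope-variance criterion worded in claim 8. The function names, the channel labels, and the example peak lists are hypothetical, and the sketch handles a single selection period only.

```python
from statistics import pvariance


def temporary_pitch(peaks, fs):
    """Temporary pitch frequency (Hz) at each peak, from the interval to the next peak."""
    return [fs / (b - a) for a, b in zip(peaks, peaks[1:])]


def slope_variance(peaks, fs):
    """Variance of the slopes of the lines joining successive
    (peak position, temporary frequency) points."""
    points = list(zip(peaks, temporary_pitch(peaks, fs)))
    slopes = [(f1 - f0) / (p1 - p0)
              for (p0, f0), (p1, f1) in zip(points, points[1:])]
    return pvariance(slopes)


def select_channel(channels, fs):
    """Pick the channel whose peak intervals change most smoothly.
    Channels with fewer than 4 peaks give fewer than 2 slopes and are skipped."""
    scores = {name: slope_variance(peaks, fs)
              for name, peaks in channels.items() if len(peaks) >= 4}
    return min(scores, key=scores.get)


# Made-up peak positions (in samples) for two peak-detector channels.
channels = {
    "3003-b": [20, 100, 181, 263, 346, 430],      # slowly drifting pitch
    "3003-c": [20, 52, 100, 131, 181, 214, 263],  # harmonics leak through
}
selected = select_channel(channels, fs=8000)
print(selected)
```

In the made-up data, the second channel alternates between short and long intervals, so its slopes swing between large negative and positive values and its variance is far higher than that of the first channel.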

17 Claims

1. A method for analyzing voices by generating pitch mark information as time reference positions corresponding to a pitch cycle of voice waveforms comprising the steps of:
- temporarily storing a portion of the voice waveforms using voice waveform storing means;
  
  generating rough pitch information from said voice waveforms stored temporarily by using pitch analyzing means;
  
  inputting said voice waveforms stored temporarily to an adaptive filter and changing a cut-off frequency or a center frequency of said adaptive filter according to said rough pitch information, and passing only a fundamental component extracted from the inputted voice waveforms; and
  
  detecting plural maximum points at one side of said fundamental component using peak detecting means, and generating a series of pitch mark information for a whole portion of the voice waveforms.
- View Dependent Claims (3, 4, 5, 9, 10, 11, 12, 14, 15)
  - 3. A method for analyzing voices which assigns pitch marks to said voice waveforms according to the pitch mark information obtained by using said method as defined in claim 1 or 2.
  - 4. A method for analyzing voices which obtains a pitch frequency by using pitch mark information obtained by using said method as defined in claim 1 or 2.
  - 5. A method for analyzing voices according to claim 4, which assumes pitch mark information as temporary pitch marks and calculates a pitch frequency by using intervals of said temporary pitch marks existing just before and just after each specified unit time.
  - 9. A method for analyzing voices according to claim 1 or 2, wherein the peak detecting means detects a maximum point of an amplitude in a positive or negative direction in each portion where the amplitude of waveforms of said low frequency components or said fundamental component exceeds a threshold value which is constant or changed at every specified unit time.
  - 10. A method for analyzing voices according to claim 1 or 2, wherein the peak detecting means assumes as maximum point such a position where a value of a differential fundamental component which is differential of said fundamental component is changed from positive to negative or from negative to positive.
  - 11. A method for analyzing voices according to claim 1 or 2, wherein said peak detecting means assumes as maximum point such a zero-cross point presumed by using linear interpolation method for values before and after a point where a value of a differential fundamental component which is differential of said fundamental component is changed from positive to negative or from negative to positive.
  - 12. A method for analyzing voices according to claim 1, wherein said adaptive filter takes 0 as an actual delay value for every frequency.
  - 14. A method for analyzing voices according to claim 1, wherein by using means for collating pitch marks, plural pitch mark information candidates are generated by shifting each pitch mark forward or backward with maintaining the interval between those pitch marks at fixed, said each pitch mark being included in said series of pitch mark information which was created before once;
  - 15. A method for analyzing voices according to claim 14, wherein said peak matching degree is a sum of said read values.
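
The steps of claim 1, together with the zero-delay filter of claim 12 and the interpolated peak positions of claims 10 and 11, can be sketched as follows. This is only an illustration under stated assumptions: the patent text here does not specify the filter design, so a Hann-windowed sinc kernel applied symmetrically (hence zero delay) is swapped in, with its cut-off tied to the rough pitch. The rough pitch is passed in directly rather than estimated, and all function names, parameters, and the test signal are hypothetical.

```python
import math


def lowpass_kernel(cutoff_hz, fs, half_len):
    """Hann-windowed sinc low-pass kernel, symmetric about its centre tap."""
    taps = []
    for k in range(-half_len, half_len + 1):
        x = 2.0 * cutoff_hz * k / fs
        sinc = 1.0 if k == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.5 * (1.0 + math.cos(math.pi * k / half_len))
        taps.append(sinc * window)
    total = sum(taps)
    return [t / total for t in taps]  # unity gain at 0 Hz


def extract_fundamental(wave, rough_f0, fs, periods=2, cutoff_ratio=1.5):
    """Low-pass the stored waveform with a cut-off derived from the rough pitch.
    The kernel is centred on each output sample, so no delay is introduced.
    Only samples where the kernel fits entirely inside the waveform are returned."""
    half_len = int(round(periods * fs / rough_f0))
    kernel = lowpass_kernel(cutoff_ratio * rough_f0, fs, half_len)
    out = [sum(h * wave[n - half_len + i] for i, h in enumerate(kernel))
           for n in range(half_len, len(wave) - half_len)]
    return half_len, out  # offset of out[0] in the waveform, filtered samples


def pitch_marks(fundamental, offset=0, threshold=0.0):
    """Positive peaks: places where the first difference goes from + to -,
    refined by linear interpolation of its zero crossing."""
    diff = [b - a for a, b in zip(fundamental, fundamental[1:])]
    marks = []
    for i in range(len(diff) - 1):
        d0, d1 = diff[i], diff[i + 1]
        if d0 > 0.0 >= d1 and fundamental[i + 1] > threshold:
            # diff[i] approximates the slope at i + 0.5
            marks.append(offset + i + 0.5 + d0 / (d0 - d1))
    return marks


# Synthetic test signal: 100 Hz fundamental plus a strong third harmonic.
fs = 8000
wave = [math.sin(2 * math.pi * 100 * n / fs)
        + 0.6 * math.sin(2 * math.pi * 300 * n / fs) for n in range(800)]
offset, fundamental = extract_fundamental(wave, rough_f0=105.0, fs=fs)
marks = pitch_marks(fundamental, offset)
print([round(m, 2) for m in marks])
```

With this test signal the raw waveform has two maxima per pitch cycle, while the filtered fundamental yields one mark per cycle, spaced one period (80 samples) apart even though the rough pitch is 5 Hz off.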

2. A method for analyzing voices by generating pitch mark information as time reference positions corresponding to a pitch cycle of voice waveforms comprising the steps of:
- setting cut-off frequencies of plural fixed low-pass filters so that at least one of said plural fixed low-pass filters passes only a fundamental component of input voice waveforms;
  
  outputting from each of said fixed low-pass filters waveforms of low frequency components of the inputted voice waveforms;
  
  detecting, by using peak detecting means, plural maximum points on one side of waveforms of said low frequency components output from said fixed low-pass filters and outputting said detected plural maximum points as peak information;
  
  selecting, by using channel selecting means, a peak detecting channel every predetermined period on basis of a specified selection reference by using the peak information output from said plural peak detecting means; and
  
  generating a series of pitch mark information for the voice waveforms by using the selected peak information output from said selected peak detecting channel.
- View Dependent Claims (6, 7, 8, 13, 16, 17)
  - 6. A method for analyzing voices according to claim 2, wherein cut-off frequencies of said plural fixed low-pass filters take a relationship of 1:2 to each other.
  - 7. A method for analyzing voices according to claim 2, wherein meaning of the selection of the peak detecting channel on a basis of the specified selection reference is that from a time interval between a specified peak and a peak adjacent to said specified peak, the time interval of which is obtained from the peak information output from each of said peak detecting means, a temporary pitch frequency is obtained, at the specified peak position and
  - 8. A method for analyzing voices according to claim 2, wherein meaning of the selection of the peak detecting channel on a basis of the specified selection reference is that from a time interval between a specified peak and a peak adjacent to said specified peak, the time interval of which is obtained from the peak information output from each of said peak detecting means, a temporary pitch frequency is obtained, at the specified peak position and when plural peak positions included in a specified time range and said pitch frequencies corresponding to those peak positions are represented as points on a coordinate system taking peak positions on its abscissa axis and temporary frequencies on its ordinate axis, and those points are connected in an order of peak positions, thereby to form plural lines, and the peak detecting channel is selected so that a variance of an inclination of those plural lines is minimized for said selected peak detecting channel.
  - 13. A method for analyzing voices according to claim 2, wherein said fixed low-pass filter takes 0 as an actual delay value for every frequency.
  - 16. A method for analyzing voices according to claim 2, wherein by using means for collating pitch marks plural pitch mark information candidates are generated by shifting each pitch mark forward or backward with maintaining the interval between those pitch marks at fixed, said each pitch mark being included in said series of pitch mark information which was created before once;
    
    a value of voice waveform at a position represented by each pitch mark included in said pitch mark information candidates is read from said voice waveform storage; and
    
    said read values are considered wholly, thereby to calculate a peak matching degree, so that a pitch mark candidate that takes the maximum peak matching degree is selected.
  - 17. A method for analyzing voices according to claim 16, wherein said peak matching degree is a total of said read values.
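
The pitch mark collation of claims 16 and 17 can be sketched as below. The sketch assumes marks are integer sample indices and that candidates are generated by one common shift within a fixed search range, which keeps the intervals between marks unchanged. The function name, the search range, and the toy waveform are hypothetical.

```python
def collate_pitch_marks(wave, marks, max_shift):
    """Shift all marks together (spacing unchanged) and keep the candidate whose
    marks read the largest total waveform value, the 'peak matching degree'."""
    best_shift, best_score = 0, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        candidate = [m + shift for m in marks]
        if candidate[0] < 0 or candidate[-1] >= len(wave):
            continue  # candidate falls outside the stored waveform
        score = sum(wave[m] for m in candidate)  # total of the read values
        if score > best_score:
            best_shift, best_score = shift, score
    return [m + best_shift for m in marks], best_score


# Toy stored waveform: sharp pulses 80 samples apart, starting at sample 25.
wave = [0.0] * 300
for p in (25, 105, 185):
    wave[p - 1], wave[p], wave[p + 1] = 0.3, 1.0, 0.5

# Marks taken from a fundamental component, sitting 5 samples before the pulses.
aligned, degree = collate_pitch_marks(wave, [20, 100, 180], max_shift=10)
print(aligned, degree)
```

In this toy case the marks derived from the fundamental sit slightly off the pulses of the stored waveform, and the shift with the highest summed value moves every mark onto a pulse while preserving the 80-sample spacing.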

Current Assignee: Panasonic Intellectual Property Corporation of America (Panasonic Holdings Corporation)
Original Assignee: Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Inventors: Kamai, Takahiro; Matsui, Kenji
Primary Examiner(s): Tsang, Fan
Assistant Examiner(s): Opsasnick, Michael N.
Application Number: US09/429,962
Time in Patent Office: 844 Days
Field of Search: 704/207, 704/205
US Class Current: 704/207
CPC Class Codes:
G10L 13/04 Details of speech synthesis...
G10L 25/90 Pitch determination of spee...
