Method and apparatus for extracting pitch information from audio signal using morphology
Abstract
The accuracy of extracting pitch information from an audio signal, including voice and sound signals, is improved by using a morphological operation. Specifically, an input audio signal is converted to an audio signal in the frequency domain, an optimum structuring set size (SSS) is determined, and a morphological operation is performed using the determined SSS. The highest peak of the signal obtained through a predetermined fold and summation process is then extracted as pitch information, which can be used by any subsequent audio system that performs voice coding, recognition, synthesis, and/or robustness processing.
13 Citations
20 Claims
1. A method of extracting pitch information from an audio signal, comprising the steps of:
(a) when the audio signal is input, converting, by a frequency domain converter, the audio signal to a frequency domain;
(b) determining an optimum window size for extracting a pitch from the converted audio signal;
(c) calculating a maximum value and a minimum value of the converted audio signal in optimum window using the determined optimum window size;
(d) checking a variation between the maximum value and the minimum value and generating a staircase signal that has the minimum value in the variation and is used for filtering;
(e) generating a residual signal by extracting the generated staircase signal from the converted audio signal;
(f) generating pitch information by selecting a highest peak generated by performing a predetermined fold and summation process for folding and summing the residual signal; and
(g) extracting the pitch information from the residual signal corresponding to the extraction result,
wherein the staircase signal includes a plurality of flat signals continuously connected, each flat signal having a constant amplitude in a corresponding optimum window for a morphological operation.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
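Steps (c) to (e) of claim 1 can be illustrated with a minimal sketch. It assumes one possible reading of the claim language: the magnitude spectrum is split into consecutive, non-overlapping windows, each flat segment of the staircase takes the minimum of its window, and the residual is the spectrum minus the staircase. The function names, the non-overlapping windowing, and the sample values are illustrative assumptions and are not taken from the patent.

```python
def window_extrema(spectrum, window_size):
    """Return (minimum, maximum) for each consecutive window of the spectrum."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    extrema = []
    for start in range(0, len(spectrum), window_size):
        window = spectrum[start:start + window_size]
        extrema.append((min(window), max(window)))
    return extrema


def staircase_signal(spectrum, window_size):
    """Build a piecewise-constant signal: each window is filled with its minimum.

    Assumption: the flat segments of the claim correspond one-to-one to
    non-overlapping windows of the chosen size.
    """
    stairs = []
    for index, (low, _high) in enumerate(window_extrema(spectrum, window_size)):
        start = index * window_size
        length = min(window_size, len(spectrum) - start)
        stairs.extend([low] * length)
    return stairs


def residual_signal(spectrum, window_size):
    """Subtract the staircase from the spectrum, leaving the peaks above the floor."""
    stairs = staircase_signal(spectrum, window_size)
    return [value - floor for value, floor in zip(spectrum, stairs)]


# Hypothetical magnitude spectrum with eight bins and a window of four bins.
spectrum = [3, 5, 4, 6, 9, 2, 6, 7]
stairs = staircase_signal(spectrum, 4)     # [3, 3, 3, 3, 2, 2, 2, 2]
residual = residual_signal(spectrum, 4)    # [0, 2, 1, 3, 7, 0, 4, 5]
```

Under this reading, the window size controls how closely the staircase follows the spectral floor: a smaller window tracks the floor more tightly but removes more of each peak.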
11. An apparatus for extracting pitch information from an audio signal, comprising:
a frequency domain converter for converting an input audio signal in a time domain to an audio signal in a frequency domain;
a determiner for determining an optimum window size for extracting a pitch from the converted audio signal;
a calculator for calculating a maximum value and a minimum value of the converted audio signal in an optimum window using the determined optimum window size;
a filter for checking a variation between the maximum value and the minimum value, generating a staircase signal that has the minimum value in the variation, and extracting the generated staircase signal from the converted audio signal; and
an extractor for extracting pitch information from a residual signal corresponding to the extraction result,
wherein the staircase signal includes a plurality of flat signals continuously connected, each flat signal having a constant amplitude in a corresponding optimum window for a morphological operation, the residual signal is a signal obtained by removing the staircase signal from the converted audio signal and the pitch information is a highest peak generated by performing a predetermined fold and summation process for folding and summing the residual signal.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
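The claims refer to a "predetermined fold and summation process" without defining it in this section. The sketch below substitutes a well-known stand-in, harmonic summation: the score for each candidate bin is the sum of the residual at that bin and at its integer multiples, so that harmonics are folded onto their fundamental. This is an assumption about what such a process could look like and not the patent's own definition; the function names and the `num_folds` parameter are hypothetical.

```python
def fold_and_sum(residual, num_folds=3):
    """Harmonic summation used as a stand-in for the fold and summation step.

    scores[k] = residual[k] + residual[2k] + ... for up to num_folds terms,
    skipping multiples that fall outside the spectrum.
    """
    n = len(residual)
    scores = [0] * n
    for k in range(1, n):                  # bin 0 (DC) is not a pitch candidate
        for harmonic in range(1, num_folds + 1):
            index = harmonic * k
            if index >= n:
                break
            scores[k] += residual[index]
    return scores


def highest_peak_bin(scores):
    """Return the index of the largest score (first one wins on ties)."""
    return max(range(len(scores)), key=lambda k: scores[k])


# Hypothetical residual with peaks at bins 4, 8, and 12 (harmonics of bin 4).
residual = [0] * 16
residual[4], residual[8], residual[12] = 5, 4, 3

scores = fold_and_sum(residual)
pitch_bin = highest_peak_bin(scores)       # bin 4, with a score of 5 + 4 + 3 = 12
```

To turn the selected bin into a frequency, multiply it by the spacing between bins of the frequency domain converter, that is, the sample rate divided by the transform length.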
Specification