Method and system for segmenting phonemes from voice signals
Abstract
A method and a system for segmenting phonemes from voice signals are provided. To segment phonemes accurately, the method forms a histogram showing the distribution of peaks of a given order, using the concept of higher-order peaks, and determines the boundaries marking the starting point and the ending point of each phoneme by calculating peak statistics based on the histogram. The phoneme segmentation method can remarkably reduce the amount of calculation, and it can be applied to sound signal systems that perform sound coding, sound recognition, sound synthesis, sound reinforcement, etc.
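The higher-order peak concept mentioned above can be illustrated with a minimal sketch. It assumes that a first order peak is a local maximum of the time-domain samples and that a peak of order n is a local maximum within the sequence of peaks of order n-1. The function names `local_maxima` and `higher_order_peaks` are hypothetical and do not come from the patent text.

```python
from typing import List, Sequence


def local_maxima(values: Sequence[float]) -> List[int]:
    """Return positions i where values[i-1] < values[i] >= values[i+1]."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] >= values[i + 1]
    ]


def higher_order_peaks(signal: Sequence[float], order: int) -> List[int]:
    """Return the sample indices of the peaks of the given order.

    Assumption: order 1 is the set of local maxima of the signal, and each
    further order keeps only the local maxima among the previous order's peaks.
    """
    if order < 1:
        raise ValueError("order must be >= 1")
    indices = list(range(len(signal)))
    for _ in range(order):
        heights = [signal[i] for i in indices]
        indices = [indices[j] for j in local_maxima(heights)]
    return indices


if __name__ == "__main__":
    samples = [0, 1, 0, 3, 0, 2, 0, 5, 0, 1, 0]
    print(higher_order_peaks(samples, 1))  # first order peak positions
    print(higher_order_peaks(samples, 2))  # second order peak positions
```

Each pass only compares neighboring values in the time domain, which is consistent with the abstract's point about a reduced amount of calculation.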
4 Claims
1. A method for segmenting phonemes from voice signals, comprising:
- extracting peak information from input voice signals, the peak information including first peak information corresponding to a plurality of first order peaks of the input voice signals and second peak information corresponding to a plurality of second order peaks for the plurality of first order peaks;
- determining a length of a frame for calculating peak statistics;
- forming a histogram showing a density distribution of the second order peaks with respect to the determined frame length;
- calculating the peak statistics using the histogram;
- determining two neighboring maxima of the histogram using the calculated peak statistics per each frame; and
- determining a valley between the two neighboring maxima as a boundary between phonemes to perform a phoneme segmentation;
wherein the method further comprises:
- extracting the peak information from voice signals on a time domain;
- defining a peak order with respect to the extracted peak information;
- comparing a peak measurement value of the defined peak order with a predetermined critical peak measurement value; and
- determining a present peak order as a final peak order, which is used to extract the second peak information, when the peak measurement value is greater than the critical peak measurement value.
Dependent claim: 2
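The histogram and valley steps of claim 1 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the per-frame count of second order peaks stands in for the "peak statistics", and the lowest bin between two neighboring histogram maxima is taken as the valley. The names `peak_density_histogram` and `valley_boundaries` are hypothetical.

```python
from typing import List, Sequence


def peak_density_histogram(
    peak_indices: Sequence[int], num_samples: int, frame_length: int
) -> List[int]:
    """Count how many peaks fall into each frame of frame_length samples."""
    if frame_length <= 0:
        raise ValueError("frame_length must be positive")
    num_frames = -(-num_samples // frame_length)  # ceiling division
    counts = [0] * num_frames
    for idx in peak_indices:
        counts[idx // frame_length] += 1
    return counts


def valley_boundaries(counts: Sequence[int]) -> List[int]:
    """Return the frame index of the lowest bin between neighboring maxima.

    Assumption: a maximum is a bin higher than its left neighbor and not
    lower than its right neighbor; the histogram is padded with -1 so that
    the first and last frames can also be maxima.
    """
    padded = [-1] + list(counts) + [-1]
    maxima = [
        i
        for i in range(len(counts))
        if padded[i] < padded[i + 1] >= padded[i + 2]
    ]
    boundaries = []
    for left, right in zip(maxima, maxima[1:]):
        # Two maxima under this definition are never adjacent, so the
        # range between them always contains at least one frame.
        boundaries.append(min(range(left + 1, right), key=lambda i: counts[i]))
    return boundaries


if __name__ == "__main__":
    histogram = [1, 4, 2, 0, 3, 5, 1]
    print(valley_boundaries(histogram))
```

Multiplying a returned frame index by the frame length gives an approximate sample position for the phoneme boundary.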
3. A system for segmenting phonemes from voice signals, comprising:
- a peak information extractor for extracting peak information from input voice signals, the peak information including first peak information corresponding to a plurality of first order peaks of the input voice signals and second peak information corresponding to a plurality of second order peaks from among the plurality of first order peaks;
- a peak statistic calculator for determining a length of a frame for calculating peak statistics and calculating the peak statistics using a histogram;
- a boundary determination unit for determining two neighboring maxima of the histogram using the calculated peak statistics per each frame, and determining a valley between the two neighboring maxima as a boundary between the phonemes in order to segment the phonemes;
- a frame length determination unit for determining a length of a frame to calculate the peak statistics; and
- a histogram forming unit for forming the histogram showing a density distribution of the second order peaks with respect to the determined frame length;
wherein the system further comprises:
- a peak order determination unit for extracting peak information on voice signals on a time domain, defining a peak order with respect to the extracted peak information, comparing a peak measurement value of the defined peak order with a predetermined critical peak measurement value, and determining a present peak order as a final peak order, which is used to extract the second peak information, when the peak measurement value is greater than the critical peak measurement value.
Dependent claim: 4
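The behavior of the peak order determination unit can be sketched as a loop that raises the peak order until a measurement exceeds the critical value. The claims do not say what the "peak measurement value" is, so this sketch assumes the mean spacing in samples between consecutive peaks of the current order. That choice, the `max_order` fallback, and the names `mean_spacing` and `choose_peak_order` are all hypothetical.

```python
from typing import List, Sequence, Tuple


def local_maxima(values: Sequence[float]) -> List[int]:
    """Return positions i where values[i-1] < values[i] >= values[i+1]."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] >= values[i + 1]
    ]


def mean_spacing(indices: Sequence[int]) -> float:
    """Hypothetical peak measurement: average gap between consecutive peaks."""
    if len(indices) < 2:
        return 0.0
    return (indices[-1] - indices[0]) / (len(indices) - 1)


def choose_peak_order(
    signal: Sequence[float], critical_value: float, max_order: int = 4
) -> Tuple[int, List[int]]:
    """Return (final_order, peak_indices) for the first order whose
    measurement is greater than critical_value.

    Assumption: if no order up to max_order passes the test, max_order and
    its peaks are returned as a fallback.
    """
    indices = list(range(len(signal)))
    for order in range(1, max_order + 1):
        heights = [signal[i] for i in indices]
        indices = [indices[j] for j in local_maxima(heights)]
        if mean_spacing(indices) > critical_value:
            return order, indices
    return max_order, indices


if __name__ == "__main__":
    samples = [0, 1, 0, 3, 0, 2, 0, 5, 0, 1, 0]
    print(choose_peak_order(samples, critical_value=3.0))
```

In this sketch a larger critical value pushes the selection toward higher orders, which leaves fewer, more widely spaced peaks for the histogram stage.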
Specification