Voiced/unvoiced decision based on frequency band ratio
First Claim
1. A method for processing an audio signal, comprising the steps of:
generating frequency domain data by dividing an input audio signal on a block-by-block basis thereby determining blocks of data, and performing time domain to frequency domain conversion on each of the blocks thereby generating the frequency domain data;
dividing the frequency domain data for at least one of the blocks into plural bands;
deciding, for each of the bands for one of the blocks, whether said each of the bands is voiced or unvoiced;
if at least one of the bands for said one of the blocks is voiced, identifying as a highest frequency voiced band a voiced band whose center frequency is F, where F is the highest center frequency among said at least one of the bands for said one of the blocks which are voiced; and
generating boundary point data indicative of a boundary point between a voiced sound region and an unvoiced sound region of said one of the blocks in accordance with the number BVH of bands of the frequency domain data for said one of the blocks which have center frequency less than the center frequency F.
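The last two steps of the claim can be illustrated with a minimal sketch. It assumes the per-band V/UV decisions and the band center frequencies are already available for one block; the function name `highest_voiced_band` and its arguments are hypothetical and do not appear in the patent.

```python
def highest_voiced_band(centers_hz, voiced):
    """Return (F, bvh) for one block, or None if no band is voiced.

    centers_hz: center frequency of each band (hypothetical input)
    voiced:     per-band V/UV decision, True meaning voiced
    """
    voiced_centers = [c for c, v in zip(centers_hz, voiced) if v]
    if not voiced_centers:
        # The claim only applies when at least one band is voiced.
        return None
    # F is the highest center frequency among the voiced bands.
    f = max(voiced_centers)
    # BVH, as worded in the claim: number of bands whose center
    # frequency is less than F.
    bvh = sum(1 for c in centers_hz if c < f)
    return f, bvh


# Six illustrative bands; the highest voiced band is the one at 700 Hz.
centers = [100, 300, 500, 700, 900, 1100]
flags = [True, True, False, True, False, False]
print(highest_voiced_band(centers, flags))
```

How the boundary point data is then derived from BVH is not specified in claim 1 beyond "in accordance with", so the sketch stops at computing F and BVH.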
Abstract
An input audio signal is divided on a block-by-block basis, and time domain to frequency domain conversion is performed on each of the blocks. If it is decided that there are one or more shift points in the voiced (V)/unvoiced (UV) decision data of all the bands, the voiced bands of the frequency domain data for one of the blocks are searched for the voiced band BVH with the highest center frequency. The number NV of voiced bands having a center frequency less than that of the band BVH is then found, and it is decided whether the proportion of voiced bands is equal to or higher than a predetermined threshold Nth, thereby deciding one V/UV boundary point. The V/UV decision data for each band can thus be replaced by information on a single demarcation across all bands, reducing data volume and bit rate.
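The decision described in the abstract might be sketched as follows. Several points are assumptions rather than statements from the patent: the proportion is taken as NV divided by the number of bands below BVH, a block with no shift point is treated as all voiced or all unvoiced, and the case where the proportion falls below Nth is left undecided (returned as `None`) because the abstract does not say what happens then. The name `vuv_boundary` and the parameter `n_th` are hypothetical.

```python
def vuv_boundary(voiced, n_th):
    """Return the number of low bands treated as voiced, or None.

    voiced: per-band V/UV flags, ordered by ascending center frequency
    n_th:   threshold Nth on the proportion of voiced bands (assumed 0..1)
    """
    # A shift point is a V->UV or UV->V change between adjacent bands.
    shifts = sum(1 for a, b in zip(voiced, voiced[1:]) if a != b)
    if shifts == 0:
        # Assumption: no shift point means all bands share one decision.
        return len(voiced) if voiced and voiced[0] else 0
    # Index of the voiced band with the highest center frequency (BVH).
    bvh = max(i for i, v in enumerate(voiced) if v)
    if bvh == 0:
        # Only the lowest band is voiced; already a single demarcation.
        return 1
    # NV: voiced bands with center frequency below that of band BVH.
    nv = sum(voiced[:bvh])
    if nv / bvh >= n_th:
        # Assumption: bands up to and including BVH become voiced.
        return bvh + 1
    return None  # below threshold: not specified in the abstract


flags = [True, True, False, True, False, False]
print(vuv_boundary(flags, 0.6))  # NV = 2, BVH index = 3, proportion 2/3
```

Returning a single band count in place of one flag per band is what allows the per-band V/UV data to be replaced by one demarcation, which is the data reduction the abstract refers to.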
4 Claims
Specification