Voice analysis-synthesis method using noise having diffusion which varies with frequency band to modify predicted phases of transmitted pitch data blocks
Abstract
A high-efficiency encoding method for encoding data on the frequency axis, the data being obtained by dividing an input audio signal on a block-by-block basis and converting each block onto the frequency axis. If the voiced (V)/unvoiced (UV) decision data for all bands on the frequency axis contain one or more shift points, the V bands are searched for the band BVH with the highest center frequency, and the number NV of V bands up to the band BVH is found. A single V/UV boundary point is then decided according to whether the proportion of V bands is equal to or higher than a predetermined threshold Nth. The per-band V/UV decision data can thus be replaced by information on one demarcation point across all bands, which reduces the data volume and the bit rate. In addition, using two-stage hierarchical vector quantization to quantize the data on the frequency axis reduces both the computation needed for the codebook search and the memory capacity needed for the codebook.
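The boundary decision described in the abstract can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the patented procedure: the function name `single_vuv_boundary`, the default threshold value, and above all the fallback used when the proportion of V bands is below the threshold are hypothetical, since the abstract does not say what happens in that case.

```python
def single_vuv_boundary(flags, n_th=0.8):
    """Collapse per-band V/UV flags into one boundary index.

    flags: list of bools ordered from the lowest band up, True = voiced (V).
    Returns how many low bands to treat as voiced; every band at or above
    that index is treated as unvoiced (UV).
    """
    # A shift point is a position where neighbouring bands disagree.
    shifts = sum(1 for a, b in zip(flags, flags[1:]) if a != b)
    if shifts == 0:
        # All bands agree, so there is nothing to search.
        return len(flags) if flags and flags[0] else 0

    # BVH: the voiced band with the highest centre frequency.
    b_vh = max(i for i, v in enumerate(flags) if v)
    # NV: number of voiced bands up to and including BVH.
    n_v = sum(flags[: b_vh + 1])

    if n_v / (b_vh + 1) >= n_th:
        # Enough of the lower bands are voiced: place the boundary above BVH.
        return b_vh + 1

    # ASSUMPTION (not in the abstract): otherwise keep only the leading run
    # of voiced bands, i.e. put the boundary at the first unvoiced band.
    return next((i for i, v in enumerate(flags) if not v), len(flags))
```

With a threshold of 0.7, the flags `V V UV V V UV UV UV` give a boundary of 5, so the isolated UV band below BVH is absorbed into the voiced region and only one number needs to be transmitted instead of eight flags.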
117 Citations
3 Claims
1. A voice analysis-synthesis method, comprising the steps of:
dividing an input voice signal on a block-by-block basis and extracting pitch data from each block;
converting the voice signal, on the block-by-block basis, into frequency-domain data;
dividing the frequency-domain data for each of the blocks into plural bands of data on the basis of the pitch data, each of said bands corresponding to a different range of frequencies;
finding power information for each of the bands of said each of the blocks and voiced/unvoiced decision information for said each of the bands of said each of the blocks;
transmitting the pitch data, the power information for said each of the bands of said each of the blocks, and the voiced/unvoiced decision information for said each of the bands of said each of the blocks;
receiving the pitch data, the power information, and the voiced/unvoiced decision information, and predicting a block terminal edge phase for each block of the received pitch data on the basis of said each block of the received pitch data and a block initial phase for said each block of the received pitch data; and
modifying the predicted block terminal edge phase, using noise having diffusion which varies from band to band for each of the bands.
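The last two steps of claim 1, predicting the phase at the end of a block and then perturbing it with band-dependent noise, might look roughly like the Python sketch below. The claim fixes neither the prediction formula nor the noise distribution, so everything concrete here is an assumption: the fundamental frequency is taken to change linearly across the block, the noise is taken to be zero-mean Gaussian, and its spread is taken to grow linearly with the band index. The names `predict_terminal_phase`, `modify_phase`, and `max_spread` are hypothetical.

```python
import math
import random


def predict_terminal_phase(phi_start, w_start, w_end, block_len, harmonic):
    """Predict the end-of-block phase of one harmonic.

    ASSUMPTION: the fundamental angular frequency (rad/sample) moves linearly
    from w_start to w_end over block_len samples, so the phase advance is the
    harmonic number times the mean frequency times the block length.
    """
    return phi_start + harmonic * 0.5 * (w_start + w_end) * block_len


def modify_phase(phi_pred, band, n_bands, max_spread=math.pi / 4, rng=random):
    """Add zero-mean noise whose spread depends on the band index.

    ASSUMPTION: Gaussian noise whose standard deviation rises linearly from
    0 in the lowest band to max_spread in the highest band. The claim only
    requires that the diffusion vary from band to band.
    """
    sigma = max_spread * band / max(n_bands - 1, 1)
    return phi_pred + rng.gauss(0.0, sigma)
```

Passing a seeded `random.Random` instance as `rng` makes the perturbation reproducible, which helps when comparing synthesized output across runs.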
3. A pitch extraction method for processing an input audio signal comprising frames, each of the frames corresponding to a different time along a time axis, said method comprising the steps of:
detecting plural peaks from auto-correlation data of a current frame, where the current frame is one of said frames; and
detecting a pitch of the current frame by determining a position of a maximum peak among the detected plural peaks of the current frame when the maximum peak is equal to or larger than a predetermined threshold, and deciding the pitch of the current frame by determining a position of a peak in a pitch range having a predetermined relation with a pitch found in one of the frames other than said current frame when the maximum peak is smaller than the predetermined threshold.
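The two-branch decision in claim 3 can be sketched in Python as follows. This is a minimal illustration, assuming a normalized autocorrelation sequence indexed by lag. The threshold of 0.5, the interpretation of the "pitch range having a predetermined relation" as a window of plus or minus 20 percent around the reference pitch, and the choice of the largest peak inside that window are all assumptions, and the names `find_peaks`, `decide_pitch`, `ref_pitch`, and `rel_range` are hypothetical.

```python
def find_peaks(r):
    """Return (lag, value) pairs for the local maxima of autocorrelation r."""
    return [
        (k, r[k])
        for k in range(1, len(r) - 1)
        if r[k - 1] < r[k] >= r[k + 1]
    ]


def decide_pitch(r, ref_pitch, threshold=0.5, rel_range=0.2):
    """Pick a pitch lag for the current frame, or None if undecided.

    r: autocorrelation of the current frame, indexed by lag.
    ref_pitch: pitch lag found in another frame, or None if unavailable.
    """
    peaks = find_peaks(r)
    if not peaks:
        return None

    lag, value = max(peaks, key=lambda p: p[1])
    if value >= threshold:
        # Strong maximum peak: its position is the pitch.
        return lag

    if ref_pitch is None:
        return None  # ASSUMPTION: no decision without a reference frame.

    # ASSUMPTION: the allowed range is +/- rel_range around the reference
    # pitch, and the largest peak inside that range is chosen.
    lo, hi = ref_pitch * (1 - rel_range), ref_pitch * (1 + rel_range)
    near = [p for p in peaks if lo <= p[0] <= hi]
    return max(near, key=lambda p: p[1])[0] if near else None
```

Restricting the search to a window around a neighbouring frame's pitch keeps a weakly periodic frame from jumping to an unrelated lag.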
Specification