Speech encoding method, apparatus and program
First Claim
1. A background noise/speech classification method comprising the steps of:
calculating power information and spectral information of an input signal as feature amounts; and
comparing the calculated feature amounts with estimated feature amounts constituted by estimated power information and estimated spectral information in a background noise period, thereby deciding whether the input signal belongs to speech or background noise.
Abstract
In a background noise/speech classification method, a background noise/speech decision section decides whether a digital input signal received through an input terminal is background noise or speech. The decision is based on the calculated frame power and calculated LSP coefficient, which a feature amount calculation section obtains from the input signal, and on the estimated frame power and estimated LSP coefficient obtained by an estimated feature amount update section. The estimated feature amount update section then updates the estimated frame power and the estimated LSP coefficient by using the frame power and the LSP coefficient obtained by the feature amount calculation section, in preparation for the next frame.
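As a rough illustration of the decision and update loop described in the abstract, the following Python sketch compares each frame's power and a spectral vector against running background-noise estimates. It is not the patented implementation: the class name `NoiseSpeechClassifier`, the thresholds `power_margin_db` and `spectral_threshold`, the smoothing factor `alpha`, the bootstrap from the first frame, and the choice to update the estimates only on frames judged to be noise are all assumptions made for this example. The `spectrum` argument stands in for the LSP coefficients, which the caller is assumed to compute elsewhere.

```python
import math


class NoiseSpeechClassifier:
    """Toy frame classifier; all thresholds are assumed values, not from the patent."""

    def __init__(self, power_margin_db=6.0, spectral_threshold=0.05, alpha=0.9):
        self.power_margin_db = power_margin_db        # assumed power fluctuation limit
        self.spectral_threshold = spectral_threshold  # assumed spectral fluctuation limit
        self.alpha = alpha                            # assumed smoothing factor for updates
        self.est_power_db = None                      # estimated background-noise power
        self.est_spectrum = None                      # estimated background-noise spectrum

    @staticmethod
    def frame_power_db(frame):
        """Mean-square power of one frame in dB (small floor avoids log of zero)."""
        mean_square = sum(x * x for x in frame) / len(frame)
        return 10.0 * math.log10(mean_square + 1e-12)

    def classify(self, frame, spectrum):
        """Return "noise" or "speech" and update the estimates for the next frame."""
        power_db = self.frame_power_db(frame)
        if self.est_power_db is None:
            # Assumption: the first frame is treated as background noise.
            self.est_power_db = power_db
            self.est_spectrum = list(spectrum)
            return "noise"

        # Fluctuation of the calculated features relative to the estimates.
        power_delta = power_db - self.est_power_db
        spectral_delta = sum(
            (a - b) ** 2 for a, b in zip(spectrum, self.est_spectrum)
        ) / len(spectrum)

        is_noise = (power_delta < self.power_margin_db
                    and spectral_delta < self.spectral_threshold)
        if is_noise:
            # Assumption: estimates track the input only during noise frames.
            a = self.alpha
            self.est_power_db = a * self.est_power_db + (1.0 - a) * power_db
            self.est_spectrum = [
                a * e + (1.0 - a) * s for e, s in zip(self.est_spectrum, spectrum)
            ]
        return "noise" if is_noise else "speech"
```

A caller would create one classifier per stream and call `classify` once per frame, in order, so that the estimates prepared on one frame are the ones used for the next.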
25 Citations
25 Claims
1. A background noise/speech classification method comprising the steps of:
calculating power information and spectral information of an input signal as feature amounts; and
comparing the calculated feature amounts with estimated feature amounts constituted by estimated power information and estimated spectral information in a background noise period, thereby deciding whether the input signal belongs to speech or background noise.
Dependent claims: 2, 3
4. A background noise/speech classification method comprising the steps of:
calculating power information and spectral information of an input signal as feature amounts;
comparing the calculated feature amounts with estimated feature amounts constituted by estimated power information and estimated spectral information in a background noise period, thereby analyzing power and spectral fluctuation amounts; and
when a result obtained by analyzing the power and spectral fluctuation amounts indicates background noise, deciding that the input signal belongs to background noise, and otherwise, deciding that the input signal belongs to speech.
Dependent claims: 5, 6, 7, 8
9. A voiced/unvoiced classification method comprising the steps of:
preparing a voiced appearance probability table and an unvoiced appearance probability table in which voiced and unvoiced appearance probabilities are respectively written in correspondence with speech feature amounts;
obtaining voiced and unvoiced probabilities by referring to said voiced appearance probability table and said unvoiced appearance probability table by using a feature amount calculated from input speech as a key; and
deciding, on the basis of the voiced and unvoiced probabilities, whether the input speech is voiced or unvoiced.
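The table lookup in claim 9 can be pictured with a small Python sketch. Everything concrete here is hypothetical: the four-entry tables `VOICED_TABLE` and `UNVOICED_TABLE`, their probability values, the uniform quantizer `quantize`, and the use of a single scalar feature in the range 0 to 1 (for example a normalized correlation value) are assumptions chosen only to show the mechanism of using a feature amount as a key.

```python
# Hypothetical appearance probability tables, indexed by quantized feature value.
VOICED_TABLE = [0.05, 0.20, 0.60, 0.90]
UNVOICED_TABLE = [0.90, 0.70, 0.30, 0.05]


def quantize(value, lo, hi, bins):
    """Map a feature value in [lo, hi] to a table index in 0..bins-1 (clamped)."""
    if value <= lo:
        return 0
    if value >= hi:
        return bins - 1
    return int((value - lo) / (hi - lo) * bins)


def classify_voicing(feature, lo=0.0, hi=1.0):
    """Look up both probabilities with the feature as key and compare them."""
    key = quantize(feature, lo, hi, len(VOICED_TABLE))
    p_voiced = VOICED_TABLE[key]
    p_unvoiced = UNVOICED_TABLE[key]
    # Assumption: ties are resolved in favor of "voiced".
    return "voiced" if p_voiced >= p_unvoiced else "unvoiced"
```

With these assumed tables, `classify_voicing(0.9)` returns `"voiced"` and `classify_voicing(0.1)` returns `"unvoiced"`.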
10. A background noise decoding method comprising the steps of:
extracting a decoded excitation signal parameter, a gain decoded parameter, and a decoded synthesis filter parameter from decoded parameters obtained by decoding encoded data;
decoding an excitation signal and a gain from the decoded excitation signal parameter and the gain decoded parameter;
smoothing the gain such that the gain changes smoothly; and
generating a synthesized signal by using a signal obtained by multiplying the excitation signal by the smoothed gain and synthesis filter characteristic information based on the decoded synthesis filter parameter.
Dependent claims: 11
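The gain smoothing and synthesis steps of claim 10 might look like the following Python sketch. The claim only requires that the gain change smoothly; the first-order recursive smoother, its factor `beta`, the all-pole filter form, and the function names `smooth_gain` and `synthesize` are assumptions made for illustration.

```python
def smooth_gain(prev_gain, decoded_gain, beta=0.8):
    """First-order recursive smoothing (assumed form; beta is an assumed value)."""
    return beta * prev_gain + (1.0 - beta) * decoded_gain


def synthesize(excitation, gain, lpc, history):
    """All-pole synthesis: y[n] = gain * e[n] - sum_k lpc[k] * y[n-1-k].

    `lpc` holds the decoded synthesis filter coefficients a_1..a_p and
    `history` holds the p most recent outputs, newest first.
    Returns the synthesized samples and the updated history.
    """
    out = []
    hist = list(history)
    for e in excitation:
        y = gain * e - sum(a * h for a, h in zip(lpc, hist))
        out.append(y)
        hist = ([y] + hist)[:len(lpc)]  # keep only the p newest outputs
    return out, hist
```

A decoder loop would carry the smoothed gain and the filter history from one frame to the next, so that an abrupt jump in the decoded gain is spread over several frames of background noise.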
12. A speech encoding method comprising the steps of:
dividing an input speech signal into frames each having a predetermined length;
obtaining a pitch period of a future frame with respect to a current frame to be encoded; and
encoding the pitch period.
13. A speech encoding method comprising the steps of:
dividing an input speech signal into frames each having a predetermined length, and further dividing a speech signal of each frame into subframes;
obtaining a predicted pitch period of a subframe in a current frame by using pitch periods of at least two frames of the current frame to be encoded and past and future frames with respect to the current frame; and
obtaining a pitch period of a subframe in the current frame by using the predicted pitch period.
Dependent claims: 14, 15, 17, 18, 19
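One way to picture the two steps of claim 13 is the Python sketch below. It assumes linear interpolation between the pitch periods of two frames (here the current and the future frame) to get the predicted subframe pitch, followed by a plain autocorrelation search in a small window around the prediction. The interpolation weights, the window half-width `delta`, and the function names `predict_subframe_pitch` and `refine_pitch` are hypothetical; the claim does not specify them.

```python
def predict_subframe_pitch(pitch_a, pitch_b, num_subframes):
    """Interpolate linearly between two frame pitch periods (assumed scheme).

    Each subframe is weighted at its midpoint, so subframe i uses
    w = (i + 0.5) / num_subframes.
    """
    predictions = []
    for i in range(num_subframes):
        w = (i + 0.5) / num_subframes
        predictions.append(round((1.0 - w) * pitch_a + w * pitch_b))
    return predictions


def refine_pitch(signal, start, length, predicted, delta=3):
    """Pick the lag near `predicted` that maximizes the autocorrelation.

    `start` and `length` delimit the subframe inside `signal`; lags that
    would reach before the beginning of the signal are skipped.
    """
    best_lag, best_score = predicted, float("-inf")
    for lag in range(max(1, predicted - delta), predicted + delta + 1):
        if start - lag < 0:
            continue
        score = sum(signal[start + n] * signal[start + n - lag]
                    for n in range(length))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

For example, `predict_subframe_pitch(40, 48, 4)` returns `[41, 43, 45, 47]`. Restricting the search to a few lags around each prediction is what keeps the per-subframe cost low compared with scanning the full pitch range.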
16. A speech encoding method comprising the steps of:
preparing an adaptive codebook storing a plurality of adaptive vectors generated by repeating a past excitation signal series at a period included in a predetermined range;
dividing an input speech signal into frames each having a predetermined length, and further dividing a speech signal of each frame into subframes;
obtaining a predicted pitch period of a subframe in a current frame by using pitch periods of at least two frames of the current frame to be encoded and past and future frames with respect to the current frame; and
determining a search range for subframes in the current frame by using the predicted pitch period to select an adaptive vector with a period that minimizes an error between a target vector and a signal obtained by filtering an adaptive vector extracted from said adaptive codebook through a perceptually weighted synthesis filter.
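The restricted adaptive codebook search of claim 16 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed encoder: the perceptually weighted synthesis filter is represented by a truncated impulse response, the error is computed with the optimal gain for each candidate, and the names `adaptive_vector`, `filter_fir`, `search_adaptive_codebook`, the window half-width `delta`, and the lag limits `min_lag` and `max_lag` are hypothetical.

```python
def adaptive_vector(past_excitation, lag, length):
    """Repeat the last `lag` samples of the past excitation to fill `length` samples."""
    segment = past_excitation[-lag:]
    return [segment[n % lag] for n in range(length)]


def filter_fir(vector, impulse_response):
    """Zero-state filtering by convolution, truncated to len(vector)."""
    return [
        sum(impulse_response[k] * vector[n - k]
            for k in range(min(n + 1, len(impulse_response))))
        for n in range(len(vector))
    ]


def search_adaptive_codebook(target, past_excitation, impulse_response,
                             predicted_lag, delta=2, min_lag=20, max_lag=147):
    """Return the lag in the restricted range that minimizes the weighted error."""
    lo = max(min_lag, predicted_lag - delta)
    hi = min(max_lag, predicted_lag + delta, len(past_excitation))
    target_energy = sum(t * t for t in target)
    best_lag, best_error = None, float("inf")
    for lag in range(lo, hi + 1):
        candidate = adaptive_vector(past_excitation, lag, len(target))
        y = filter_fir(candidate, impulse_response)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue
        corr = sum(t * v for t, v in zip(target, y))
        # Error with the optimal gain corr / energy applied to the candidate.
        error = target_energy - corr * corr / energy
        if error < best_error:
            best_lag, best_error = lag, error
    return best_lag
```

Because only `2 * delta + 1` lags are filtered and scored per subframe, the search range determined from the predicted pitch period directly bounds the cost of the codebook search.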
20. A speech encoding apparatus comprising:
means for dividing an input speech signal into frames each having a predetermined length;
means for obtaining a pitch period of a future frame with respect to a current frame to be encoded; and
means for encoding the pitch period obtained by said means for obtaining the pitch period.
21. A speech encoding apparatus comprising:
a divider section for dividing an input speech signal into frames each having a predetermined length, and further dividing a speech signal of each frame into subframes;
a predicted subframe pitch period calculation section for obtaining a predicted pitch period of a subframe in a current frame by using pitch periods of at least two frames of the current frame to be encoded and past and future frames with respect to the current frame; and
a subframe pitch period calculation section for obtaining a pitch period of a subframe in the current frame by using the predicted pitch period.
22. A speech encoding apparatus comprising:
an adaptive codebook storing a plurality of adaptive vectors generated by repeating a past excitation signal series at a period included in a predetermined range;
a divider section for dividing an input speech signal into frames each having a predetermined length, and further dividing a speech signal of each frame into subframes;
a predicted subframe pitch period calculation section for obtaining a predicted pitch period of a subframe in a current frame by using pitch periods of at least two frames of the current frame to be encoded and past and future frames with respect to the current frame; and
a search range determination section for determining a search range for subframes in the current frame by using the predicted pitch period to select an adaptive vector with a period that minimizes an error between a target vector and a signal obtained by filtering an adaptive vector extracted from said adaptive codebook through a perceptually weighted synthesis filter.
23. A recording medium on which a program is recorded, said program being used to execute processing of dividing an input speech signal into frames each having a predetermined length, and obtaining a pitch period of a future frame with respect to a current frame to be encoded, and processing of encoding the pitch period.
24. A recording medium on which a program is recorded, said program being used to execute processing of dividing an input speech signal into frames each having a predetermined length, further dividing a speech signal of each frame into subframes, and obtaining a predicted pitch period of a subframe in a current frame by using pitch periods of at least two frames of the current frame to be encoded and past and future frames with respect to the current frame, and processing of obtaining a pitch period of a subframe in the current frame by using the predicted pitch period.
25. A computer-readable recording medium on which a program for performing speech encoding processing is recorded, the program being used to execute processing of dividing an input speech signal into frames each having a predetermined length, further dividing a speech signal of each frame into subframes, and obtaining a predicted pitch period of a subframe in a current frame by using pitch periods of at least two frames of the current frame to be encoded and past and future frames with respect to the current frame, and processing of determining a search range for subframes in the current frame by using the predicted pitch period to select an adaptive vector with a period that minimizes an error between a target vector and a signal obtained by filtering an adaptive vector extracted from an adaptive codebook through a perceptually weighted synthesis filter, said adaptive codebook storing a plurality of adaptive vectors generated by repeating a past excitation signal series at a period included in a predetermined range.
Specification