Method and apparatus for classifying an audio signal based on frequency spectrum fluctuation
First Claim
1. An audio signal classification method, comprising:
storing, based on at least one condition being met, data of a frequency spectrum fluctuation parameter of a current audio frame of an audio signal into a memory where data of frequency spectrum fluctuation parameters of a plurality of audio frames are stored, wherein the at least one condition comprises the current audio frame being an active frame, and wherein a frequency spectrum fluctuation parameter denotes an energy fluctuation of a frequency spectrum of the audio signal;
determining whether the current audio frame is an active frame and a last audio frame preceding the current audio frame is an inactive frame;
upon determining that the current audio frame is an active frame and the last audio frame preceding the current audio frame is an inactive frame, modifying data of frequency spectrum fluctuation parameters of audio frames preceding the current audio frame stored in the memory into ineffective data, wherein data of frequency spectrum fluctuation parameters in the memory not having been modified into ineffective data are effective data; and
determining whether a current signal is percussive music, wherein the current signal comprises the current audio frame and a plurality of audio frames preceding the current audio frame;
upon determining that the current signal is percussive music, modifying effective data of the current audio frame and a plurality of audio frames preceding the current audio frame into a value less than or equal to a music threshold;
obtaining statistics of a part or all of the effective data in the memory; and
classifying the current audio frame as a speech frame or a music frame according to the statistics.
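The steps of claim 1 can be read as maintenance of a small history buffer followed by a decision on its statistics. The following Python sketch is illustrative only and is not taken from the patent. The class name `FluctuationClassifier`, the buffer capacity, the numeric music threshold, the use of `None` as the "ineffective" marker, the choice of the mean as the statistic, and the rule "mean at or below the threshold means music" are all assumptions made for the example; the claim itself leaves these open.

```python
from statistics import mean

INEFFECTIVE = None  # assumed marker for data modified into "ineffective data"


class FluctuationClassifier:
    """Hypothetical sketch of the buffer handling recited in claim 1."""

    def __init__(self, capacity=60, music_threshold=5.0):
        self.capacity = capacity                # assumed memory size
        self.music_threshold = music_threshold  # assumed threshold value
        self.memory = []                        # stored fluctuation data
        self.prev_active = False                # activity of the last frame

    def process(self, flux, active, percussive=False):
        """Update the memory for one frame and return a label or None."""
        if active:
            if not self.prev_active:
                # Active frame after an inactive one: data of the
                # preceding frames becomes ineffective.
                self.memory = [INEFFECTIVE] * len(self.memory)
            # Store the current frame's parameter (condition: frame is active).
            self.memory.append(flux)
            if len(self.memory) > self.capacity:
                self.memory.pop(0)
            if percussive:
                # Percussive music: force effective data to a value
                # less than or equal to the music threshold.
                self.memory = [
                    v if v is INEFFECTIVE else min(v, self.music_threshold)
                    for v in self.memory
                ]
        self.prev_active = active

        # Statistics over the effective data only (here: the mean).
        effective = [v for v in self.memory if v is not INEFFECTIVE]
        if not effective:
            return None
        # Assumed decision rule: low fluctuation suggests music.
        return "music" if mean(effective) <= self.music_threshold else "speech"
```

In this sketch, the percussive flag and the voice-activity flag are supplied by the caller; how they are detected is outside the scope of the example.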
Abstract
An audio signal classification method and apparatus. The method includes: determining, according to the voice activity of a current audio frame, whether to obtain a frequency spectrum fluctuation of the current audio frame and store it in a frequency spectrum fluctuation memory; updating the frequency spectrum fluctuations stored in the frequency spectrum fluctuation memory according to whether the audio frame is percussive music or according to the activity of a historical audio frame; and classifying the current audio frame as a speech frame or a music frame according to statistics of a part or all of the effective data of the frequency spectrum fluctuations stored in the frequency spectrum fluctuation memory.
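The abstract does not give a formula for the frequency spectrum fluctuation. As a rough illustration only, the sketch below assumes it can be approximated by the mean absolute change in per-bin log energy between two consecutive frames, and that it is computed and stored only for active frames. The function names `spectrum_fluctuation` and `obtain_and_store`, the decibel scale, and the energy floor are hypothetical choices for this example, not the patented definition.

```python
import math


def spectrum_fluctuation(prev_energies, cur_energies, floor=1e-12):
    """Mean absolute per-bin log-energy change (in dB) between two frames.

    This measure is an assumption made for illustration only.
    """
    if not cur_energies or len(prev_energies) != len(cur_energies):
        raise ValueError("spectra must be non-empty and of equal length")
    total = 0.0
    for prev, cur in zip(prev_energies, cur_energies):
        # The floor avoids log10(0) for silent bins.
        prev_db = 10.0 * math.log10(max(prev, floor))
        cur_db = 10.0 * math.log10(max(cur, floor))
        total += abs(cur_db - prev_db)
    return total / len(cur_energies)


def obtain_and_store(memory, prev_energies, cur_energies, is_active):
    """Compute and store the fluctuation only when the frame is active."""
    if is_active:
        memory.append(spectrum_fluctuation(prev_energies, cur_energies))
    return memory
```

A steady spectrum yields a fluctuation of zero under this measure, while a frame whose bin energies change sharply from the previous frame yields a large value.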
13 Claims
1. An audio signal classification method, as recited in full under First Claim above. Dependent claims: 2, 3, 4, 5, 6, 7.
8. An audio signal classification apparatus configured to classify an input audio signal, comprising:
a memory comprising instructions; and
one or more processors in communication with the memory, wherein the one or more processors execute the instructions to:
store, based on at least one condition being met, data of a frequency spectrum fluctuation parameter of a current audio frame of an audio signal into the memory where a plurality of frequency spectrum fluctuation parameters of a plurality of audio frames are stored, wherein the at least one condition comprises the current audio frame being an active frame, and wherein a frequency spectrum fluctuation parameter denotes an energy fluctuation of a frequency spectrum of the audio signal;
determine whether the current audio frame is an active frame and a last audio frame preceding the current audio frame is an inactive frame;
upon determining that the current audio frame is an active frame and the last audio frame preceding the current audio frame is an inactive frame, modify data of frequency spectrum fluctuation parameters of audio frames preceding the current audio frame stored in the memory into ineffective data, wherein data of frequency spectrum fluctuation parameters in the memory not having been modified into ineffective data are effective data; and
determine whether a current signal is percussive music, wherein the current signal comprises the current audio frame and a plurality of audio frames preceding the current audio frame;
upon determining that the current signal is percussive music, modify effective data of the current audio frame and a plurality of audio frames preceding the current audio frame into a value less than or equal to a music threshold;
obtain statistics of a part or all of the effective data in the memory; and
classify the current audio frame as a speech frame or a music frame according to the statistics.
Dependent claims: 9, 10, 11, 12, 13.
Specification