Automatic extraction of musical portions of an audio stream
Abstract
Music and non-music portions in an audio stream are identified. The audio stream is digitized and segmented into frames. Selected frames are passed through a filter bank which includes filters having bandwidths approximately proportional to their center frequencies. The spectral flux for each selected frame is calculated and smoothed. Frames having a smoothed spectral flux below a threshold value are associated with music, and frames having a smoothed spectral flux above the threshold value are associated with non-music.
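The processing chain described in the abstract can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the abstract does not define the filter design, the flux formula, or the smoothing method. Here the filter bank is approximated with constant-Q band-pass biquads (so bandwidth scales with center frequency), the center frequencies 450, 900 and 1500 Hz are borrowed from claim 34, and the Q value, the flux definition (sum of absolute changes in the normalized band energies between consecutive frames), the moving-average smoother, and all function names are assumptions.

```python
import math

CENTERS_HZ = (450.0, 900.0, 1500.0)  # taken from claim 34; the low/high-pass filters are omitted here
Q = 1.5                              # assumed; a fixed Q gives bandwidth = center / Q

def bandpass_coeffs(center_hz, sample_rate, q=Q):
    """Band-pass biquad coefficients (standard audio-EQ cookbook form)."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a

def biquad(samples, b, a):
    """Filter one frame with a second-order IIR section, starting from zero state."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def band_profile(frame, sample_rate, centers):
    """Normalized energy of the frame in each band of the filter bank."""
    energies = []
    for fc in centers:
        b, a = bandpass_coeffs(fc, sample_rate)
        energies.append(sum(v * v for v in biquad(frame, b, a)))
    total = sum(energies) or 1.0
    return [e / total for e in energies]

def spectral_flux(signal, sample_rate, frame_len, centers=CENTERS_HZ):
    """Assumed flux: L1 change of the band profile between consecutive frames."""
    starts = range(0, len(signal) - frame_len + 1, frame_len)
    profiles = [band_profile(signal[s:s + frame_len], sample_rate, centers) for s in starts]
    flux = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(profiles, profiles[1:]):
        flux.append(sum(abs(c - p) for c, p in zip(cur, prev)))
    return flux

def smooth(values, width=3):
    """Trailing moving average (the smoothing method is an assumption)."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - width + 1):i + 1]
        out.append(sum(window) / len(window))
    return out

def classify(smoothed, threshold):
    """Below the threshold is labeled music, otherwise non-music."""
    return ["music" if v < threshold else "non-music" for v in smoothed]

# Synthetic demo: six identical 450 Hz frames, then frames alternating 1500/450 Hz.
SAMPLE_RATE, FRAME_LEN = 8000, 400

def tone(freq_hz):
    return [math.sin(2.0 * math.pi * freq_hz * n / SAMPLE_RATE) for n in range(FRAME_LEN)]

signal = tone(450) * 6
for k in range(6):
    signal += tone(1500 if k % 2 == 0 else 450)

flux = spectral_flux(signal, SAMPLE_RATE, FRAME_LEN)
labels = classify(smooth(flux), threshold=0.3)
```

In this toy input the steady section yields a flux near zero and the rapidly changing section yields a large flux. The threshold of 0.3 is chosen only for the synthetic signal and says nothing about values suitable for real broadcasts.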
34 Claims
1. A method for selectively recording music portions of an audio stream, comprising:
receiving an audio stream having music and non-music portions;
segmenting the audio stream into successive frames;
passing each of a plurality of the frames through a filter bank, the filter bank including filters with bandwidths approximately proportional to their center frequencies;
computing a modified spectral flux value for at least a subset of the plurality of frames;
identifying a start frame, the start frame being a frame of the plurality having a modified spectral flux value below a threshold value;
identifying a stop frame, the stop frame being a frame of the plurality having a modified spectral flux value above the threshold value; and
recording a portion of the audio stream bounded by the start and stop frames.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
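The start/stop logic of claim 1 can be illustrated with a short sketch that takes a precomputed sequence of modified spectral flux values, one per frame. The function names are hypothetical, and two details are assumptions because the claim does not settle them: the stop frame is treated as exclusive, and a below-threshold run that never meets a stop frame is not recorded.

```python
def find_music_segments(flux, threshold):
    """Return (start, stop) frame-index pairs.

    start: first frame whose flux falls below the threshold.
    stop:  next frame whose flux rises above the threshold.
    """
    segments = []
    start = None
    for i, value in enumerate(flux):
        if start is None and value < threshold:
            start = i                      # start frame found
        elif start is not None and value > threshold:
            segments.append((start, i))    # stop frame found
            start = None
    return segments

def record(frames, segments):
    """Copy out the frames bounded by each (start, stop) pair (stop exclusive, by assumption)."""
    return [frames[start:stop] for start, stop in segments]

# Toy data: ten frames with made-up flux values.
flux_values = [0.9, 0.8, 0.2, 0.1, 0.3, 0.7, 0.9, 0.2, 0.1, 0.8]
frames = list("abcdefghij")
segments = find_music_segments(flux_values, threshold=0.5)
recorded = record(frames, segments)
```

With these made-up values, `segments` is `[(2, 5), (7, 9)]` and `recorded` holds frames `c` through `e` and `h` through `i`.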
10. A method for selectively recording music portions of a radio broadcast, comprising:
receiving a radio broadcast consisting essentially of an audio transmission;
calculating a value of a feature for each of a plurality of frames of the audio transmission;
identifying a start point, the start point being a frame in the audio transmission having a feature value bearing a first relation to a threshold value for the feature;
identifying a stop point, the stop point being a frame in the audio transmission having a feature value bearing a second relation to the threshold value for the feature; and
recording a portion of the audio transmission bounded by the start and stop points.
Dependent claims: 11, 12.
13. A machine-readable medium having machine-executable instructions for performing steps comprising:
receiving an audio stream having music and non-music portions;
segmenting the audio stream into successive frames;
passing each of a plurality of the frames through a filter bank, the filter bank including filters with bandwidths approximately proportional to their center frequencies;
computing a modified spectral flux value for at least a subset of the plurality of frames;
identifying a start frame, the start frame being a frame of the plurality having a modified spectral flux value below a threshold value;
identifying a stop frame, the stop frame being a frame of the plurality having a modified spectral flux value above the threshold value; and
recording a portion of the audio stream bounded by the start and stop frames.
Dependent claims: 14, 15, 16, 17, 18, 19, 20.
21. A machine-readable medium having machine-executable instructions for performing steps comprising:
receiving a radio broadcast consisting essentially of an audio transmission;
calculating a value of a feature for each of a plurality of frames of the audio transmission;
identifying a start point, the start point being a frame in the audio transmission having a feature value bearing a first relation to a threshold value for the feature;
identifying a stop point, the stop point being a frame in the audio transmission having a feature value bearing a second relation to the threshold value for the feature; and
recording a portion of the audio transmission bounded by the start and stop points.
Dependent claims: 22, 23.
24. A recording unit for recording broadcast programming, comprising:
a receiver for tuning to broadcast radio frequencies and receiving broadcast programming;
a memory having instructions stored therein; and
a processor coupled to the receiver and to the memory and configured to execute the instructions so as to:
receive an audio stream having music and non-music portions;
segment the audio stream into successive frames;
pass each of a plurality of the frames through a filter bank, the filter bank including filters with bandwidths approximately proportional to their center frequencies;
compute a modified spectral flux value for at least a subset of the plurality of frames;
identify a start frame, the start frame being a frame of the plurality having a modified spectral flux value below a threshold value;
identify a stop frame, the stop frame being a frame of the plurality having a modified spectral flux value above the threshold value; and
record a portion of the audio stream bounded by the start and stop frames.
Dependent claims: 25, 26, 27, 28, 29, 30.
31. A recording unit for remotely recording broadcast programming, comprising:
a receiver for tuning to broadcast radio frequencies and receiving broadcast programming;
a memory having instructions stored therein; and
a processor coupled to the receiver and to the memory and configured to execute the instructions so as to:
receive a radio broadcast consisting essentially of an audio transmission;
calculate a value of a feature for each of a plurality of frames of the audio transmission;
identify a start point in the audio transmission, the start point being a frame in the audio transmission having a feature value bearing a first relation to a threshold value for the feature;
identify a stop point in the audio transmission, the stop point being a frame in the audio transmission having a feature value bearing a second relation to the threshold value for the feature; and
record a portion of the audio transmission bounded by the start and stop points.
Dependent claims: 32, 33.
34. A recording unit for remotely recording broadcast programming, comprising:
a receiver for tuning to broadcast radio frequencies and receiving broadcast programming;
a buffer memory;
a storage memory having instructions stored therein;
a network interface; and
a processor coupled to the receiver, to the network interface and to the memories and configured to execute the instructions so as to:
receive an audio stream having music and non-music portions;
segment the audio stream into successive frames;
pass each of a plurality of the frames through a low pass Infinite Impulse Response (IIR) filter, a band pass IIR filter centered at approximately 450 Hz, a band pass IIR filter centered at approximately 900 Hz, a band pass IIR filter centered at approximately 1500 Hz, and a high pass IIR filter;
compute a modified spectral flux value for each of the plurality of frames based on the output of the filters;
receive, via the network interface, a recording control signal initiated from a remotely located mobile terminal;
upon receipt of the recording control signal, identify a start frame in a portion of the audio stream stored in the buffer memory, the start frame being a frame of the plurality having a modified spectral flux value below a threshold value;
identify a stop frame, the stop frame being a frame of the plurality having a modified spectral flux value above the threshold value; and
upon determining that the time elapsed between the start and stop frames exceeds a minimum value, store in the storage memory the part of the audio stream bounded by the start and stop frames, said storing including copying from the buffer memory a part of the audio stream buffered after the start frame.
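The buffered, remotely triggered recording of claim 34 might be sketched as below. Everything here is an assumption made for illustration: the class and method names are hypothetical, the buffer memory is modeled as a fixed-length deque of (frame, flux) pairs, the storage memory is a plain list, elapsed time is counted in frames, and the start frame is taken to be the earliest frame of the unbroken below-threshold run that ends at the newest buffered frame. The claim itself does not say how the start frame is selected within the buffer.

```python
from collections import deque

class BufferedRecorder:
    def __init__(self, buffer_frames, threshold, min_frames):
        self.buffer = deque(maxlen=buffer_frames)  # buffer memory: (frame, flux) pairs
        self.threshold = threshold
        self.min_frames = min_frames               # minimum duration, counted in frames
        self.pending = None                        # frames gathered since the start frame
        self.stored = []                           # stands in for the storage memory

    def on_frame(self, frame, flux):
        """Called for every incoming frame with its modified spectral flux value."""
        if self.pending is not None:
            if flux > self.threshold:              # stop frame reached
                if len(self.pending) > self.min_frames:
                    self.stored.append(self.pending)
                self.pending = None
            else:
                self.pending.append(frame)
        self.buffer.append((frame, flux))

    def on_record_signal(self):
        """Called when the recording control signal arrives over the network."""
        items = list(self.buffer)
        start = None
        # Walk backwards through the buffer to find where the current music run began.
        for i in range(len(items) - 1, -1, -1):
            if items[i][1] < self.threshold:
                start = i
            else:
                break
        if start is not None:
            # Copy the already buffered part of the run, from the start frame onward.
            self.pending = [frame for frame, _ in items[start:]]

# Toy run: the signal arrives two frames into a music passage.
rec = BufferedRecorder(buffer_frames=5, threshold=0.5, min_frames=3)
for frame, flux in [("n0", 0.9), ("m1", 0.2), ("m2", 0.1)]:
    rec.on_frame(frame, flux)
rec.on_record_signal()
for frame, flux in [("m3", 0.3), ("m4", 0.2), ("n5", 0.8)]:
    rec.on_frame(frame, flux)

# A second request during a passage that turns out to be too short is discarded.
rec.on_frame("m6", 0.1)
rec.on_record_signal()
rec.on_frame("n7", 0.9)
```

After this run `rec.stored` contains a single recording, `["m1", "m2", "m3", "m4"]`, which includes the two frames buffered before the control signal arrived.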
Specification