System and method for querying a music database
Abstract
A system and method for querying a music database, the database containing a plurality of indexed pieces of music, where the query is performed by forming a database request consisting of a conditional expression relating to the name and/or attributes of the desired piece of music. Associated features are derived from the database query, and compared with corresponding features for the other pieces of music in the database. A desired piece of music is determined by searching for a minimum distance between the database query features and those associated with the pieces of music in the database.
324 Citations
55 Claims
-
1. A method for querying a music database, which contains a plurality of pieces of music, the method comprising steps of:
-
classifying, using feature extraction on said plurality of pieces of music using steps of;
(i) segmenting each piece of music into a plurality of windows;
(ii) extracting at least one characteristic feature in each of said windows; and
(iii) indexing said each piece of music dependent upon extracted features;
forming a request which specifies (i) at least one of a name of a piece of music and features characterizing the piece of music, and (ii) at least one conditional expression;
comparing the features characterizing the specified piece of music to corresponding features characterizing other pieces of music in the database;
calculating corresponding distances between the specified piece of music and the other pieces of music based on comparisons; and
identifying pieces of music which are at distances from the specified piece of music which satisfy the at least one conditional expression.
2. A method according to claim 1, comprising a further step of:
outputting at least one of (i) the identified pieces of music and (ii) the names of said pieces.
-
-
3. A method according to claim 2, whereby the at least one of (i) the identified pieces of music and (ii) the names of said pieces are in a class of the plurality of pieces of music in the database.
-
4. A method according to claim 1, wherein the calculating step comprises sub-steps of:
-
calculating corresponding distances based on a first relationship between (i) a loudness, a percussivity, and a sharpness of the specified piece of music, and (ii) a loudness, a percussivity, and a sharpness of each said other piece of music; and
sorting, on the basis of a second relationship between (i) a tempo of the specified piece of music, and (ii) a tempo of each said other piece of music.
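The query flow recited in claims 1 and 4 can be illustrated with a minimal sketch. It assumes a Euclidean distance over loudness, percussivity and sharpness as the "first relationship" and an absolute tempo difference as the "second relationship"; the claims do not fix either choice. The `DATABASE` contents, the `distance` and `query` names, and the use of a Python callable as the conditional expression are all hypothetical.

```python
import math

# Hypothetical pre-indexed features; the names and values are illustrative only.
DATABASE = {
    "piece_a": {"loudness": 0.8, "percussivity": 0.7, "sharpness": 0.5, "tempo": 120.0},
    "piece_b": {"loudness": 0.7, "percussivity": 0.6, "sharpness": 0.5, "tempo": 90.0},
    "piece_c": {"loudness": 0.9, "percussivity": 0.8, "sharpness": 0.6, "tempo": 118.0},
    "piece_d": {"loudness": 0.1, "percussivity": 0.2, "sharpness": 0.9, "tempo": 121.0},
}


def distance(a, b, keys=("loudness", "percussivity", "sharpness")):
    """Euclidean distance over three features (an assumed 'first relationship')."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in keys))


def query(database, name, condition):
    """Return the other pieces whose distance from `name` satisfies `condition`.

    Results are sorted by absolute tempo difference from the specified piece
    (an assumed 'second relationship').
    """
    target = database[name]
    hits = [
        other
        for other, feats in database.items()
        if other != name and condition(distance(target, feats))
    ]
    return sorted(hits, key=lambda other: abs(database[other]["tempo"] - target["tempo"]))


# Conditional expression: "closer than 0.3 to piece_a".
matches = query(DATABASE, "piece_a", lambda d: d < 0.3)
print(matches)
```

Passing the condition as a callable keeps the distance calculation separate from the test applied to it, which mirrors the way the claim separates the calculating and identifying steps.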
-
-
5. A method according to claim 1, wherein the comparing step is performed in relation to pieces of music in a class of the plurality of pieces of music in the database.
-
6. A method according to claim 1, wherein a first feature extracted in the extracting step is at least one tempo extracted from a digitized music signal, the extracting step comprising sub-steps of:
-
determining values indicative of an energy in at least one window;
locating peaks of an energy signal derived from energy values in said at least one window;
generating an onset signal comprising pulses having peaks which substantially coincide with peaks of the energy signal;
filtering the onset signal through a plurality of comb filter processes having resonant frequencies located according to frequencies derived from the window segmenting step;
accumulating an energy in each filter process over a duration of the music signal; and
identifying filter processes having Nth highest energies wherein resonant frequencies of the identified processes are representative of at least one tempo in the music signal.
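The comb filter stage of claim 6 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: it assumes a simple feedback comb filter of the form y[n] = (1 - alpha) x[n] + alpha y[n - delay], one filter per integer delay, and resonant frequencies spanning 1 Hz to 4 Hz as recited in claim 11. The function names, `alpha`, and `frame_rate` are hypothetical.

```python
def comb_filter_energy(onset, delay, alpha=0.9):
    """Accumulate the output energy of a feedback comb filter over the whole signal.

    Assumed filter form: y[n] = (1 - alpha) * x[n] + alpha * y[n - delay].
    """
    y = [0.0] * len(onset)
    energy = 0.0
    for n, x in enumerate(onset):
        feedback = y[n - delay] if n >= delay else 0.0
        y[n] = (1.0 - alpha) * x + alpha * feedback
        energy += y[n] ** 2
    return energy


def estimate_tempos(onset, frame_rate, n_best=1, alpha=0.9):
    """Return the n_best resonant frequencies (Hz) with the highest accumulated energy."""
    min_delay = max(1, round(frame_rate / 4.0))  # delay for a 4 Hz resonance
    max_delay = round(frame_rate / 1.0)          # delay for a 1 Hz resonance
    scored = [
        (comb_filter_energy(onset, delay, alpha), frame_rate / delay)
        for delay in range(min_delay, max_delay + 1)
    ]
    scored.sort(reverse=True)
    return [freq for _, freq in scored[:n_best]]


# Synthetic onset signal: 100 frames per second, one pulse every 50 frames (2 Hz).
frame_rate = 100
onset = [1.0 if n % 50 == 0 else 0.0 for n in range(1000)]
tempos = estimate_tempos(onset, frame_rate)
print(tempos, "Hz =", [60 * f for f in tempos], "beats per minute")
```

A filter whose delay matches the pulse spacing reinforces every pulse, so its accumulated energy exceeds that of filters tuned to other periods over a finite signal.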
-
-
7. A method according to claim 6, wherein the determining sub-step comprises sub-sub-steps of:
-
determining transform components for the music signal in each window; and
adding amplitudes of the components in each window to form a component sum, said component sum being indicative of energy in a window.
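As a hedged illustration of claim 7, the sketch below sums the amplitudes of the transform components of one window. The claim does not name the transform; a naive discrete Fourier transform is assumed here for clarity, and `window_energy` is a hypothetical name.

```python
import cmath
import math


def window_energy(window):
    """Sum the amplitudes of the DFT components of one window.

    The result is a value indicative of the energy in the window.
    The choice of the DFT as the transform is an assumption.
    """
    n = len(window)
    total = 0.0
    for k in range(n):
        component = sum(
            sample * cmath.exp(-2j * math.pi * k * t / n)
            for t, sample in enumerate(window)
        )
        total += abs(component)
    return total


# Two windows holding the same sinusoid at different amplitudes.
quiet = [0.1 * math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
loud = [1.0 * math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
print(window_energy(quiet), window_energy(loud))
```

A practical implementation would more likely use a fast Fourier transform; the direct sum is used here only to keep the example dependency-free.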
-
-
8. A method according to claim 6, wherein after the locating step and prior to the generating step the method comprises a further sub-step of:
low pass filtering the energy signal.
-
9. A method according to claim 6, whereby the generating step comprises sub-steps of:
-
differentiating the energy signal; and
half-wave rectifying the differentiated signal to form the onset signal.
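The two sub-steps of claim 9 can be sketched on a sampled energy signal. The sketch assumes a first difference as the discrete form of differentiation; `onset_signal` is a hypothetical name.

```python
def onset_signal(energy):
    """Differentiate the energy signal, then half-wave rectify the result.

    A first difference stands in for differentiation (an assumption);
    negative differences are clipped to zero.
    """
    diffs = [energy[i] - energy[i - 1] for i in range(1, len(energy))]
    return [max(d, 0.0) for d in diffs]


energy = [0.0, 1.0, 3.0, 2.0, 2.0, 5.0, 1.0]
onsets = onset_signal(energy)
print(onsets)
```

Only rises in energy survive the rectification, so the output is non-zero where a note or beat begins and zero while the energy holds or decays.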
-
-
10. A method according to claim 6, wherein the generating step comprises sub-sub-steps of:
-
sampling the energy signal;
comparing consecutive samples to determine a positive peak; and
generating a single pulse when each positive peak is detected.
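The alternative onset generation of claim 10 can be sketched as a comparison of consecutive samples. The peak test used here (strictly greater than the previous sample and not less than the next) is an assumption chosen so that a flat-topped peak yields a single pulse; `peak_pulses` is a hypothetical name.

```python
def peak_pulses(samples):
    """Emit a unit pulse at each positive peak found by comparing consecutive samples."""
    pulses = [0] * len(samples)
    for i in range(1, len(samples) - 1):
        # Assumed peak test: rises into sample i and does not rise after it.
        if samples[i - 1] < samples[i] >= samples[i + 1]:
            pulses[i] = 1
    return pulses


energy = [0.0, 1.0, 3.0, 2.0, 2.0, 5.0, 1.0]
pulses = peak_pulses(energy)
print(pulses)
```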
-
-
11. A method according to claim 6, wherein the filter process resonant frequencies span a frequency range substantially between 1 Hz and 4 Hz.
-
12. A method according to claim 1, wherein a second feature extracted in the extracting step is a percussivity of a signal, the extracting step comprising, for each window, sub-steps of:
-
filtering by a plurality of filters;
determining an output for each filter;
determining a function of the filter output values;
determining a gradient for the linear function; and
determining a percussivity as a function of the gradient.
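The gradient step of claim 12 can be illustrated with a least-squares straight line, which is the fit named in claim 15. Everything else in this sketch is an assumption: the filter outputs are plotted against the filter index, and the percussivity is taken as the magnitude of the gradient. The claims specify neither the function of the filter outputs nor the mapping from gradient to percussivity, and both function names are hypothetical.

```python
def best_fit_gradient(values):
    """Gradient of the least-squares straight line through (index, value) points."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two filter outputs are needed")
    mean_x = (n - 1) / 2.0
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def percussivity(filter_outputs):
    """Placeholder mapping (an assumption): the magnitude of the fitted gradient."""
    return abs(best_fit_gradient(filter_outputs))


# Hypothetical outputs of four filters for one window.
outputs = [1.0, 3.0, 5.0, 7.0]
print(best_fit_gradient(outputs), percussivity(outputs))
```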
-
-
13. A method according to claim 12, wherein the segmenting sub-step comprises sub-sub-steps of:
-
selecting a window width;
selecting a window overlap extent; and
segmenting the signal into windows having the selected window width, wherein the windows overlap each other to the selected overlap extent.
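The segmentation of claim 13 can be sketched as a sliding window with a fixed width and overlap. Dropping a trailing partial window is an assumption, since the claim does not say how the end of the signal is handled; `segment` is a hypothetical name.

```python
def segment(signal, width, overlap):
    """Split a signal into windows of `width` samples that overlap by `overlap` samples.

    A trailing partial window is dropped (an assumption).
    """
    if not 0 <= overlap < width:
        raise ValueError("overlap must satisfy 0 <= overlap < width")
    hop = width - overlap
    return [
        signal[start:start + width]
        for start in range(0, len(signal) - width + 1, hop)
    ]


windows = segment(list(range(10)), width=4, overlap=2)
print(windows)
```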
-
-
14. A method according to claim 12, whereby the filtering sub-step utilises comb filters.
-
15. A method according to claim 12, whereby the gradient determining step determines a straight line of best fit to the linear function.
-
16. A method according to claim 12, whereby percussivity values determined in the percussivity determining step for a window are consolidated in a histogram.
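The consolidation described in claim 16 can be sketched as a fixed-width histogram of per-window percussivity values. The number of bins and the value range are assumptions, as is the clamping of out-of-range values into the end bins; `histogram` is a hypothetical name.

```python
def histogram(values, bins, low, high):
    """Count values into `bins` equal-width bins spanning [low, high].

    Values outside the range are clamped into the first or last bin (an assumption).
    """
    counts = [0] * bins
    span = high - low
    for v in values:
        idx = int((v - low) / span * bins)
        idx = min(max(idx, 0), bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical percussivity values, one per window.
per_window = [0.05, 0.1, 0.45, 0.5, 0.95, 1.0]
counts = histogram(per_window, bins=4, low=0.0, high=1.0)
print(counts)
```

Because the histogram discards ordering, pieces of different lengths reduce to feature summaries of the same size, which makes them directly comparable.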
-
17. A method for querying a database according to claim 1, comprising a further step of:
determining said associated parameters for the at least one specified piece of music if the parameters have not been specified.
-
18. A method for querying a music database according to claim 1, wherein the corresponding distances are calculated along orthogonal axes.
-
19. A method for querying a database according to claim 1, wherein said arranging step arranges the features in histograms wherein the histograms are representative of the features over the entire piece of music.
-
20. An apparatus for querying a music database, which contains a plurality of pieces of music, the apparatus comprising:
-
classifying means for classifying, using feature extraction on said plurality of pieces of music, said classifying means comprising;
(i) segmenting means for segmenting each piece of music into a plurality of windows;
(ii) extracting means for extracting at least one characteristic feature in each of said windows; and
(iii) indexing means for indexing said each piece of music dependent upon extracted features;
forming means for forming a request which specifies (i) at least one of a name of a piece of music and features characterizing the piece of music, and (ii) at least one conditional expression;
comparing means for comparing the features characterizing the specified piece of music to corresponding features characterizing other pieces of music in the database;
calculating means for calculating corresponding distances between the specified piece of music and the other pieces of music based on comparisons; and
identifying means for identifying pieces of music which are at distances from the specified piece of music which satisfy the at least one conditional expression.
21. An apparatus according to claim 20, further comprising:
an output means for outputting at least one of (i) the identified pieces of music and (ii) the names of said pieces.
-
-
22. An apparatus according to claim 20, wherein the distance determination means comprises:
-
first distance determination means for calculating corresponding distances based on a first relationship between (i) a loudness, a percussivity, and a sharpness of the specified piece of music, and (ii) a loudness, a percussivity, and a sharpness of each other piece of music; and
a sorting means for sorting, on the basis of a second relationship between (i) a tempo of the specified piece of music, and (ii) a tempo of each other piece of music.
-
-
23. An apparatus according to claim 20, further comprising a means for clustering the pieces of music in the database into classes.
-
24. An apparatus according to claim 20, wherein a first feature extracted by the feature extraction means is at least one tempo extracted from a digitized music signal, and wherein the feature extraction means comprises:
-
energy determination means for determining values indicative of the energy in each window;
peak location determination means for locating peaks of an energy signal derived from the energy values in each window;
onset signal generation means for generating an onset signal comprising pulses having peaks which substantially coincide with peaks of the energy signal;
a plurality of comb filter means for filtering the onset signal wherein the plurality of comb filter means have resonant frequencies located according to frequencies derived by the window segmentation means;
energy accumulation means for accumulating an energy in each filter process over a duration of the music signal; and
identification means for identifying filter processes having Nth highest energies wherein resonant frequencies of the identified processes are representative of at least one tempo in the music signal.
-
-
25. An apparatus according to claim 24, wherein the energy determination means comprise:
-
transform determination means for determining transform components for the music signal in at least one window; and
addition means for adding amplitudes of components in each window to form a component sum, said component sum being indicative of energy in a window.
-
-
26. An apparatus according to claim 24, further comprising low pass filtering means for low pass filtering the energy signal located by the peak location determination means prior to forming the onset signal by the onset signal generation means.
-
27. An apparatus according to claim 24, wherein the onset signal generation means comprise:
-
differentiating means for differentiating the energy signal; and
rectification means for half-wave rectifying a differentiated signal to form the onset signal.
-
-
28. An apparatus according to claim 24, wherein the onset signal generation means comprise:
-
sampling means for sampling the energy signal;
comparator means for comparing consecutive samples to determine a positive peak; and
pulse generation means for generating a single pulse when a positive peak is detected.
-
-
29. An apparatus according to claim 24, wherein the comb filter means resonant frequencies span a frequency range substantially between 1 Hz and 4 Hz.
-
30. An apparatus according to claim 20, wherein a second feature extracted by the feature extraction means is a percussivity of a signal, and wherein the feature extraction means comprise:
-
filtering means for filtering by a plurality of filters;
filter output determination means for determining an output for each filter;
function determination means for determining a function of the filter output values;
gradient determination means for determining a gradient for the linear function; and
percussivity determination means for determining a percussivity as a function of the gradient.
-
-
31. An apparatus according to claim 30, wherein the segmentation means comprise:
-
selection means for selecting a window width;
overlap determination means for selecting a window overlap extent; and
segmentation means for segmenting the signal into windows, a window having the selected window width and the windows overlapping each other to the selected overlap extent.
-
-
32. An apparatus according to claim 30, wherein the filtering means are comb filters.
-
33. An apparatus according to claim 30, wherein the gradient determination means comprises means for determining a straight line of best fit to the linear function.
-
34. An apparatus according to claim 30, wherein the percussivity determination means consolidate the percussivity for each window into a histogram.
-
35. An apparatus for querying a music database according to claim 20, further comprising a parameter determination means for determining associated parameters for the specified pieces of music if the parameters have not been specified.
-
36. An apparatus according to claim 20, wherein said arranging means arrange the features in histograms representative of the features over the entire piece of music.
-
37. An apparatus for querying a music database according to claim 20, wherein the distance determination means calculate the distances along orthogonal axes.
-
38. A computer readable memory medium for storing a program for apparatus for querying a music database which contains a plurality of pieces of music, said program comprising:
-
code for a classifying step for classifying, using feature extraction on said plurality of pieces of music, said code for said classifying step comprising;
(i) code for a segmenting step for segmenting each piece of music into a plurality of windows;
(ii) code for an extracting step for extracting at least one characteristic feature in each of said windows; and
(iii) code for an indexing step for indexing said each piece of music dependent upon extracted features;
code for a forming step for forming a request which specifies (i) at least one of a name of a piece of music and features characterizing the piece of music, and (ii) at least one conditional expression;
code for a comparing step for comparing the features characterizing the specified piece of music to corresponding features characterizing other pieces of music in the database;
code for a calculating step for calculating corresponding distances between the specified piece of music and the other pieces of music based on comparisons; and
code for an identifying step for identifying pieces of music which are at distances from the specified piece of music which satisfy the at least one conditional expression.
39. A computer readable memory medium according to claim 38, said program further comprising:
code for an outputting step for outputting at least one of (i) the identified pieces of music and (ii) the names of said pieces.
-
-
40. A computer readable memory medium according to claim 38, wherein the code for the distance calculating step comprises:
-
code for a calculating step for calculating corresponding distances based on a first relationship between (i) a loudness, a percussivity, and a sharpness of the specified piece of music, and (ii) a loudness, a percussivity, and a sharpness of each other piece of music; and
code for a sorting step for sorting, on the basis of a second relationship between (i) a tempo of the specified piece of music, and (ii) a tempo of each other piece of music.
-
-
41. A computer readable memory medium according to claim 38, wherein the code for the comparing step further comprises code for a clustering step for clustering the pieces of music in the database into classes.
-
42. A computer readable memory medium according to claim 38, wherein a first feature extracted in the feature extraction step is at least one tempo extracted from a digitized music signal, and wherein said program further comprises:
-
code for an energy determination step for determining values indicative of the energy in each window;
code for a peak location determination step for locating the peaks of an energy signal which is derived from the energy values in each window;
code for an onset signal generation step for generating an onset signal comprising pulses having peaks which substantially coincide with the peaks of the energy signal;
code for a plurality of comb filter steps for filtering the onset signal wherein the plurality of comb filter steps effect resonant frequencies located according to frequencies derived from the window segmentation step;
code for an energy accumulation step for accumulating an energy in each filter process over a duration of the music signal; and
code for an identification step for identifying the filter processes having Nth highest energies wherein resonant frequencies of the identified processes are representative of at least one tempo in the music signal.
-
-
43. A computer readable memory medium according to claim 42, wherein said code relating to the energy determination step comprises:
-
code for a transform determination step for determining transform components for the music signal in each window; and
code for an addition step for adding amplitudes of the components in each window to form a component sum, said component sum being indicative of energy in a window.
-
-
44. A computer readable memory medium according to claim 42, said program further comprising code for a low pass filtering step for low pass filtering the energy signal after locating the peaks of an energy signal in the peak location determination step and prior to the onset signal generation step.
-
45. A computer readable memory medium according to claim 42, wherein said code relating to the onset signal generation step comprises:
-
code for a differentiating step for differentiating the energy signal; and
code for a rectification step for half-wave rectifying the differentiated signal to form the onset signal.
-
-
46. A computer readable memory medium according to claim 42, wherein said code relating to the onset signal generation step comprises:
-
code for a sampling step for sampling the energy signal;
code for a comparator step for comparing consecutive samples to determine a positive peak; and
code for a pulse generation step for generating a single pulse when each positive peak is detected.
-
-
47. A computer readable memory medium according to claim 42, said program relating to the filter means resonant frequencies spanning a frequency range substantially between 1 Hz and 4 Hz.
-
48. A computer readable memory medium according to claim 38, wherein a second feature extracted in the feature extraction step is a percussivity of a signal, and wherein said feature extraction step comprises:
-
code for a filtering step for filtering by a plurality of filters;
code for a filter output determination step for determining an output for each filter;
code for a function determination step for determining a function of the filter output values;
code for a gradient determination step for determining a gradient for the linear function; and
code for a percussivity determination step for determining a percussivity as a function of the gradient.
-
-
49. A computer readable memory medium according to claim 48, wherein said code relating to the segmentation step comprises:
-
code for a selection step for selecting a window width;
code for an overlap determination step for selecting a window overlap extent; and
code for a segmentation step for segmenting the signal into windows each window having a selected window width and the windows overlapping each other to a selected overlap extent.
-
-
50. A computer readable memory medium according to claim 48, wherein the filtering step performs comb filtering.
-
51. A computer readable memory medium according to claim 48, wherein said gradient determination step determines a straight line of best fit to the linear function.
-
52. A computer readable memory medium according to claim 48, wherein said percussivity determination step consolidates the percussivity for each window into a histogram.
-
53. A computer readable memory medium according to claim 38, further comprising code for a determining step for determining said associated parameters for the at least one specified piece of music if the parameters have not been specified.
-
54. A computer readable memory medium according to claim 38, wherein the code for a distance calculation step calculates corresponding distances along orthogonal axes.
-
55. A computer readable memory medium according to claim 38, wherein said code for the arranging step arranges the features in histograms wherein the histograms are representative of the features over the entire piece of music.
Specification