Extracting classifying data in music from an audio bitstream
Abstract
The method of the present invention uses machine-learning techniques, particularly Support Vector Machines in combination with a neural network, to process a unique machine-learning-enabled representation of the audio bitstream. Using this method, a classifying machine can autonomously detect characteristics of a piece of music, such as the artist or genre, and classify it accordingly. The method includes transforming a digital time-domain representation of music into a frequency-domain representation, dividing that frequency data into time slices, and compressing it into frequency bands to form multiple learning representations of each song. The resulting learning representations are processed by a group of Support Vector Machines and then by a neural network, both previously trained to distinguish among a given set of characteristics, to determine the classification.
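The transformation described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the frame length, the number of bands, the equal-width banding, the use of a per-slice transform, and all function names are assumptions chosen for illustration, and a naive DFT from the Python standard library stands in for whatever transform the specification uses.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for the non-negative frequency bins of one frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def compress_to_bands(spectrum, num_bands):
    """Average adjacent frequency bins into num_bands equal-width bands."""
    size = len(spectrum) // num_bands  # any leftover high-frequency bins are dropped
    return [sum(spectrum[b * size:(b + 1) * size]) / size for b in range(num_bands)]


def learning_representations(samples, frame_len=64, num_bands=8):
    """Cut the signal into non-overlapping time slices; return one band vector per slice."""
    starts = range(0, len(samples) - frame_len + 1, frame_len)
    return [
        compress_to_bands(magnitude_spectrum(samples[s:s + frame_len]), num_bands)
        for s in starts
    ]


# Hypothetical input: a 1000 Hz sine sampled at 8 kHz (256 samples, 4 time slices).
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(256)]
features = learning_representations(tone)
loudest_band = max(range(len(features[0])), key=lambda b: features[0][b])
```

With these assumed parameters the tone falls exactly on DFT bin 8, which the equal-width banding places in band index 2, so `loudest_band` identifies that band for every slice. A real system would more likely use an FFT and perceptually motivated band edges.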
15 Claims
1. A method of extracting classifying data from an audio signal, the method comprising the steps of:
transforming a perceptual representation of the audio signal into a learning representation of the audio signal;
transmitting the learning representation to a multi-stage classifier, the multi-stage classifier comprising:
a first stage having a plurality of support vector machine classifiers, each support vector machine classifier trained to identify one out of a plurality of audio classification categories and generate a metalearner vector value reflecting how closely the audio signal conforms to the one out of the plurality of audio classification categories, and
a final stage having a metalearner classifier, the metalearner classifier using the generated metalearner vector to classify the audio signal into one out of the plurality of audio classification categories; and
generating classification category information for the audio signal based on results produced by the metalearner classifier.
Dependent claims: 2, 3, 4, 5
6. A computer readable storage medium, storing therein a program of instructions for causing a computer to execute a process of extracting classifying data from an audio signal, the process comprising the steps of:
processing a perceptual representation of the audio signal into a learning representation of the audio signal; and
inputting the learning representation into a multi-stage classifier, the multi-stage classifier comprising a first stage of support vector machine classifiers and a final stage metalearner classifier, each support vector machine classifier trained to identify one out of a plurality of audio classification categories and where the support vector machine classifiers are used to generate a metalearner vector that allows the final stage metalearner classifier to classify the audio signal into one out of the plurality of audio classification categories, each support vector machine classifier outputting a value reflecting how closely the audio signal conforms to the one out of the plurality of audio classification categories, each value then used in the metalearner vector.
Dependent claims: 7, 8, 9, 10
11. An apparatus for classifying an audio signal comprising:
means for processing a perceptual representation of the audio signal into a learning representation of the audio signal; and
a multi-stage classifier, the multi-stage classifier further comprising a first stage of support vector machine classifiers and a final stage metalearner classifier, each support vector machine classifier trained to identify one out of a plurality of audio classification categories from the learning representation of the audio signal and where the support vector machine classifiers are used to generate a metalearner vector that allows the final stage metalearner classifier to classify the audio signal into one out of the plurality of audio classification categories, each support vector machine classifier outputting a value reflecting how closely the audio signal conforms to the one out of the plurality of audio classification categories, each value then used in the metalearner vector.
Dependent claims: 12, 13, 14, 15
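The multi-stage classifier recited in the independent claims can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: each first-stage classifier is a linear SVM trained by subgradient descent on the hinge loss, the metalearner is a single-layer softmax network standing in for the neural network mentioned in the abstract, and the two-dimensional toy data, hyperparameters, and function names are all hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train_linear_svm(xs, ys, epochs=200, lr=0.05, lam=0.01):
    """Subgradient descent on the L2-regularized hinge loss; labels ys are +1 or -1."""
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            violated = y * (dot(w, x) + b) < 1  # sample is inside the margin
            w = [wi - lr * (lam * wi - (y * xi if violated else 0.0))
                 for wi, xi in zip(w, x)]
            if violated:
                b += lr * y
    return w, b


def metalearner_vector(svms, x):
    """First stage: one score per category, reflecting how closely x conforms to it."""
    return [dot(w, x) + b for w, b in svms]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def meta_logits(meta, v):
    weights, biases = meta
    return [dot(w, v) + b for w, b in zip(weights, biases)]


def train_metalearner(vectors, labels, num_classes, epochs=300, lr=0.1):
    """Final stage (assumed form): softmax layer trained on the metalearner vectors."""
    weights = [[0.0] * len(vectors[0]) for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(epochs):
        for v, label in zip(vectors, labels):
            probs = softmax(meta_logits((weights, biases), v))
            for c in range(num_classes):
                err = probs[c] - (1.0 if c == label else 0.0)
                weights[c] = [w - lr * err * x for w, x in zip(weights[c], v)]
                biases[c] -= lr * err
    return weights, biases


def classify(svms, meta, x):
    logits = meta_logits(meta, metalearner_vector(svms, x))
    return max(range(len(logits)), key=logits.__getitem__)


# Hypothetical toy data: three well-separated clusters stand in for three categories.
rng = random.Random(0)
CENTERS = [(4.0, 0.0), (-4.0, 0.0), (0.0, 4.0)]
xs, labels = [], []
for label, (cx, cy) in enumerate(CENTERS):
    for _ in range(20):
        xs.append([cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3)])
        labels.append(label)

# One SVM per category (one-vs-rest), then a metalearner trained on their outputs.
svms = [train_linear_svm(xs, [1 if l == c else -1 for l in labels])
        for c in range(len(CENTERS))]
meta = train_metalearner([metalearner_vector(svms, x) for x in xs], labels, len(CENTERS))
predictions = [classify(svms, meta, list(c)) for c in CENTERS]
```

Here `predictions` holds the category index assigned to each cluster center. In practice the inputs would be learning representations of audio rather than toy points, and an established SVM library would replace the hand-written trainer.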
Specification