Extracting classifying data in music from an audio bitstream

US 7,295,977 B2
Filed: 08/27/2001
Issued: 11/13/2007
Est. Priority Date: 08/27/2001
Status: Expired due to Fees

Abstract

The method of the present invention uses machine-learning techniques, particularly Support Vector Machines in combination with a neural network, to process a machine-learning enabled representation of the audio bitstream. Using this method, a classifying machine is able to autonomously detect characteristics of a piece of music, such as the artist or genre, and classify it accordingly. The method includes transforming a digital time-domain representation of the music into a frequency-domain representation, then dividing that frequency data into time slices and compressing it into frequency bands to form multiple learning representations of each song. The resulting learning representations are processed by a group of Support Vector Machines and then by a neural network, both previously trained to distinguish among a given set of characteristics, to determine the classification.
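The transformation described in the abstract (time domain to frequency domain, then time slices, then frequency bands) can be sketched roughly as follows. This is a minimal illustration only, not the patent's implementation: the function names, the frame size of 64 samples, the 8 equal-width bands, the averaging used to compress bins into bands, and the naive DFT are all assumptions chosen to keep the example self-contained.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0 .. N/2 - 1 of one time slice."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]

def learning_representation(samples, frame_size=64, n_bands=8):
    """Cut the signal into time slices and compress each spectrum into bands.

    frame_size and n_bands are illustrative values, not taken from the patent.
    """
    bins_per_band = (frame_size // 2) // n_bands
    slices = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        spectrum = magnitude_spectrum(samples[start:start + frame_size])
        # Average the magnitudes of adjacent bins to form one value per band.
        bands = [
            sum(spectrum[b * bins_per_band:(b + 1) * bins_per_band]) / bins_per_band
            for b in range(n_bands)
        ]
        slices.append(bands)
    return slices

# Hypothetical input: a 1 kHz sine tone sampled at 8 kHz, 256 samples long.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
features = learning_representation(tone)
print(len(features), "time slices of", len(features[0]), "bands each")
```

Each inner list is one candidate learning representation for a time slice. A practical system would likely use an FFT library and overlapping, windowed frames, which this sketch omits.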

15 Claims

1. A method of extracting classifying data from an audio signal, the method comprising the steps of:
- transforming a perceptual representation of the audio signal into a learning representation of the audio signal;
- transmitting the learning representation to a multi-stage classifier, the multi-stage classifier comprising;
  - a first stage having a plurality of support vector machine classifiers, each support vector machine classifier trained to identify one out of a plurality of audio classification categories and generate a metalearner vector value reflecting how closely the audio signal conforms to the one out of the plurality of audio classification categories, and
  - a final stage having a metalearner classifier, the metalearner classifier using the generated metalearner vector to classify the audio signal into one out of the plurality of audio classification categories; and
- generating classification category information for the audio signal based on results produced by the metalearner classifier.
- Dependent claims (2, 3, 4, 5)
  - 2. The method of claim 1 wherein the final stage metalearner classifier is a neural network classifier.
  - 3. The method of claim 1 wherein said audio classification categories comprises classifications by musical artist.
  - 4. The method of claim 1 wherein the learning representation comprises dividing the perceptual representation of the audio signal into a plurality of time slices.
  - 5. The method of claim 1 wherein the learning representation comprises dividing the perceptual representation of the audio signal into a plurality of frequency bands.

6. A computer readable storage medium, storing therein a program of instructions for causing a computer to execute a process of extracting classifying data from an audio signal, the process comprising the steps of:
- processing a perceptual representation of the audio signal into a learning representation of the audio signal; and
- inputting the learning representation into a multi-stage classifier, the multi-stage classifier comprising a first stage of support vector machine classifiers and a final stage metalearner classifier, each support vector machine classifier trained to identify one out of a plurality of audio classification categories and where the support vector machine classifiers are used to generate a metalearner vector that allows the final stage metalearner classifier to classify the audio signal into one out of the plurality of audio classification categories, each support vector machine classifier outputting a value reflecting how closely the audio signal conforms to the one out of the plurality of audio classification categories, each value then used in the metalearner vector.
- Dependent claims (7, 8, 9, 10)
  - 7. The computer readable storage medium of claim 6 wherein the final stage metalearner classifier is a neural network classifier.
  - 8. The computer readable storage medium of claim 6 wherein said audio classification categories comprises classifications by musical artist.
  - 9. The computer readable storage medium of claim 6 wherein the learning representation comprises dividing the perceptual representation of the audio signal into a plurality of time slices.
  - 10. The computer readable storage medium of claim 6 wherein the learning representation comprises dividing the perceptual representation of the audio signal into a plurality of frequency bands.

11. An apparatus for classifying an audio signal comprising:
- means for processing a perceptual representation of the audio signal into a learning representation of the audio signal; and
- a multi-stage classifier, the multi-stage classifier further comprising a first stage of support vector machine classifiers and a final stage metalearner classifier, each support vector machine classifier trained to identify one out of a plurality of audio classification categories from the learning representation of the audio signal and where the support vector machine classifiers are used to generate a metalearner vector that allows the final stage metalearner classifier to classify the audio signal into one out of the plurality of audio classification categories, each support vector machine classifier outputting a value reflecting how closely the audio signal conforms to the one out of the plurality of audio classification categories, each value then used in the metalearner vector.
- Dependent claims (12, 13, 14, 15)
  - 12. The apparatus of claim 11 wherein the final stage metalearner classifier is a neural network classifier.
  - 13. The apparatus of claim 11 wherein said audio classification categories comprises classifications by musical artist.
  - 14. The apparatus of claim 11 wherein the learning representation comprises dividing the perceptual representation of the audio signal into a plurality of time slices.
  - 15. The apparatus of claim 11 wherein the learning representation comprises dividing the perceptual representation of the audio signal into a plurality of frequency bands.
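
The two-stage arrangement recited in the independent claims (one support vector machine per category feeding a metalearner) might be sketched as below. This is an illustrative toy under stated assumptions, not the patented implementation: the linear SVMs trained by subgradient descent on the hinge loss, the single-layer softmax network standing in for the neural network metalearner of claims 2, 7 and 12, the hyperparameters, all function names, and the three-band toy data are hypothetical.

```python
import math

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM via subgradient descent on the L2-regularised hinge loss.

    y holds +1 for the SVM's own category and -1 for every other category.
    """
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            violated = target * (dot(w, x) + b) < 1
            for i in range(len(w)):
                grad = lam * w[i] - (target * x[i] if violated else 0.0)
                w[i] -= lr * grad
            if violated:
                b += lr * target
    return w, b

def metalearner_vector(svms, x):
    """First stage: one decision value per category-specific SVM."""
    return [dot(w, x) + b for w, b in svms]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def train_softmax_layer(V, labels, n_classes, epochs=300, lr=0.1):
    """Assumed metalearner: a single-layer softmax network (cross-entropy loss)."""
    W = [[0.0] * len(V[0]) for _ in range(n_classes)]
    B = [0.0] * n_classes
    for _ in range(epochs):
        for v, label in zip(V, labels):
            p = softmax([dot(W[c], v) + B[c] for c in range(n_classes)])
            for c in range(n_classes):
                err = p[c] - (1.0 if c == label else 0.0)
                for i in range(len(v)):
                    W[c][i] -= lr * err * v[i]
                B[c] -= lr * err
    return W, B

def classify(svms, meta, x):
    """Final stage: map the metalearner vector to a single category index."""
    W, B = meta
    v = metalearner_vector(svms, x)
    scores = [dot(W[c], v) + B[c] for c in range(len(W))]
    return scores.index(max(scores))

# Hypothetical toy data: three categories, each dominated by one of three bands.
X = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.2], [0.1, 1.0, 0.0],
     [0.0, 0.9, 0.1], [0.1, 0.0, 1.0], [0.0, 0.2, 0.9]]
labels = [0, 0, 1, 1, 2, 2]

svms = [train_linear_svm(X, [1 if lab == c else -1 for lab in labels]) for c in range(3)]
meta = train_softmax_layer([metalearner_vector(svms, x) for x in X], labels, 3)
prediction = classify(svms, meta, [0.95, 0.05, 0.1])
print("predicted category:", prediction)
```

The sketch classifies a single feature vector. How metalearner vectors from multiple time slices of a song are combined is not stated in the claims above, so it is left out here.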

Current Assignee: NEC Corporation
Original Assignee: NEC Laboratories America Inc (NEC Corporation)
Inventors: Lawrence, Stephen R.; Flake, Gary W.; Whitman, Brian
Primary Examiner(s): Edouard, Patrick N.
Assistant Examiner(s): Wozniak, James S.

Application Number: US09/939,954
Publication Number: US 20030040904A1
Time in Patent Office: 2,269 Days
Field of Search: 742/12, 704/205, 704/207, 704/232, 704/270, 704/212, 704/245, 704/236-239, 707/6, 707/707, 846/16
US Class Current: 704/236
CPC Class Codes:
G10L 17/02   Preprocessing operations, e...
G10L 17/26   Recognition of special voic...
G10L 25/30   using neural networks
