System and methods for providing automatic classification of media entities according to consonance properties

US 7,756,874 B2
Filed: 11/12/2004
Issued: 07/13/2010
Est. Priority Date: 07/06/2000
Status: Expired due to Fees

Abstract

In connection with a classification system for classifying media entities that merges perceptual classification techniques and digital signal processing classification techniques for improved classification of media entities, a system and methods are provided for automatically classifying and characterizing musical consonance properties of media entities. Such a system and methods may be useful for the indexing of a database or other storage collection of media entities, such as media entities that are audio files, or have portions that are audio files. The methods also help to determine media entities that have similar consonance by utilizing classification chain techniques that test distances between media entities in terms of their properties. For example, a neighborhood of songs may be determined within which each song has a similar consonance.
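The neighborhood idea in the last sentence of the abstract can be pictured as a radius query in feature space. The following is a minimal sketch, assuming each song has already been reduced to a numeric feature vector and that Euclidean distance is the metric. The song names, feature values, and the `radius` parameter are hypothetical and are not taken from the patent.

```python
import math

def neighborhood(query, vectors, radius):
    """Return the names of songs whose feature vector lies within `radius` of `query`."""
    return sorted(
        name for name, vec in vectors.items()
        if math.dist(query, vec) <= radius
    )

# Hypothetical two-dimensional consonance features per song.
songs = {
    "song_a": (0.10, 0.80),
    "song_b": (0.12, 0.78),
    "song_c": (0.90, 0.20),
}

# Songs close to song_a in feature space form its neighborhood.
print(neighborhood(songs["song_a"], songs, radius=0.05))
```

Here `song_a` and `song_b` fall in the same neighborhood, while `song_c` is too far away to be included.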

8 Claims

1. A method of classifying data according to consonance of the data, the method comprising:
- determining an initial classification for a data set and assigning each media entity of a plurality of media entities in the data set to at least one consonance class comprising a perceived harmony or agreement of media entities as identified by a trained human classifier based on at least one of a song-level attribute or a voice-level attribute as defined by a human user;
  
  processing each media entity of said data set to extract at least one consonance characteristic based on digital signal processing of each media entity, wherein said at least one consonance characteristic relates to a correspondence or a recurrence of sounds in each of said plurality of media entities;
  
  generating a plurality of consonance vectors for said plurality of media entities, wherein each consonance vector includes (1) said at least one consonance class based on the at least one of the song-level attribute or the voice-level attribute as identified by the trained human classifier and (2) said at least one consonance characteristic based on said digital signal processing, and wherein said consonance vectors include a mean energy of a ratio between peaks for all frames in said plurality of media entities and wherein each consonance vector contains the consonance characteristic and the consonance class attributes assigned to the media entity being classified;
  
  forming a classification chain based upon said plurality of consonance vectors;
  
  creating a simple rule when a plurality of classification chains are created that each meet a certain criteria;
  
  testing the simple rule against a pre-defined set of identified media entities to create a general rule which is subjected to analysis by a trained human classifier to determine a classification accuracy of the general rule;
  
  utilizing feedback from the trained human classifier regarding the classification accuracy of the general rule to identify at least one consonance class to create a relational rule; and
  
  storing each created relational rule in a computer memory for later retrieval and use in classification actions.
  - 2. The method of claim 1, further comprising:
    - processing an unclassified media entity to extract at least one consonance characteristic based on digital signal processing of the unclassified media entity;
      
      generating a vector for the unclassified media entity including said at least one digital signal processing consonance characteristic;
      
      presenting the vector for the unclassified media entity to the classification chain; and
      
      classifying the unclassified entry with an estimate of the consonance class by calculating a representative consonance class of a subset of the plurality of vectors of the classification chain located in a neighborhood of the vector for the unclassified entity.
  - 3. The method of claim 2, further including calculating a neighborhood distance that defines a distance within which two vectors in a classification chain space are determined to be in a same neighborhood.
  - 4. The method of claim 2, wherein said classifying of the unclassified entry includes classifying the unclassified entry with a median consonance class represented by the neighborhood.
  - 5. The method of claim 2, wherein said consonance class is described by a numerical value and said classifying of the unclassified entry includes classifying the unclassified entry with a mean of numerical consonance values found in the neighborhood.
  - 6. The method of claim 2, wherein said classifying includes determining at least one number indicating a level of confidence of the consonance class estimate and storing said at least one number.
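Claims 2 through 6 describe classifying a new entity from the labeled vectors in its neighborhood. The following is a rough sketch of one way that could look, assuming numeric consonance classes, Euclidean distance, and a classification chain held as a plain list of (features, class) pairs. The claims do not specify how the confidence number is computed, so the measure used here (the fraction of neighbors within one class step of the estimate) is purely an illustrative assumption, as are all names and values.

```python
import math
from statistics import mean, median

def classify(query, chain, neighborhood_distance, use_median=True):
    """Estimate a consonance class for `query` from nearby labeled vectors.

    `chain` is a list of (features, consonance_class) pairs with numeric classes.
    Returns (estimate, confidence), or (None, 0.0) if the neighborhood is empty.
    """
    labels = [
        label for features, label in chain
        if math.dist(query, features) <= neighborhood_distance
    ]
    if not labels:
        return None, 0.0
    # Claim 4 uses the median of the neighborhood, claim 5 the mean.
    estimate = median(labels) if use_median else mean(labels)
    # Assumed confidence measure: share of neighbors within 1 class step of the estimate.
    confidence = sum(abs(label - estimate) <= 1 for label in labels) / len(labels)
    return estimate, confidence

# Hypothetical labeled vectors: (features, human-assigned consonance class).
chain = [
    ((0.10, 0.80), 4),
    ((0.12, 0.78), 5),
    ((0.11, 0.82), 4),
    ((0.90, 0.20), 1),
]

print(classify((0.11, 0.80), chain, neighborhood_distance=0.05))
```

Three of the four labeled vectors fall inside the neighborhood, so the median estimate is driven by the classes 4, 5, and 4. Passing `use_median=False` switches to the mean of the same neighbors.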

7. A computer readable storage medium bearing computer executable instructions implemented on a computer, the computer readable storage medium comprising:
- an instruction that determines an initial classification and assigns to each media entity of a plurality of media entities in a data set to at least one consonance class comprising a perceived harmony or agreement of media entities as identified by a trained human classifier based on at least one of a song-level attribute or a voice-level attribute defined by a human user;
  
  an instruction that processes each media entity of said data set to extract at least one consonance characteristic based on digital signal processing of each media entity, wherein said at least one consonance characteristic relates to a correspondence or recurrence of sounds in each of said plurality of media entities;
  
  an instruction that generates a plurality of consonance vectors for said plurality of media entities, wherein each consonance vector includes (1) said at least one consonance class based on the at least one of the song-level attribute or the voice-level attribute as identified by the trained human classifier and (2) said at least one consonance characteristic based on said digital signal processing, and wherein said consonance vectors include a mean energy of a ratio between peaks for all frames in said plurality of media entities and wherein each consonance vector contains the consonance characteristic and the consonance class attributes assigned to the media entity being classified;
  
  an instruction that forms a classification chain based upon said plurality of consonance vectors;
  
  an instruction that creates a simple rule when a plurality of classification chains are created that each meet a certain criteria;
  
  an instruction that tests the simple rule against a pre-defined set of identified media entities to create a general rule which is subjected to analysis by a trained human classifier to determine a classification accuracy of the general rule;
  
  an instruction to utilize feedback from the trained human classifier regarding the classification accuracy of the general rule to identify at least one consonance class to create a relational rule; and
  
  an instruction that stores each created relational rule in a computer memory.
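The claims mention "a mean energy of a ratio between peaks for all frames" without defining the computation. The sketch below shows one possible reading, offered only as an assumption: split the signal into frames, take a magnitude spectrum per frame, compute the energy ratio of the second-strongest to the strongest spectral peak, and average that ratio over all frames. The function names, the frame size, and the peak-picking rule are hypothetical, and a naive DFT is used so the example needs only the standard library.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 1 .. N/2 - 1 (the DC bin is skipped)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(1, n // 2)
    ]

def peak_ratio(frame):
    """Energy ratio of the second-strongest to the strongest spectral peak."""
    mags = magnitude_spectrum(frame)
    # A local peak is a bin strictly larger than both of its neighbours.
    peaks = sorted(
        (mags[i] ** 2 for i in range(1, len(mags) - 1)
         if mags[i] > mags[i - 1] and mags[i] > mags[i + 1]),
        reverse=True,
    )
    if len(peaks) < 2 or peaks[0] == 0:
        return 0.0
    return peaks[1] / peaks[0]

def mean_peak_ratio(samples, frame_size=64):
    """Average the per-frame peak ratio over all complete, non-overlapping frames."""
    frames = [
        samples[i:i + frame_size]
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    if not frames:
        return 0.0
    return sum(peak_ratio(f) for f in frames) / len(frames)

# Synthetic test signal: two sinusoids with amplitudes 1.0 and 0.5.
n = 64
signal = [
    math.sin(2 * math.pi * 4 * t / n) + 0.5 * math.sin(2 * math.pi * 12 * t / n)
    for t in range(2 * n)
]
print(round(mean_peak_ratio(signal), 3))
```

For this synthetic signal the second peak has half the amplitude of the first, so the energy ratio should come out close to 0.25. A production system would more likely use an FFT library and windowed, overlapping frames.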

8. At least one computing device comprising:
- means for determining an initial classification and assigning to each media entity of a plurality of media entities in a data set to at least one consonance class comprising a perceived harmony or agreement of media entities as identified by a trained human classifier based on at least one of a song-level attribute or a voice-level attribute as defined by a human user;
  
  means for processing each media entity of said data set to extract at least one consonance characteristic based on digital signal processing of each media entity, wherein said at least one consonance characteristic relates to a correspondence or a recurrence of sounds in each of said plurality of media entities;
  
  means for generating a plurality of consonance vectors for said plurality of media entities, wherein each consonance vector includes (1) said at least one consonance class based on the at least one of the song-level attribute or the voice-level attribute as identified by the trained human classifier and (2) said at least one consonance characteristic based on digital signal processing, and wherein said consonance vectors include a mean energy of a ratio between peaks for all frames in said plurality of media entities and wherein each consonance vector contains the consonance characteristic and the consonance class attributes assigned to the media entity being classified;
  
  means for forming a classification chain based upon said plurality of consonance vectors;
  
  means for creating a simple rule when a plurality of classification chains are created that each meet a certain criteria;
  
  means for testing the simple rule against a pre-defined set of identified media entities to create a general rule which is subjected to analysis by a trained human classifier to determine a classification accuracy of the general rule;
  
  means for utilizing feedback from the trained human classifier regarding the classification accuracy of the general rule to identify at least one consonance class to create a relational rule; and
  
  means for storing each created relational rule in a computer memory.
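The independent claims describe a progression from a simple rule, to a general rule tested against pre-identified media entities, to a relational rule created with feedback from a trained human classifier and then stored. The claims leave the criteria and the form of the rules open, so the sketch below is only one hypothetical way to model that progression. The `Rule` type, the accuracy threshold, the `reviewer` callback standing in for the human classifier, and the in-memory `rule_store` are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Sequence, Tuple

Labeled = List[Tuple[Sequence[float], int]]

@dataclass(frozen=True)
class Rule:
    name: str
    predict: Callable[[Sequence[float]], int]
    stage: str = "simple"  # "simple" -> "general" -> "relational"

def accuracy(rule: Rule, labeled: Labeled) -> float:
    """Fraction of pre-identified entities that the rule classifies correctly."""
    hits = sum(rule.predict(features) == label for features, label in labeled)
    return hits / len(labeled)

def promote(rule, labeled, reviewer, min_accuracy=0.75):
    """Test a simple rule, then ask the reviewer whether to accept the general rule."""
    acc = accuracy(rule, labeled)
    if acc < min_accuracy:
        return rule, acc  # stays a simple rule
    general = replace(rule, stage="general")
    if reviewer(general, acc):  # stand-in for trained human feedback
        return replace(general, stage="relational"), acc
    return general, acc

# In-memory store for accepted relational rules.
rule_store: Dict[str, Rule] = {}

# Hypothetical pre-identified set: (features, human-assigned consonance class).
labeled_set = [((0.9,), 5), ((0.8,), 5), ((0.1,), 1), ((0.2,), 1), ((0.6,), 1)]
candidate = Rule("high_peak_ratio", lambda f: 5 if f[0] > 0.5 else 1)

final_rule, acc = promote(candidate, labeled_set, reviewer=lambda rule, acc: True)
if final_rule.stage == "relational":
    rule_store[final_rule.name] = final_rule
print(final_rule.stage, acc)
```

Keeping the stage on the rule itself makes it easy to see how far a candidate has progressed, and only rules that pass both the automated test and the reviewer end up in the store.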

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Hoekman, Jeffrey S.; Weare, Christopher B.
Primary Examiner(s): Jalil; Neveen Abel
Assistant Examiner(s): Pulliam; Christyann R
Application Number: US10/986,975
Publication Number: US 20050097075A1
Time in Patent Office: 2,069 Days
Field of Search: 707/737
US Class Current: 707/737
CPC Class Codes:
G06F 16/48   Retrieval characterised by ...
G06F 16/634   Query by example, e.g. quer...
G06F 16/635   Filtering based on addition...
G06F 16/639   using playlists
G06F 16/683   using metadata automaticall...
G10L 17/26   Recognition of special voic...
G10L 25/48   specially adapted for parti...
G11B 27/105   of operating discs
G11B 27/28   by using information signal...
Y10S 707/99931   Database or file accessing
Y10S 707/99932   Access augmentation or opti...
Y10S 707/99934   Query formulation, input pr...