System And Method For Automatically Remixing Digital Music

US 20130170670A1
Filed: 02/18/2011
Published: 07/04/2013
Est. Priority Date: 02/18/2010
Status: Active Grant

Abstract

Systems and methods augment a target media with a plurality of source media. The target media and source media are processed to form time frequency distributions (TFDs). Target features are extracted from the associated TFD and source features are extracted from each of the associated source TFDs. The target features are segmented into temporal portions that are compared with each of the plurality of source features to determine one or more matched source features having nearest matches to the target feature segments. Portions of the source media associated with the matched source features are mixed with the target media to form an augmented target media, wherein the mixing is based upon a probabilistic mixing algorithm that uses a distance between the matched target feature and source features to define an amplitude of each portion of the source media.

27 Claims

1. A method for generating a transformed digital media, including:
- decoding a target media file and one or more source media files into a target sample set and a plurality of source sample sets;
  
  processing the target sample set to form a target time frequency distribution (TFD);
  
  processing the source sample sets to form at least one source TFD;
  
  extracting musical features from the target TFD to generate target feature extraction (FE) data containing one or more separate target features;
  
  extracting musical features from the source TFD to generate source FE data containing one or more separate source features, and assigning each of the source features a media ID for identifying the associated source;
  
  segmenting the target FE data into temporal portions to form a plurality of target feature segments for each target feature, each of the target feature segments being time aligned to a corresponding target feature;
  
  comparing each of the target feature segments with a plurality of the source features and generating a triple for each substantial match between the target feature segment and one of the source features, the triple including (id, s, d), where id is the media ID, s is a location within the source, d is the distance of the matched source to the respective target segment; and
  
  generating the transformed media from the target sample sets and the source sample sets based upon the triples according to a probabilistic mixing algorithm based on a match distance function.
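The comparing step of claim 1 can be illustrated with a minimal sketch, which is not the patented implementation. It assumes that features are short lists of numbers per frame, that the distance is plain Euclidean, and that a match is "substantial" when the distance is at or below a hypothetical `max_dist` cutoff; the claim fixes none of these. The names `segment`, `match_triples`, `seg_len`, and `max_dist` are illustrative only.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def segment(frames, seg_len):
    """Split feature frames into consecutive, non-overlapping segments."""
    return [frames[i:i + seg_len]
            for i in range(0, len(frames) - seg_len + 1, seg_len)]

def flatten(seg):
    """Concatenate the frames of one segment into a single vector."""
    return [v for frame in seg for v in frame]

def match_triples(target_frames, sources, seg_len, max_dist):
    """Return, per target segment, a list of (id, s, d) triples.

    sources maps a media ID to that source's list of feature frames.
    s is the starting frame index of the matched window in the source.
    Assumption: a match is kept when d <= max_dist.
    """
    results = []
    for seg in segment(target_frames, seg_len):
        t = flatten(seg)
        triples = []
        for media_id, frames in sources.items():
            for s in range(len(frames) - seg_len + 1):
                d = euclidean(t, flatten(frames[s:s + seg_len]))
                if d <= max_dist:
                    triples.append((media_id, s, d))
        triples.sort(key=lambda tr: tr[2])  # nearest match first
        results.append(triples)
    return results

# Toy data: one target segment of two frames, two sources.
target = [[0.0, 1.0], [1.0, 0.0]]
sources = {
    "a": [[0.0, 1.0], [1.0, 0.0], [5.0, 5.0]],
    "b": [[0.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
}
triples = match_triples(target, sources, seg_len=2, max_dist=1.5)
```

The exhaustive scan is quadratic in the number of frames; dependent claim 9 mentions locality sensitive hashing as one way to find matches, which this sketch does not attempt.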

2. A method for generating a transformed digital media from a target media and one or more source media, comprising:
- decoding the target media and the source media;
  
  generating a first time frequency distribution for the target media and a second time frequency distribution for each of the source media;
  
  extracting a plurality of first musical features from the first time frequency distribution and a plurality of second musical features from each of the second time frequency distributions;
  
  segmenting the first musical features into a plurality of temporal segments;
  
  comparing the first musical features of each of the plurality of temporal segments with the second musical features to generate substantial matches;
  
  generating a triple for each of the substantial matches; and
  
  generating the transformed digital media by mixing the target media and portions of the source media identified by the triples.
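The claims do not say how a time frequency distribution is computed, apart from the "short Fast Fourier Transform" named in claim 26. As an illustration only, the sketch below forms a magnitude TFD with a direct short-time discrete Fourier transform and a Hann window. The frame length, hop size, window choice, and the name `stft_magnitudes` are assumptions, and a real system would use an FFT library rather than this slow direct sum.

```python
import cmath
import math

def stft_magnitudes(samples, frame_len=64, hop=32):
    """Return a list of frames; each frame holds magnitudes for bins 0..frame_len//2."""
    # Periodic Hann window to reduce spectral leakage between bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    tfd = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT sum for bin k (O(N^2) per frame, for clarity only).
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        tfd.append(mags)
    return tfd

# A pure tone that completes exactly 8 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
tfd = stft_magnitudes(tone)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in tfd]
```

Each row of `tfd` is one time frame, so features extracted per row stay time aligned with the audio, which is what the segmenting step relies on.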

3. A method for generating a transformed media by re-synthesizing one or more time frequency distribution (TFD) processed media features and combining with a target media, including:
- performing a reverse TFD process on one or more musical features extracted from at least one source media to generate a re-synthesized audio feature having a reduced amount of extraneous audio elements; and
  
  combining the re-synthesized audio feature with the target media to generate the transformed media.

4. A method for mixing at least one source media with a target media, including:
- determining matches between each of a plurality of time segmented portions of the target media and each of a plurality of time windowed portions of the source media;
  
  generating a probabilistic distribution of N closest determined said matches;
  
  generating, for each of the N closest matches, a mixing coefficient based upon the generated probabilistic distribution; and
  
  mixing the time windowed portions associated with the N closest matches with the associated time segmented portion using the associated mixing coefficients.
  - 5. The method of claim 4, wherein N is an integer between 2 and 10.
  - 6. The method of claim 5, wherein N is a user defined parameter.
  - 7. The method of claim 4, the step of determining matches comprising determining a distance between each time segmented portion and each time windowed portion using a distance function.
  - 8. The method of claim 7, wherein the distance function is a normed Euclidean distance function.
  - 9. The method of claim 7, the step of determining matches comprising using locality sensitive hashing (LSH).
  - 10. The method of claim 7, wherein the probabilistic distribution is a probabilistic distribution of the distance.
  - 11. The method of claim 7, wherein determining the closest matches is determining the 'N' closest matches.
  - 12. The method of claim 4, further comprising the step of generating a triple, (id, s, d), for each match, where id is an identifier for the source media, s is a location of the time windowed portion of the source media identified, and d is the determined distance between the time windowed portion and the time segmented portion.
  - 13. The method of claim 12, wherein each time windowed portion has a period of between one quarter of a second and one second.
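Claim 4 derives mixing coefficients from a probabilistic distribution over the N closest matches but does not give the distribution. The sketch below is one possible reading, not the patented algorithm: it assumes weights proportional to `exp(-beta * d)`, normalized over the N nearest triples, so that closer matches are mixed in louder. The parameters `beta` and `target_gain` and both function names are hypothetical.

```python
import math

def mixing_coefficients(triples, n=4, beta=2.0):
    """Pair each of the n closest (id, s, d) triples with a weight; weights sum to 1.

    Assumption: weight is proportional to exp(-beta * d).
    """
    closest = sorted(triples, key=lambda tr: tr[2])[:n]
    if not closest:
        return []
    d_min = closest[0][2]
    # Subtracting d_min leaves the normalized weights unchanged and avoids underflow.
    raw = [math.exp(-beta * (d - d_min)) for _, _, d in closest]
    total = sum(raw)
    return [(tr, w / total) for tr, w in zip(closest, raw)]

def mix(target_seg, source_windows, weights, target_gain=0.5):
    """Mix one target segment with equally long, weighted source windows."""
    out = [target_gain * x for x in target_seg]
    for window, w in zip(source_windows, weights):
        for i, x in enumerate(window):
            out[i] += (1.0 - target_gain) * w * x
    return out

coeffs = mixing_coefficients([("a", 0, 0.5), ("b", 3, 0.1), ("c", 7, 0.9)], n=2)
mixed = mix([1.0, 1.0], [[0.0, 2.0], [2.0, 0.0]], [0.5, 0.5])
```

Here `n=2` sits inside the range of 2 to 10 given in claim 5; a larger `beta` concentrates the mix on the single nearest match.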

14. A method for augmenting a target media with a plurality of source media, comprising the steps of:
- processing the target media to form a target time frequency distribution (TFD);
  
  processing each of the source media to form a plurality of source TFDs;
  
  extracting target features from the target TFD;
  
  extracting source features from each of the source TFDs;
  
  segmenting the target features into temporal portions to form a plurality of target feature segments;
  
  comparing each of the target feature segments with each of the plurality of source features to determine a matched source feature having a nearest match between each one of the source features and a respective one of the target feature segments; and
  
  mixing portions of the source media associated with said matched source features with the target media to form an augmented target media, wherein the mixing is based upon a probabilistic mixing algorithm that uses a distance between the matched target feature and source feature to define an amplitude of each said portion of the source media.
  - 15. The method of claim 14, further comprising generating, for each match, a triple having an id representing an ID of the source media, an s representing a location of the matching feature within the source media, and a d representing the distance, wherein the triple is used in the step of mixing.
  - 16. The method of claim 14, wherein the distance is determined by a distance function, the target feature and the source feature.

17. A method for augmenting a target media with a plurality of source media, comprising the steps of:
- processing the target media to form a target time frequency distribution (TFD);
  
  separating the target TFD into a plurality of components;
  
  extracting target features from each of the components;
  
  processing each of the source media to form a plurality of source TFDs;
  
  separating each of the source TFDs into a plurality of source components;
  
  extracting source features from each of the source components;
  
  segmenting the target features into temporal portions to form a plurality of target feature segments;
  
  comparing each of the target feature segments with each of the source features to determine N closest matches;
  
  mixing first audio portions of the target media corresponding to the target feature segment and second audio portions of the source media corresponding to the source segments of the N closest matches.
  - 18. The method of claim 17, the step of separating each of the source TFDs comprising processing each of the source TFDs using probabilistic latent component analysis.
  - 19. The method of claim 17, further comprising generating a sparseness for each of the source features, wherein source features having said sparseness greater than a first threshold and less than a second threshold are ignored.
  - 20. The method of claim 19, wherein said sparseness has a range of between 0 and 1 and said first threshold has a range of between 0.2 and 0.4 and said second threshold has a range of between 0.6 and 0.8.

21. A system for automatically remixing digital music, comprising:
- a time-frequency distribution analyzer for processing (a) a target media to form a target time frequency distribution and (b) a plurality of source media to form a plurality of source time frequency distributions;
  
  a feature extractor for (a) extracting target features from the target time frequency distribution and (b) extracting source features from each of the source time frequency distributions;
  
  a feature store for storing the target features and the source features;
  
  a segmenter for segmenting the target features into a plurality of temporal target segments;
  
  a matcher for matching, for each temporal target segment, a plurality of the source features nearest to the temporal target segment; and
  
  a compiler for generating a transformed digital media based upon the target media and the matched source features.
  - 22. The system of claim 21, wherein the time-frequency distribution analyzer, feature extractor, and segmenter, process the target media and each source media only once.
  - 23. The system of claim 21, the matcher, and the compiler, cooperating to generate the transformed digital media in real time.
  - 24. The system of claim 21, further comprising:
    - a media store for storing the target media and the source media; and
      
      a media detector for detecting media newly added to the store and initiating preprocessing of that media by one or more of the time-frequency distribution analyzer, the feature extractor, and the segmenter.
  - 25. The system of claim 21, further comprising a user interface for allowing the user to select one or more of the target media and the plurality of source media.

26. A method for separating a media into a set of components, comprising:
- processing the media using a short Fast Fourier Transform algorithm to form a time frequency distribution (TFD);
  
  separating the TFD into a plurality of components using a probabilistic latent component analysis algorithm;
  
  determining a sparseness of each of the components; and
  
  removing any one or more of the components that have said sparseness greater than a first threshold and less than a second threshold, the set of components comprising the remaining components.
  - 27. The method of claim 26, wherein the sparseness has a range of 0 to 1, and the first threshold is in a range of between 0.2 and 0.4 and the second threshold is in a range of between 0.6 and 0.8.
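Claims 19, 20, 26, and 27 discard components whose sparseness falls between two thresholds, without defining the sparseness measure in the claim text. The sketch below assumes Hoyer's measure, which ranges from 0 (all entries equal) to 1 (a single nonzero entry), and uses thresholds of 0.3 and 0.7, picked from inside the claimed ranges. Both the measure and the exact threshold values are assumptions made for illustration.

```python
import math

def hoyer_sparseness(x):
    """Hoyer sparseness: 0 for a flat vector, 1 for a single nonzero entry."""
    n = len(x)
    l1 = sum(abs(v) for v in x)
    l2 = math.sqrt(sum(v * v for v in x))
    if n < 2 or l2 == 0.0:
        return 0.0  # undefined cases are treated as non-sparse
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1.0)

def filter_components(components, low=0.3, high=0.7):
    """Drop components whose sparseness lies strictly between low and high."""
    return [c for c in components if not (low < hoyer_sparseness(c) < high)]

flat = [1.0, 1.0, 1.0, 1.0]    # sparseness 0.0, kept
peaky = [0.0, 0.0, 0.0, 5.0]   # sparseness 1.0, kept
middle = [2.0, 1.0, 0.0, 0.0]  # sparseness about 0.66, removed
kept = filter_components([flat, peaky, middle])
```

The filter keeps the two extremes and removes the middle band, following the "greater than a first threshold and less than a second threshold" wording of the claims.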

Current Assignee
The Trustees of Dartmouth College (Dartmouth College)
Original Assignee
The Trustees of Dartmouth College (Dartmouth College)
Inventors
Casey, Michael

Granted Patent

US 9,774,948 B2
US Class Current

381/119
CPC Class Codes

G10H 1/0025   Automatic or semi-automatic...

G10H 2210/066   for pitch analysis as part ...

G10H 2210/125   Medley, i.e. linking parts ...

G10H 2240/141   Library retrieval matching,...

G11B 27/034   on discs G11B27/036, G11B27...

G11B 27/28   by using information signal...

H04R 3/00   Circuits for transducers , ...
