System And Method For Automatically Remixing Digital Music
First Claim
1. A method for generating a transformed digital media, including:
decoding a target media file and one or more source media files into a target sample set and a plurality of source sample sets;
processing the target sample set to form a target time frequency distribution (TFD);
processing the source sample sets to form at least one source TFD;
extracting musical features from the target TFD to generate target feature extraction (FE) data containing one or more separate target features;
extracting musical features from the source TFD to generate source FE data containing one or more separate source features, and assigning each of the source features a media ID for identifying the associated source;
segmenting the target FE data into temporal portions to form a plurality of target feature segments for each target feature, each of the target feature segments being time aligned to a corresponding target feature;
comparing each of the target feature segments with a plurality of the source features and generating a triple for each substantial match between the target feature segment and one of the source features, the triple including (id, s, d), where id is the media ID, s is a location within the source, d is the distance of the matched source to the respective target segment; and
generating the transformed media from the target sample sets and the source sample sets based upon the triples according to a probabilistic mixing algorithm based on a match distance function.
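The segmenting and comparing steps of claim 1 can be illustrated with a minimal sketch. It assumes each feature is a list of per-frame vectors, that segments have a fixed length, that distance is the mean frame-wise Euclidean distance, and that a "substantial match" means a distance at or below a chosen cutoff. None of these choices is stated in the claim; the names `segment`, `match_triples`, `seg_len`, and `max_distance` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def segment(features, seg_len):
    """Split a per-frame feature sequence into consecutive, non-overlapping
    segments of seg_len frames (an assumed segmentation scheme)."""
    return [features[i:i + seg_len]
            for i in range(0, len(features) - seg_len + 1, seg_len)]


def segment_distance(seg, window):
    """Mean frame-wise distance between a target segment and a source window."""
    return sum(euclidean(a, b) for a, b in zip(seg, window)) / len(seg)


def match_triples(target_features, sources, seg_len, max_distance):
    """For each target segment, return the (id, s, d) triples of every source
    window whose distance d is at most max_distance (assumed match criterion).

    sources maps a media ID to that source's per-frame feature sequence;
    s is the frame offset of the matched window within the source.
    """
    result = {}
    for t, seg in enumerate(segment(target_features, seg_len)):
        triples = []
        for media_id, src in sources.items():
            for s in range(len(src) - seg_len + 1):
                d = segment_distance(seg, src[s:s + seg_len])
                if d <= max_distance:
                    triples.append((media_id, s, d))
        # Closest matches first, so a later mixing stage can take the top N.
        result[t] = sorted(triples, key=lambda triple: triple[2])
    return result
```

The exhaustive sliding comparison is quadratic in the amount of feature data; a practical system would likely use an index or hashing scheme, which the claim does not specify.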
Abstract
Systems and methods augment a target media with a plurality of source media. The target media and source media are processed to form time frequency distributions (TFDs). Target features are extracted from the associated TFD and source features are extracted from each of the associated source TFDs. The target features are segmented into temporal portions that are compared with each of the plurality of source features to determine one or more matched source features having nearest matches to the target feature segments. Portions of the source media associated with the matched source features are mixed with the target media to form an augmented target media, wherein the mixing is based upon a probabilistic mixing algorithm that uses a distance between the matched target feature and source features to define an amplitude of each portion of the source media.
22 Citations
27 Claims
2. A method for generating a transformed digital media from a target media and one or more source media, comprising:
decoding the target media and the source media;
generating a first time frequency distribution for the target media and a second time frequency distribution for each of the source media;
extracting a plurality of first musical features from the first time frequency distribution and a plurality of second musical features from each of the second time frequency distributions;
segmenting the first musical features into a plurality of temporal segments;
comparing the first musical features of each of the plurality of temporal segments with the second musical features to generate substantial matches;
generating a triple for each of the substantial matches; and
generating the transformed digital media by mixing the target media and portions of the source media identified by the triples.
3. A method for generating a transformed media by re-synthesizing one or more time frequency distribution (TFD) processed media features and combining with a target media, including:
performing a reverse TFD process on one or more musical features extracted from at least one source media to generate a re-synthesized audio feature having a reduced amount of extraneous audio elements; and
combining the re-synthesized audio feature with the target media to generate the transformed media.
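One way to picture the reverse TFD step of claim 3 is a short-time Fourier round trip in which only the frequency bins belonging to a feature are kept before re-synthesis. The sketch below assumes the TFD is a Hann-windowed short-time Fourier transform and that a "musical feature" can be approximated by a fixed set of bins; both are illustrative assumptions, and the helper names `stft`, `keep_bins`, and `istft` are hypothetical. A naive DFT is used so the example needs only the standard library.

```python
import cmath
import math


def hann(size):
    """Periodic Hann window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / size) for i in range(size)]


def dft(frame):
    """Naive discrete Fourier transform (O(n^2), for illustration only)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def stft(samples, size, hop):
    """Forward TFD: windowed DFT of overlapping frames."""
    window = hann(size)
    return [dft([samples[start + i] * window[i] for i in range(size)])
            for start in range(0, len(samples) - size + 1, hop)]


def keep_bins(frames, bins):
    """Zero every bin except the chosen ones and their mirror bins
    (the mirror bins keep the re-synthesized signal real-valued)."""
    kept = []
    for spec in frames:
        n = len(spec)
        kept.append([v if (k in bins or (n - k) % n in bins) else 0j
                     for k, v in enumerate(spec)])
    return kept


def istft(frames, size, hop):
    """Reverse TFD: windowed overlap-add, normalized by the summed squared window."""
    window = hann(size)
    length = hop * (len(frames) - 1) + size
    out = [0.0] * length
    norm = [0.0] * length
    for f, spec in enumerate(frames):
        frame = idft(spec)
        for i in range(size):
            out[f * hop + i] += frame[i] * window[i]
            norm[f * hop + i] += window[i] ** 2
    return [o / w if w > 1e-12 else 0.0 for o, w in zip(out, norm)]
```

Applied to a mixture of two tones, keeping only the bins around one tone and calling `istft` returns that tone with the other removed, which mirrors the "reduced amount of extraneous audio elements" language. The result could then be summed with the target samples to form the transformed media.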
4. A method for mixing at least one source media with a target media, including:
determining matches between each of a plurality of time segmented portions of the target media and each of a plurality of time windowed portions of the source media;
generating a probabilistic distribution of N closest determined said matches;
generating, for each of the N closest matches, a mixing coefficient based upon the generated probabilistic distribution; and
mixing the time windowed portions associated with the N closest matches with the associated time segmented portion using the associated mixing coefficients.
Dependent claims: 5, 6, 7, 8, 9, 10, 11, 12, 13
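The mixing steps of claim 4 can be sketched as follows. The claim does not say which probabilistic distribution is used, so this example assumes a softmax over negative distances, so that closer matches receive larger coefficients and the coefficients sum to 1. The parameter `beta` (how sharply the weights favor the closest match) and the function names are hypothetical.

```python
import math


def mixing_coefficients(distances, n, beta=1.0):
    """Select the n smallest distances and convert them into weights that
    sum to 1, using exp(-beta * d) normalized over the selection
    (an assumed choice of distribution).

    Returns a list of (match_index, coefficient) pairs, closest first.
    """
    closest = sorted(range(len(distances)), key=lambda i: distances[i])[:n]
    weights = [math.exp(-beta * distances[i]) for i in closest]
    total = sum(weights)
    return [(i, w / total) for i, w in zip(closest, weights)]


def mix(target_segment, source_windows, coefficients):
    """Add each selected source window to the target segment, scaled by its
    mixing coefficient. Windows longer than the segment are truncated."""
    out = list(target_segment)
    for index, coefficient in coefficients:
        for t, sample in enumerate(source_windows[index][:len(out)]):
            out[t] += coefficient * sample
    return out
```

Because the coefficient sets the amplitude of each source portion, a distant match contributes quietly and a near match contributes loudly; raising `beta` concentrates the mix on the single closest match.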
14. A method for augmenting a target media with a plurality of source media, comprising the steps of:
processing the target media to form a target time frequency distribution (TFD);
processing each of the source media to form a plurality of source TFDs;
extracting target features from the target TFD;
extracting source features from each of the source TFDs;
segmenting the target features into temporal portions to form a plurality of target feature segments;
comparing each of the target feature segments with each of the plurality of source features to determine a matched source feature having a nearest match between each one of the source features and a respective one of the target feature segments; and
mixing portions of the source media associated with said matched source features with the target media to form an augmented target media, wherein the mixing is based upon a probabilistic mixing algorithm that uses a distance between the matched target feature and source feature to define an amplitude of each said portion of the source media.
Dependent claims: 15, 16
17. A method for augmenting a target media with a plurality of source media, comprising the steps of:
processing the target media to form a target time frequency distribution (TFD);
separating the target TFD into a plurality of components;
extracting target features from each of the components;
processing each of the source media to form a plurality of source TFDs;
separating each of the source TFDs into a plurality of source components;
extracting source features from each of the source components;
segmenting the target features into temporal portions to form a plurality of target feature segments;
comparing each of the target feature segments with each of the source features to determine N closest matches; and
mixing first audio portions of the target media corresponding to the target feature segment and second audio portions of the source media corresponding to the source segments of the N closest matches.
Dependent claims: 18, 19, 20
21. A system for automatically remixing digital music, comprising:
a time-frequency distribution analyzer for processing (a) a target media to form a target time frequency distribution and (b) a plurality of source media to form a plurality of source time frequency distributions;
a feature extractor for (a) extracting target features from the target time frequency distribution and (b) extracting source features from each of the source time frequency distributions;
a feature store for storing the target features and the source features;
a segmenter for segmenting the target features into a plurality of temporal target segments;
a matcher for matching, for each temporal target segment, a plurality of the source features nearest to the temporal target segment; and
a compiler for generating a transformed digital media based upon the target media and the matched source features.
Dependent claims: 22, 23, 24, 25
26. A method for separating a media into a set of components, comprising:
processing the media using a short Fast Fourier Transform algorithm to form a time frequency distribution (TFD);
separating the TFD into a plurality of components using a probabilistic latent component analysis algorithm;
determining a sparseness of each of the components; and
removing any one or more of the components that have said sparseness greater than a first threshold and less than a second threshold, the set of components comprising the remaining components.
Dependent claims: 27
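The sparseness test in claim 26 can be illustrated without the decomposition step. The sketch below assumes the components have already been produced (for example by a probabilistic latent component analysis, which is not implemented here) and are given as lists of non-negative values. The claim does not define the sparseness measure, so this example assumes Hoyer's measure, which is 0 for a perfectly flat vector and 1 for a vector with a single non-zero entry. It follows the claim wording literally: components whose sparseness lies strictly between the two thresholds are removed.

```python
import math


def sparseness(values):
    """Hoyer sparseness (assumed measure): 0 for a flat vector, 1 for a
    vector with exactly one non-zero entry."""
    n = len(values)
    l1 = sum(abs(v) for v in values)
    l2 = math.sqrt(sum(v * v for v in values))
    if n < 2 or l2 == 0:
        return 0.0
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)


def filter_components(components, first_threshold, second_threshold):
    """Drop components whose sparseness is greater than first_threshold and
    less than second_threshold; return the remaining components."""
    return [c for c in components
            if not (first_threshold < sparseness(c) < second_threshold)]
```

The threshold values themselves would have to be tuned per application; nothing in the claim fixes them.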
Specification