Language informed source separation
First Claim
1. A non-transitory computer-readable storage medium storing program instructions, the program instructions being computer-executable to implement:
- for a first source, generating a model for each word of a plurality of words, each model including:
  a plurality of dictionaries, each of the plurality of dictionaries including one or more spectral components; and
  probabilities of transition between the plurality of dictionaries; and
- constraining the models according to high level information that defines valid transitions, the constrained models being usable to perform source separation on a sound mixture that includes multiple sources.
Abstract
Methods and systems for non-negative hidden Markov modeling of signals are described. For example, techniques disclosed herein may be applied to signals emitted by one or more sources. The modeling may be constrained according to high level information. In some embodiments, methods and systems may enable the separation of a signal's various components. As such, the systems and methods disclosed herein may find a wide variety of applications. In audio-related fields, for example, these techniques may be useful in music recording and processing, source separation/extraction, noise reduction, teaching, automatic transcription, electronic games, audio search and retrieval, and many other applications.
20 Claims
1. A non-transitory computer-readable storage medium storing program instructions, the program instructions being computer-executable to implement:
- for a first source, generating a model for each word of a plurality of words, each model including:
  a plurality of dictionaries, each of the plurality of dictionaries including one or more spectral components; and
  probabilities of transition between the plurality of dictionaries; and
- constraining the models according to high level information that defines valid transitions, the constrained models being usable to perform source separation on a sound mixture that includes multiple sources.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A non-transitory computer-readable storage medium storing program instructions, the program instructions being computer-executable to implement:
- receiving a sound mixture including a first source and a second source;
- receiving a model including:
  a first plurality of dictionaries corresponding to a first source, the first plurality of dictionaries including multiple dictionaries for each word of a plurality of words;
  a first transition matrix corresponding to the first source, the transition matrix including probabilities of transition among the first plurality of dictionaries, at least some of the probabilities of transition are based on high level information that defines valid transitions;
  a second plurality of dictionaries corresponding to the second source, the second plurality of dictionaries including multiple other dictionaries for each word of the plurality of words; and
  a second transition matrix corresponding to the second source, the second transition matrix including probabilities of transition among the second plurality of dictionaries, at least some of the probabilities of transition in the second transition matrix being based on the high level information; and
- calculating contributions to the sound mixture from respective plurality of dictionaries for each of the first and second sources, said calculating is based on the model.
Dependent claims: 12, 13, 14, 15
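As a rough illustration of the "calculating contributions" step recited in claim 11, the sketch below estimates how much of a mixture spectrogram each source's spectral components explain. It is a deliberately simplified assumption and not the claimed method: it ignores the transition matrices entirely, stacks all of a source's dictionaries into one matrix, and fits activations with standard KL-divergence multiplicative updates while the dictionaries stay fixed. The function name, argument names, and iteration count are hypothetical.

```python
import numpy as np


def separate_with_fixed_dictionaries(V, W1, W2, n_iter=200, eps=1e-12):
    """Split a non-negative mixture spectrogram V between two sources.

    V  : (n_freq, n_frames) mixture magnitude spectrogram.
    W1 : (n_freq, k1) spectral components of source 1 (dictionaries stacked).
    W2 : (n_freq, k2) spectral components of source 2 (dictionaries stacked).
    Returns (S1, S2), per-source estimates whose sum approximates V.
    """
    W = np.hstack([W1, W2])
    k1 = W1.shape[1]

    # Random positive initial activations (fixed seed for repeatability).
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps

    # Multiplicative updates for KL divergence; only H is updated.
    col_sums = W.T @ np.ones_like(V) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / col_sums

    # Reconstruct each source and use the ratio as a soft mask on V.
    R1 = W1 @ H[:k1]
    R2 = W2 @ H[k1:]
    total = R1 + R2 + eps
    return V * R1 / total, V * R2 / total
```

Applying the reconstructions as a soft mask, rather than returning them directly, keeps the two estimates summing to the observed mixture. A model that also used the transition matrices would additionally weight each dictionary's contribution per frame by its state posterior.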
16. A method, comprising:
- for each source of a plurality of sources, generating a plurality of word level models, each word level model corresponding to a respective one word of a plurality of words, each word level model including:
  a plurality of dictionaries, each of the plurality of dictionaries including one or more spectral components, and
  probabilities of transition between the dictionaries;
- for each source, combining the word level models into a single source specific model; and
- constraining the single source specific models according to high level information that defines valid transitions, the constrained single source specific models being usable to perform source separation on a sound mixture that includes multiple sources.
Dependent claims: 17, 18, 19, 20
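One way to picture the combining and constraining steps of claim 16 is as a block transition matrix: each word's dictionary-to-dictionary transitions occupy a diagonal block, and cross-word transitions are permitted only where the high level information allows them. The sketch below is a minimal illustration under stated assumptions, since the claim does not say how the models are combined. It assumes each word model is left-to-right (its last dictionary is the exit state), and the `exit_prob` parameter, the `valid_next_words` mapping, and the function name are all hypothetical.

```python
import numpy as np


def combine_word_models(word_transitions, valid_next_words, exit_prob=0.2):
    """Build one source-level transition matrix from per-word matrices.

    word_transitions : list of (n_i, n_i) row-stochastic arrays, one per word,
        over that word's dictionaries (assumed left-to-right).
    valid_next_words : dict mapping a word index to the word indices that
        may follow it (the "valid transitions").
    exit_prob : probability mass moved from a word's last dictionary to the
        first dictionaries of its allowed followers (hypothetical parameter).
    """
    sizes = [A.shape[0] for A in word_transitions]
    starts = np.cumsum([0] + sizes[:-1])  # first state index of each word
    total = sum(sizes)
    A = np.zeros((total, total))

    for w, (s, n, Aw) in enumerate(zip(starts, sizes, word_transitions)):
        # Within-word transitions go on the diagonal block.
        A[s:s + n, s:s + n] = Aw

        followers = valid_next_words.get(w, [])
        if followers:
            # Rescale the exit row and spread exit_prob over valid followers.
            exit_row = s + n - 1
            A[exit_row] *= 1.0 - exit_prob
            for v in followers:
                A[exit_row, starts[v]] += exit_prob / len(followers)

    # Every cross-word entry not listed in valid_next_words stays zero.
    return A
```

Because invalid cross-word entries are left at exactly zero, any inference run over the combined matrix can never follow a word sequence the high level information rules out. Each row still sums to one as long as the per-word matrices are row-stochastic.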
Specification