System and method for distinguishing source from unconstrained acoustic signals emitted thereby in context agnostic manner
2 Assignments
0 Petitions
Abstract
A system and method are provided for distinguishing between a plurality of sources based upon unconstrained acoustic signals captured therefrom. A spectrographic transformation is applied to time-captured segments of acoustic signals to generate a spectral vector for each. A selectively executed sparse decomposition includes in a training system mode simultaneous sparse approximation upon a joint corpus of spectral vectors for a plurality of acoustic signal segments from distinct sources. At least one sparse decomposition is executed for each spectral vector in terms of a representative set of decomposition atoms. Discriminant reduction executes during the training system mode to down-select from the representative set an optimal combination of atoms for cooperatively distinguishing acoustic signals emitted by different distinct sources. Classification is subsequently executed upon the sparse decomposition of an input acoustic signal segment unit to discover a degree of correlation for the input acoustic signal segment relative to each distinct source.
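The training and classification flow summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a Hann-windowed FFT as the spectrographic transformation, a hypothetical dictionary of Gaussian bumps over frequency bins, simultaneous orthogonal matching pursuit (S-OMP) standing in for the simultaneous sparse approximation, and cosine similarity to per-source mean coefficients as the degree of similarity. Discriminant reduction is omitted here, and all function names and parameter values are illustrative.

```python
import numpy as np

def spectral_vectors(signal, frame_len=256, hop=128):
    """Return one magnitude spectrum per Hann-windowed frame (rows = frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def bump_dictionary(dim, width=1.5):
    """Hypothetical dictionary: unit-norm Gaussian bumps, one per frequency bin."""
    grid = np.arange(dim)[:, None]
    centers = np.arange(dim)[None, :]
    atoms = np.exp(-0.5 * ((grid - centers) / width) ** 2)
    return atoms / np.linalg.norm(atoms, axis=0)

def simultaneous_omp(Y, D, n_atoms):
    """S-OMP: greedily pick atoms shared by every column of the joint corpus Y.

    Y is (dim, n_signals); D is (dim, n_dict) with unit-norm columns.
    Returns the chosen atom indices and their coefficients (n_atoms, n_signals).
    """
    residual = Y.copy()
    chosen = []
    for _ in range(n_atoms):
        # Score each atom by its summed correlation with all residuals.
        scores = np.sum(np.abs(D.T @ residual), axis=1)
        if chosen:
            scores[chosen] = -np.inf  # never pick the same atom twice
        chosen.append(int(np.argmax(scores)))
        coeffs, *_ = np.linalg.lstsq(D[:, chosen], Y, rcond=None)
        residual = Y - D[:, chosen] @ coeffs
    return chosen, coeffs

def train(corpus_by_source, D, n_atoms=4):
    """corpus_by_source maps a source label to (n_frames, dim) spectral vectors."""
    joint = np.vstack(list(corpus_by_source.values())).T
    chosen, _ = simultaneous_omp(joint, D, n_atoms)
    sub = D[:, chosen]  # representative set of atoms
    centroids = {}
    for label, vecs in corpus_by_source.items():
        coeffs, *_ = np.linalg.lstsq(sub, vecs.T, rcond=None)
        centroids[label] = coeffs.mean(axis=1)
    return sub, centroids

def classify(vec, sub, centroids):
    """Project one spectral vector onto the atoms and pick the most similar source."""
    coeffs, *_ = np.linalg.lstsq(sub, vec, rcond=None)

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    scores = {label: cosine(coeffs, c) for label, c in centroids.items()}
    return max(scores, key=scores.get), scores
```

In this sketch, `train` would be called once on labeled segments from each source, and `classify` would then be called per input segment. Least-squares projection onto the shared atoms plays the role of the coefficient weighted sum described in the abstract.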
57 Citations
32 Claims
1. A system for distinguishing between a plurality of acoustic sources generating sound in the form of unconstrained acoustic signals captured therefrom, comprising:
a transformation unit applying a spectrographic transformation upon each time-captured segment of unconstrained acoustic signal generated by one of a plurality of distinct acoustic sources, said transformation unit generating a spectral vector for each said segment;
a sparse decomposition unit coupled to said transformation unit, said sparse decomposition unit selectively executing in at least a training system mode a simultaneous sparse approximation upon a joint corpus of spectral vectors for a plurality of unconstrained acoustic signal segments from at least a subset of the distinct acoustic sources, at least one of said spectral vectors generated by the spectrographic transformation, said sparse decomposition unit generating at least one sparse decomposition defined in a multi-dimensional space for each said spectral vector in terms of a representative set of decomposition atoms;
a discriminant reduction unit coupled to said sparse decomposition unit, said discriminant reduction unit being executable during the training system mode to down-select from said representative set of decomposition atoms an optimal combination of atoms for cooperatively distinguishing acoustic signals emitted by different ones of the distinct acoustic sources; and
a classification unit coupled to said sparse decomposition unit, said classification unit being executable in a classification system mode to:
project a spectral vector of an input acoustic signal segment onto said multi-dimensional space to generate a sparse decomposition therefor as a coefficient weighted sum of said representative set of decomposition atoms,
discover for said sparse decomposition of an input acoustic signal segment a degree of similarity relative to each of the distinct acoustic sources, and
determine one of the distinct acoustic sources to have generated the input acoustic signal segment as sound, according to the degree of similarity.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
13. A method for distinguishing between a plurality of acoustic sources generating sound in the form of unconstrained acoustic signals captured therefrom, comprising the steps of:
applying a spectrographic transformation upon a plurality of time-captured segments of unconstrained acoustic signals to generate a spectral vector for each said segment, said unconstrained acoustic signals generated by one of a plurality of distinct acoustic sources;
selectively executing in a processor a sparse decomposition of each said spectral vector, said sparse decomposition including in a training system mode a simultaneous sparse approximation upon a joint corpus of spectral vectors for a plurality of unconstrained acoustic signal segments from at least a subset of the distinct acoustic sources, at least one of said spectral vectors generated by the spectrographic transformation, executing at least one sparse decomposition defined in a multi-dimensional space for each said spectral vector in terms of a representative set of decomposition atoms;
executing discriminant reduction in a processor during the training system mode to down-select from said representative set of decomposition atoms an optimal combination of atoms for cooperatively distinguishing acoustic signals emitted by different ones of the distinct acoustic sources; and
executing classification upon said sparse decomposition of an input acoustic signal segment during a classification system mode, said classification including executing a processor to:
project a spectral vector of said input acoustic signal segment onto said multi-dimensional space to generate a sparse decomposition therefor as a coefficient weighted sum of said representative set of decomposition atoms,
discover a degree of similarity for said input acoustic signal segment relative to each of the distinct acoustic sources, and
determine one of the distinct acoustic sources to have generated the input acoustic signal segment as sound, according to the degree of similarity.
View Dependent Claims (14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25)
26. A system for distinguishing a source from unconstrained acoustic signals captured thereby in context-agnostic manner, comprising:
(a) a transformation unit applying a Short-Time-Fourier-Transform (STFT) process upon each time-captured segment of unconstrained acoustic signal generated by one of a plurality of distinct acoustic sources, said transformation unit generating a spectral vector defined in a time-frequency plane for each said segment;
(b) a training unit coupled to said transformation unit, said training unit including:
a cepstral decomposition portion executing a simultaneous sparse approximation upon a joint corpus of spectral vectors for a plurality of unconstrained acoustic signal segments from at least a subset of the distinct acoustic sources, at least one of said spectral vectors generated by the STFT process, said simultaneous sparse approximation including a greedy adaptive decomposition (GAD) process referencing a Gabor dictionary, said cepstral decomposition portion generating for each said spectral vector in said joint corpus at least one cepstral decomposition defined on a cepstral-frequency plane as a coefficient weighted sum of a representative set of decomposition atoms; and
a discriminant reduction portion coupled to said cepstral decomposition portion, said discriminant reduction portion being executable to down-select from said representative set of decomposition atoms an optimal combination of atoms for cooperatively distinguishing acoustic signals emitted by different ones of the distinct acoustic sources;
(c) a classification unit coupled to said transformation unit, said classification unit including:
a cepstral projection portion projecting a spectral vector of an input acoustic signal segment onto said cepstral-frequency plane to generate a cepstral decomposition therefor as a coefficient weighted sum of said representative set of decomposition atoms; and
a classification decision portion coupled to said cepstral projection portion, said classification decision portion being executable to discover for said cepstral decomposition of said input acoustic signal segment a degree of similarity relative to each of the distinct acoustic sources, and to thereby determine one of the distinct acoustic sources to have generated the input acoustic signal segment as sound, according to the degree of similarity.
View Dependent Claims (27, 28, 29)
30. A system for distinguishing between a plurality of acoustic sources generating sound in the form of unconstrained acoustic signals captured therefrom, comprising:
a transformation unit applying a spectrographic transformation upon each time-captured segment of unconstrained acoustic signal generated by one of a plurality of distinct acoustic sources, said transformation unit generating a spectral vector for each said segment;
a sparse decomposition unit coupled to said transformation unit, said sparse decomposition unit selectively executing in at least a training system mode a simultaneous sparse approximation upon a joint corpus of spectral vectors for a plurality of unconstrained acoustic signal segments from at least a subset of the distinct acoustic sources, at least one of said spectral vectors generated by the spectrographic transformation, said sparse decomposition unit generating at least one sparse decomposition defined on a two-dimensional plane for each said spectral vector in terms of a representative set of decomposition atoms;
a discriminant reduction unit coupled to said sparse decomposition unit, said discriminant reduction unit being executable during the training system mode to down-select from said representative set of decomposition atoms an optimal combination of atoms for cooperatively distinguishing acoustic signals emitted by different ones of the distinct acoustic sources based on characteristics of the acoustic signals independent of contextually-determined data content; and
a classification unit coupled to said sparse decomposition unit, said classification unit being executable in a classification system mode to:
project a spectral vector of an input acoustic signal segment onto said two-dimensional plane to generate a sparse decomposition therefor as a coefficient weighted sum of said representative set of decomposition atoms,
discover for said sparse decomposition of an input acoustic signal segment a degree of similarity between the representative set of decomposition atoms of the signal segment and the optimal combination of atoms of each of the distinct acoustic sources, and
determine one of the distinct acoustic sources to have generated the input acoustic signal segment as sound, according to the degree of similarity.
View Dependent Claims (31, 32)
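The discriminant reduction recited in the independent claims above, which down-selects atoms that help distinguish sources, can be sketched as follows. This is only an illustration under stated assumptions: the claims do not say how the optimal combination is found, so the sketch ranks atoms individually by a Fisher ratio (between-source variance over within-source variance of each atom's coefficients). That is a simpler stand-in than a search over atom combinations. The names `fisher_scores` and `down_select` are hypothetical.

```python
import numpy as np

def fisher_scores(coeffs, labels):
    """Per-atom Fisher ratio.

    coeffs is (n_segments, n_atoms): sparse-decomposition coefficients.
    labels is (n_segments,): the source that produced each segment.
    """
    labels = np.asarray(labels)
    overall_mean = coeffs.mean(axis=0)
    between = np.zeros(coeffs.shape[1])
    within = np.zeros(coeffs.shape[1])
    for label in np.unique(labels):
        group = coeffs[labels == label]
        group_mean = group.mean(axis=0)
        # Spread of the source means around the overall mean.
        between += len(group) * (group_mean - overall_mean) ** 2
        # Spread of each source's segments around its own mean.
        within += ((group - group_mean) ** 2).sum(axis=0)
    return between / (within + 1e-12)

def down_select(coeffs, labels, keep):
    """Return the indices of the `keep` atoms with the highest Fisher ratio."""
    scores = fisher_scores(coeffs, labels)
    return np.argsort(scores)[::-1][:keep]
```

Ranking atoms one at a time ignores redundancy between atoms. A closer analogue of a cooperative selection would score subsets jointly, at a higher computational cost.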
Specification