METHOD AND APPARATUS FOR AUTOMATICALLY RECOGNIZING INPUT AUDIO AND/OR VIDEO STREAMS
Abstract
A method and system for the automatic identification of audio, video, multimedia, and/or data recordings based on immutable characteristics of these works. The invention does not require the insertion of identifying codes or signals into the recording. This allows the system to be used to identify existing recordings that have not been through a coding process at the time that they were generated. Instead, each work to be recognized is “played” into the system where it is subjected to an automatic signal analysis process that locates salient features and computes a statistical representation of these properties. These features are then stored as patterns for later recognition of live input signal streams. A different set of features is derived for each audio or video work to be identified and stored. During real-time monitoring of a signal stream, a similar automatic signal analysis process is carried out, and many features are computed for comparison with the patterns stored in a large feature database. For each particular pattern stored in the database, only the relevant characteristics are compared with the real-time feature set. Preferably, during analysis and generation of reference patterns, data are extracted from all time intervals of a recording. This allows a work to be recognized from a single sample taken from any part of the recording.
89 Citations
6 Claims
1. Apparatus for recognizing an input data stream, comprising:
a receiver for receiving the input data stream;
an interface for randomly selecting any one portion of the received data stream, and forming a first plurality of feature time series waveforms respectively corresponding to distinct portions of the received data stream;
a memory for storing a second plurality of feature time series waveforms; and
processor structure for correlating the first plurality of feature time series waveforms with the second plurality of feature time series waveforms, for designating a recognition when a feature correlation between the first plurality of feature time series waveforms and at least one of the second plurality of feature time series waveforms reaches a predetermined value, and for outputting a recognition signal after the feature correlation reaches the predetermined value.
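The correlation and threshold test recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the use of Pearson correlation, the averaging of per-feature peaks, the 0.9 threshold, and every name below (`normalized_correlation`, `peak_correlation`, `recognize`, `work_a`, `work_b`) are assumptions made only for illustration.

```python
import math
import random


def normalized_correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def peak_correlation(segment, reference):
    """Slide the segment along the reference and return the best correlation found."""
    width = len(segment)
    best = -1.0
    for start in range(len(reference) - width + 1):
        window = reference[start:start + width]
        best = max(best, normalized_correlation(segment, window))
    return best


def recognize(segment_features, stored_patterns, threshold=0.9):
    """Return the first stored pattern whose mean per-feature peak reaches the threshold.

    Assumption: the "feature correlation" is taken to be the mean of the
    per-feature peak correlations; the claim does not fix this choice.
    """
    for name, reference_features in stored_patterns.items():
        peaks = [peak_correlation(s, r)
                 for s, r in zip(segment_features, reference_features)]
        if sum(peaks) / len(peaks) >= threshold:
            return name
    return None


# Hypothetical data: two stored works, each with two feature waveforms.
rng = random.Random(0)
stored = {name: [[rng.random() for _ in range(200)] for _ in range(2)]
          for name in ("work_a", "work_b")}

# Randomly select one portion of the "received" stream (here, a slice of work_b).
start = rng.randrange(0, 160)
segment = [waveform[start:start + 40] for waveform in stored["work_b"]]
print(recognize(segment, stored))
```

Because the segment is cut at a random offset, the sketch also reflects the point that a work may be recognized from a sample taken from any part of the recording.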
2. Apparatus for forming audio features from an input audio stream, comprising:
a receiver for receiving the input audio stream and separating the received audio stream into a plurality of different frequency bands; and
processor structure for (i) extracting energy from each of the plurality of frequency bands, (ii) summing the energy extracted from each of the plurality of frequency bands, (iii) forming multiple feature time series waveforms from the summed energy, (iv) determining the information content of each feature from each of a plurality of time interval segments, (v) rank-ordering each of the features of the time interval segments according to their information content, and (vi) transforming each of the rank-ordered features of the time interval segments to produce complex spectra; and
a memory for storing the transformed complex spectra.
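Steps (ii) through (vi) of claim 2 can be sketched as follows, starting from band energies that are assumed to have been extracted already. The grouping of bands into features, the use of variance as a stand-in for "information content", and the names `feature_waveforms`, `variance`, `dft`, and `build_pattern` are all hypothetical; the claim itself does not specify them.

```python
import cmath


def feature_waveforms(band_energies, groups):
    """Sum the energy of the bands in each group, frame by frame."""
    frames = len(band_energies[0])
    return [[sum(band_energies[b][t] for b in group) for t in range(frames)]
            for group in groups]


def variance(values):
    """Population variance, used here as an assumed proxy for information content."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def dft(values):
    """Plain discrete Fourier transform returning a list of complex coefficients."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, v in enumerate(values))
            for k in range(n)]


def build_pattern(features, segment_length):
    """For each time segment, rank features by variance and keep their complex spectra."""
    pattern = []
    last_start = len(features[0]) - segment_length
    for start in range(0, last_start + 1, segment_length):
        scored = []
        for index, waveform in enumerate(features):
            piece = waveform[start:start + segment_length]
            scored.append((variance(piece), index, piece))
        # Most informative feature first; ties keep their original order.
        scored.sort(key=lambda item: item[0], reverse=True)
        pattern.append([(index, dft(piece)) for _, index, piece in scored])
    return pattern


# Hypothetical input: four bands, eight frames of energy per band.
band_energies = [
    [1, 2, 1, 2, 1, 2, 1, 2],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [5, 5, 5, 5, 5, 5, 5, 5],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
features = feature_waveforms(band_energies, groups=[(0, 1), (2, 3)])
pattern = build_pattern(features, segment_length=4)
print([index for index, _ in pattern[0]])
```

A production system would likely use a fast Fourier transform rather than the quadratic-time `dft` above; the direct form is kept only so the sketch needs nothing beyond the standard library.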
3. A method for recognizing an input data stream, comprising the steps of:
receiving the input data stream;
randomly selecting any one time interval from the received data stream;
forming a first plurality of feature time series waveforms respectively corresponding to distinct portions of the received data stream;
rank ordering features of the first plurality of waveforms according to their information content;
retrieving a second plurality of feature time series waveforms;
correlating the first plurality of feature time series waveforms with the second plurality of feature time series waveforms in an order corresponding to (i) a map of candidate patterns from the second plurality of feature time series waveforms that best match the rank ordering of the first plurality of feature time series waveforms, and (ii) the rank ordering of second plurality of feature time series waveforms; and
designating a recognition when a joint correlation probability value between the first plurality of feature time series waveforms and at least one of the second plurality of feature time series waveforms reaches a predetermined value.
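The ordering of correlations recited in claim 3 can be sketched as a scheduling problem: candidates whose stored rank ordering best matches that of the input are tried first, and within each candidate the features are visited in the candidate's own rank order. The agreement measure (a count of matching positions) and the names `rank_order`, `rank_agreement`, `candidate_map`, and `correlation_schedule` are assumptions for illustration, not terms from the patent.

```python
def rank_order(information):
    """Feature indices sorted from most to least informative."""
    return sorted(range(len(information)),
                  key=lambda i: information[i], reverse=True)


def rank_agreement(order_a, order_b):
    """Count the positions at which two rank orderings name the same feature."""
    return sum(1 for a, b in zip(order_a, order_b) if a == b)


def candidate_map(input_order, stored_orders):
    """Candidate names sorted so the closest rank orderings are tried first."""
    return sorted(stored_orders,
                  key=lambda name: rank_agreement(input_order, stored_orders[name]),
                  reverse=True)


def correlation_schedule(input_order, stored_orders):
    """(candidate, feature index) pairs in the order they would be correlated."""
    return [(name, feature)
            for name in candidate_map(input_order, stored_orders)
            for feature in stored_orders[name]]


# Hypothetical information scores for three input features, and stored orderings.
input_order = rank_order([0.2, 0.9, 0.5])
stored_orders = {"a": [0, 1, 2], "b": [1, 2, 0], "c": [1, 0, 2]}
for step in correlation_schedule(input_order, stored_orders):
    print(step)
```

Ordering the work this way lets a search stop early once a candidate's correlations are strong enough, so the most expensive step runs on the most promising patterns first.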
4. A method for forming audio features from an audio stream, comprising the steps of:
receiving the input audio stream and separating the received audio stream into a plurality of different frequency bands;
extracting energy from the plurality of frequency bands;
summing the energy extracted from each of the plurality of frequency bands;
forming multiple feature waveforms from the summed energy;
determining the most distinctive information from each of a plurality of time interval segments;
rank-ordering features of the time interval segments according to their distinctiveness; and
storing data corresponding to the rank-ordered features.
5. A computer readable storage medium for storing a program which causes one or more computers to recognize an input data stream, the stored program causing the one or more computers to:
receive the input data stream;
randomly select any time interval of the received data stream;
form a first plurality of feature time series waveforms from the received data stream which respectively correspond to spectrally distinct portions of the received data stream;
store a second plurality of feature time series waveforms;
correlate the first plurality of feature time series waveforms with the second plurality of feature time series waveforms in an order corresponding to (i) a map of candidate patterns from the second plurality of feature time series waveforms that best match the rank ordering of the first plurality of feature time series waveforms and (ii) the rank ordering of second plurality of feature time series waveforms; and
designate a recognition when a joint correlation probability value between the first plurality of feature time series waveforms and at least one of the second plurality of feature time series waveforms reaches a predetermined value.
6. A method of using recognition features from an input data stream to achieve automatic signal identification, comprising the steps of:
receiving the input data stream;
forming a plurality of time series waveforms which correspond to all features of the received input data stream;
forming multiple feature streams from the plurality of feature time series waveforms;
correlating the most distinctive feature of plural stored candidate patterns with the multiple feature streams formed from the unknown input data stream in an order corresponding to a map of candidate patterns that best match the rank ordering of the plurality of feature time series waveforms; and
designating recognition of the input data stream when a joint probability of correlations between the input data stream and the stored candidate patterns indicates that random detection is not probable.
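One way to read the final step of claim 6 is as a joint false-alarm test: recognition is declared only when the combined probability that unrelated signals would produce the observed correlations is very small. The sketch below makes two assumptions that the claim does not state: the correlation of unrelated signals over N samples is roughly Gaussian with standard deviation 1/sqrt(N), and the features are independent, so their probabilities multiply. The names and the 1e-6 limit are hypothetical.

```python
import math


def false_alarm_probability(correlation, samples):
    """Chance that unrelated noise yields at least this correlation.

    Assumes a Gaussian null distribution with standard deviation 1/sqrt(samples).
    """
    z = correlation * math.sqrt(samples)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def joint_false_alarm(correlations, samples):
    """Product of per-feature false-alarm probabilities (independence assumed)."""
    probability = 1.0
    for r in correlations:
        probability *= false_alarm_probability(r, samples)
    return probability


def is_recognized(correlations, samples, max_false_alarm=1e-6):
    """Designate recognition when a random match is sufficiently improbable."""
    return joint_false_alarm(correlations, samples) <= max_false_alarm


# Hypothetical peak correlations for three features, each measured over 64 samples.
print(is_recognized([0.5, 0.4, 0.45], samples=64))
print(is_recognized([0.05, 0.1, 0.0], samples=64))
```

Combining several moderate correlations in this way can justify a recognition that no single feature would support on its own, which is the practical appeal of a joint probability over a per-feature threshold.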
Specification