Apparatus and method for generating an audio fingerprint and using a two-stage query
First Claim
1. A method comprising:
- transforming a sample of a recording to a time-frequency domain and storing each time-frequency pair in a matrix array;
- setting all elements of the matrix array that are below a predetermined noise floor to zero and setting all elements of the matrix array that are outside a predetermined frequency range to zero;
- detecting a plurality of local maxima for a predetermined number of time slices;
- selecting a predetermined number of largest-magnitude maxima from the plurality of local maxima detected by said detecting;
- generating one or more hash values corresponding to the predetermined number of largest-magnitude maxima;
- comparing the one or more hash values to corresponding hash values of known recordings in a first stage query to identify a set of possible matches; and
- comparing the set of possible matches to a full-recording fingerprint of the sample in a second stage query, the full-recording fingerprint being a recording fingerprint of a substantial length or the entire length of a known recording.
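The first two steps of this claim (the time-frequency transform and the zeroing of out-of-range elements) can be illustrated with a minimal sketch. It is not the patented implementation: the function names `spectrogram` and `apply_masks`, the frame size, the hop, and the threshold values are all assumptions chosen for illustration, and a naive DFT without a window function stands in for whatever transform the specification actually uses.

```python
import cmath
import math


def spectrogram(samples, frame_size, hop):
    """Return a magnitude matrix indexed as matrix[time_slice][frequency_bin].

    Assumption: a naive, unwindowed DFT per frame; a real system would
    normally use an FFT with a window function.
    """
    matrix = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        row = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            row.append(abs(acc))
        matrix.append(row)
    return matrix


def apply_masks(matrix, noise_floor, min_bin, max_bin):
    """Zero every element below noise_floor or outside [min_bin, max_bin]."""
    return [
        [
            mag if (mag >= noise_floor and min_bin <= k <= max_bin) else 0.0
            for k, mag in enumerate(row)
        ]
        for row in matrix
    ]


# Hypothetical input: a 16 Hz sine sampled at 64 Hz, which falls in bin 4
# for a 16-sample frame (bin = frequency * frame_size / sample_rate).
sample_rate = 64
samples = [math.sin(2 * math.pi * 16 * n / sample_rate) for n in range(64)]

matrix = spectrogram(samples, frame_size=16, hop=8)
masked = apply_masks(matrix, noise_floor=1.0, min_bin=1, max_bin=6)
```

After masking, only the bin carrying the tone keeps a nonzero magnitude in each time slice, which is the sparse matrix the later peak-detection step operates on.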
Abstract
An audio fingerprint is generated by transforming an audio sample of a recording to a time-frequency domain and storing each time-frequency pair in a matrix array, detecting a plurality of local maxima for a predetermined number of time slices, selecting a predetermined number of largest-magnitude maxima from the plurality of local maxima detected by said detecting, and generating one or more hash values corresponding to the predetermined number of largest-magnitude maxima.
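The peak-detection, selection, and hashing steps named in the abstract can be sketched as follows. Several details are assumptions, because the text here does not fix them: a local maximum is taken to be a nonzero element larger than both frequency neighbours within its own time slice, the selected peaks are hashed together as a single SHA-1 digest over their frequency bins and their time offsets from the first peak, and the helper names `local_maxima`, `strongest`, and `hash_peaks` are hypothetical.

```python
import hashlib


def local_maxima(matrix):
    """Return (time_slice, frequency_bin, magnitude) for each local maximum.

    Assumption: a peak is a nonzero element strictly greater than both of
    its frequency neighbours in the same time slice.
    """
    peaks = []
    for t, row in enumerate(matrix):
        for k in range(1, len(row) - 1):
            if row[k] > 0.0 and row[k] > row[k - 1] and row[k] > row[k + 1]:
                peaks.append((t, k, row[k]))
    return peaks


def strongest(peaks, count):
    """Keep the `count` largest-magnitude peaks, returned in time order."""
    top = sorted(peaks, key=lambda p: p[2], reverse=True)[:count]
    return sorted(top, key=lambda p: (p[0], p[1]))


def hash_peaks(peaks):
    """Hash the selected peaks as one value (an assumed hashing scheme).

    Times are taken relative to the first peak so that the hash does not
    depend on where the sample starts within the recording.
    """
    if not peaks:
        return None
    t0 = peaks[0][0]
    key = ";".join(f"{t - t0}:{k}" for t, k, _ in peaks)
    return hashlib.sha1(key.encode("ascii")).hexdigest()


# Hypothetical magnitude matrix: 3 time slices by 6 frequency bins.
matrix = [
    [0.0, 3.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 5.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0, 4.0, 0.0],
]

peaks = local_maxima(matrix)
top = strongest(peaks, count=3)
fingerprint_hash = hash_peaks(top)
```

Using relative rather than absolute times in the hash key is a design choice of this sketch: the same three peaks yield the same hash even if silence precedes them.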
21 Claims
1. A method comprising:
- transforming a sample of a recording to a time-frequency domain and storing each time-frequency pair in a matrix array;
- setting all elements of the matrix array that are below a predetermined noise floor to zero and setting all elements of the matrix array that are outside a predetermined frequency range to zero;
- detecting a plurality of local maxima for a predetermined number of time slices;
- selecting a predetermined number of largest-magnitude maxima from the plurality of local maxima detected by said detecting;
- generating one or more hash values corresponding to the predetermined number of largest-magnitude maxima;
- comparing the one or more hash values to corresponding hash values of known recordings in a first stage query to identify a set of possible matches; and
- comparing the set of possible matches to a full-recording fingerprint of the sample in a second stage query, the full-recording fingerprint being a recording fingerprint of a substantial length or the entire length of a known recording.
Dependent claims: 2, 3, 4, 5, 6, 7
8. An apparatus comprising:
- at least one processor configured to:
- transform a sample of a recording to a time-frequency domain and store each time-frequency pair in a matrix array;
- set all elements of the matrix array that are below a predetermined noise floor to zero and set all elements of the matrix array that are outside a predetermined frequency range to zero;
- detect a plurality of local maxima for a predetermined number of time slices;
- select a predetermined number of largest-magnitude maxima from the plurality of local maxima detected by said detection;
- generate one or more hash values corresponding to the predetermined number of largest-magnitude maxima;
- compare the one or more hash values to corresponding hash values of known recordings in a first stage query to identify a set of possible matches; and
- compare the set of possible matches to a full-recording fingerprint of the sample in a second stage query, the full-recording fingerprint being a recording fingerprint of a substantial length or the entire length of a known recording.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A non-transitory computer-readable medium having stored thereon sequences of instructions, the sequences of instructions including instructions which when executed by a computer system cause the computer system to perform:
- transforming a sample of a recording to a time-frequency domain and storing each time-frequency pair in a matrix array;
- setting all elements of the matrix array that are below a predetermined noise floor to zero and setting all elements of the matrix array that are outside a predetermined frequency range to zero;
- detecting a plurality of local maxima for a predetermined number of time slices;
- selecting a predetermined number of largest-magnitude maxima from the plurality of local maxima detected by said detecting;
- generating one or more hash values corresponding to the predetermined number of largest-magnitude maxima;
- comparing the one or more hash values to corresponding hash values of known recordings in a first stage query to identify a set of possible matches; and
- comparing the set of possible matches to a full-recording fingerprint of the sample in a second stage query, the full-recording fingerprint being a recording fingerprint of a substantial length or the entire length of a known recording.
Dependent claims: 16, 17, 18, 19, 20, 21
Specification