Generating and matching hashes of multimedia content
Abstract
Hashes are short summaries or signatures of data files which can be used to identify the file. Hashing multimedia content (audio, video, images) is difficult because the hash of original content and processed (e.g. compressed) content may differ significantly.
The disclosed method generates robust hashes for multimedia content, for example, audio clips. The audio clip is divided (12) into successive (preferably overlapping) frames. For each frame, the frequency spectrum is divided (15) into bands. A robust property of each band (e.g. energy) is computed (16) and represented (17) by a respective hash bit. An audio clip is thus represented by a concatenation of binary hash words, one for each frame. To identify a possibly compressed audio signal, a block of hash words derived therefrom is matched by a computer (20) with a large database (21). Such matching strategies are also disclosed. In an advantageous embodiment, the extraction process also provides information (19) as to which of the hash bits are the least reliable. Flipping these bits considerably improves the speed and performance of the matching process.
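The extraction steps in the abstract (framing, band splitting, one bit per band, reliability information) can be sketched roughly as follows. This is a minimal illustration, not the patented procedure: the abstract does not say how a band energy becomes a bit, so the rule used here (bit i is 1 when band i holds more energy than band i+1, with the size of the difference taken as a confidence measure) is an assumption, as are the names `band_energies`, `hash_frame`, `hash_clip` and all parameter defaults.

```python
import cmath
import math

def band_energies(frame, n_bands):
    """Energy in n_bands equal-width bands of the frame's power spectrum (naive DFT)."""
    n = len(frame)
    power = []
    for k in range(1, n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        power.append(abs(s) ** 2)
    width = len(power) // n_bands  # assumes the frame is long enough that width >= 1
    return [sum(power[b * width:(b + 1) * width]) for b in range(n_bands)]

def hash_frame(frame, n_bits=8, n_unreliable=2):
    """Return (hash_word, reliable_mask) for one frame.

    Assumed bit rule: bit i is 1 when band i has more energy than band i+1.
    The n_unreliable bits with the smallest energy margin are cleared in the mask.
    """
    e = band_energies(frame, n_bits + 1)
    diffs = [e[i] - e[i + 1] for i in range(n_bits)]
    word = 0
    for i, d in enumerate(diffs):
        if d > 0:
            word |= 1 << i
    weakest = sorted(range(n_bits), key=lambda i: abs(diffs[i]))[:n_unreliable]
    mask = (1 << n_bits) - 1
    for i in weakest:
        mask &= ~(1 << i)
    return word, mask

def hash_clip(samples, frame_len=64, hop=16, n_bits=8, n_unreliable=2):
    """Hash overlapping frames; returns one (hash_word, reliable_mask) pair per frame."""
    return [hash_frame(samples[s:s + frame_len], n_bits, n_unreliable)
            for s in range(0, len(samples) - frame_len + 1, hop)]

# Example: a short synthetic two-tone signal.
samples = [math.sin(2 * math.pi * 5 * t / 64) + 0.5 * math.sin(2 * math.pi * 11 * t / 64)
           for t in range(256)]
hashes = hash_clip(samples)
```

With `hop` smaller than `frame_len` the frames overlap, as the abstract prefers. The mask carries the reliability information: a 0 bit marks a hash bit that a matcher may ignore or flip.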
28 Claims
1. A method to match a hash value representing an unidentified information signal with a plurality of hash values stored in a database and to identify a respective one of a plurality of information signals, the method comprising:
- receiving said hash value in the form of a plurality of reliable hash bits and unreliable hash bits;
- searching in the database the stored hash values for which holds that the reliable bits of the applied hash value match the corresponding bits of the stored hash value while ignoring unreliable bits of the applied hash value and corresponding bits of the stored hash value;
- for each stored hash value found in response to the searching, calculating the bit error rate between the reliable bits of the hash value representing the unidentified information signal and the corresponding bits of the stored hash value;
- determining for which stored hash values the bit error rate is minimal; and
- returning an identification of the respective one of the plurality of information signals that corresponds to the minimal bit error rate.
Dependent claims: 2-12
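The steps of claim 1 can be illustrated with a small sketch. It assumes a hash value is an integer, the reliable bits are given as a bit mask (1 = reliable), and the database is a plain dictionary from a signal identifier to its stored hash. These representations and the function names are illustrative assumptions, not taken from the patent.

```python
def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")

def masked_search(query, reliable_mask, stored):
    """IDs of stored hashes that agree with the query on every reliable bit."""
    return [sid for sid, h in stored.items() if ((h ^ query) & reliable_mask) == 0]

def bit_error_rate(query, reliable_mask, stored_hash):
    """Fraction of reliable bits on which the query and the stored hash differ."""
    n = popcount(reliable_mask)
    return popcount((query ^ stored_hash) & reliable_mask) / n if n else 0.0

def identify(query, reliable_mask, stored):
    """Return the ID with the lowest bit error rate among the search hits, or None."""
    hits = masked_search(query, reliable_mask, stored)
    if not hits:
        return None
    return min(hits, key=lambda sid: bit_error_rate(query, reliable_mask, stored[sid]))

# Hypothetical database and query; bits 0 and 2 of the query are marked unreliable.
stored = {"song_a": 0b10110010, "song_b": 0b01001101}
query, reliable_mask = 0b10110111, 0b11111010
result = identify(query, reliable_mask, stored)
```

In this literal single-word reading every search hit has a bit error rate of zero, because the search already requires the reliable bits to agree. The rate becomes a real discriminator when it is computed over a longer series of hash values, as in claim 13.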
13. A method to match a hash signal representing an unidentified information signal with a plurality of hash signals stored in a database and to identify a respective one of a plurality of information signals, the method comprising:
- receiving said hash signal in the form of a series of hash values, each hash value having reliable hash bits and unreliable hash bits;
- applying one of the hash values of said series to the database;
- searching in the database the stored hash values for which holds that the reliable bits of the applied hash value match the corresponding bits of the stored hash value while ignoring unreliable bits of the applied hash value and corresponding bits of the stored hash value;
- for each stored hash value found in response to the searching, selecting in the database the corresponding series of stored hash values;
- calculating the bit error rate between the reliable bits of the series of hash values representing the unidentified information signal and the corresponding bits of the selected series of hash values in the database while ignoring unreliable bits of the series of hash values and corresponding bits of the selected series of hash values in the database; and
- determining for which series of stored hash values the bit error rate is minimal; and
- returning an identification of the respective one of the plurality of information signals that corresponds to the minimal bit error rate.
Dependent claims: 14
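A rough sketch of the series-based matching in claim 13 follows. It makes several assumptions that the claim does not state: the database maps each signal identifier to a list of integer hash words, a lookup table from hash word to (identifier, position) serves as the search structure, and the masked search is done by enumerating every setting of the unreliable bits of the applied word. That enumeration is one way to read the abstract's remark about flipping the least reliable bits. All names, including the `apply_at` parameter, are hypothetical.

```python
from itertools import product

def popcount(x):
    return bin(x).count("1")

def build_index(db):
    """Map each stored hash word to the (track, position) pairs where it occurs."""
    index = {}
    for track, words in db.items():
        for pos, w in enumerate(words):
            index.setdefault(w, []).append((track, pos))
    return index

def candidates(word, mask, n_bits):
    """Every word that equals `word` on the reliable bits (unreliable bits set both ways)."""
    free = [i for i in range(n_bits) if not (mask >> i) & 1]
    base = word & mask
    for bits in product((0, 1), repeat=len(free)):
        c = base
        for i, b in zip(free, bits):
            c |= b << i
        yield c

def series_ber(query, stored_words):
    """Bit error rate over the reliable bits of a whole series of (word, mask) pairs."""
    errors = total = 0
    for (w, m), s in zip(query, stored_words):
        errors += popcount((w ^ s) & m)
        total += popcount(m)
    return errors / total if total else 1.0

def match_series(query, db, index, n_bits=8, apply_at=0):
    """Apply one hash word of the query, then rank aligned stored series by bit error rate."""
    word, mask = query[apply_at]
    best_track, best_ber = None, float("inf")
    for cand in candidates(word, mask, n_bits):
        for track, pos in index.get(cand, []):
            start = pos - apply_at
            if start < 0 or start + len(query) > len(db[track]):
                continue  # the query block would run past the stored series
            ber = series_ber(query, db[track][start:start + len(query)])
            if ber < best_ber:
                best_track, best_ber = track, ber
    return best_track, best_ber

# Hypothetical database of two tracks, and a query block cut from track "a".
db = {"a": [0x12, 0x34, 0x56, 0x78, 0x9A], "b": [0x12, 0x35, 0xF0, 0x0F, 0xAA]}
query = [(0x35, 0xFE), (0x56, 0xFF), (0xF8, 0xFF)]  # (hash word, reliable mask)
track, ber = match_series(query, db, build_index(db))
```

Here the first query word has a wrong but unreliable bit 0, so both `0x34` and `0x35` are looked up. Track "a" wins with one error in 23 reliable bits, against 11 errors for track "b". The enumeration visits 2^u candidates for u unreliable bits, so it only stays cheap when few bits are marked unreliable.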
15. A system comprising:
- a receiving module to receive a subject hash value in the form of a plurality of reliable hash bits and unreliable hash bits, the hash value representing an unidentified information signal;
- a searching module to search, in a database, stored hash values for which holds that the reliable bits of the hash value representing the unidentified information signal match the corresponding bits of the stored hash value while ignoring unreliable bits of the hash value representing the unidentified information signal and corresponding bits of the stored hash value;
- a bit error evaluator to: calculate the bit error rate between the reliable bits of the hash value representing the unidentified information signal and the corresponding bits of the stored hash value for each stored hash value found in response to the search, and determine for which stored hash values the bit error rate is minimal; and
- a return module to return an identification of the respective one of the plurality of information signals that corresponds to the minimal bit error rate.
Dependent claims: 16-26
27. A system to match a hash signal representing an unidentified information signal with a plurality of hash signals stored in a database and to identify a respective one of a plurality of information signals, the system comprising:
- a receiving module to receive said hash signal in the form of a series of hash values, each hash value having reliable hash bits and unreliable hash bits;
- an applying module to apply one of the hash values of said series to the database;
- a searching module to search in the database the stored hash values for which holds that the reliable bits of the applied hash value match the corresponding bits of the stored hash value while ignoring unreliable bits of the applied hash value and corresponding bits of the stored hash value;
- a selecting module to select in the database the corresponding series of stored hash values for each stored hash value found in response to the searching;
- a calculating module to calculate the bit error rate between the reliable bits of the hash value representing the unidentified information signal and the corresponding bits of the selected series of hash values in the database while ignoring unreliable bits of the series of hash values and corresponding bits of the selected series of hash values in the database;
- a determining module to determine for which series of stored hash values the bit error rate is minimal; and
- a returning module to return an identification of the respective one of the plurality of information signals that corresponds to the minimal bit error rate.
Dependent claims: 28
Specification