Generating and matching hashes of multimedia content
First Claim
1. A method of generating a hash signal identifying an information signal, the method comprising the steps of:
- dividing the information signal into frames, computing a hash word for each frame, and concatenating successive hash words to constitute the hash signal.
9 Assignments
0 Petitions
Abstract
Hashes are short summaries or signatures of data files which can be used to identify the file. Hashing multimedia content (audio, video, images) is difficult because the hash of original content and processed (e.g. compressed) content may differ significantly.
The disclosed method generates robust hashes for multimedia content, for example, audio clips. The audio clip is divided (12) into successive (preferably overlapping) frames. For each frame, the frequency spectrum is divided (15) into bands. A robust property of each band (e.g. energy) is computed (16) and represented (17) by a respective hash bit. An audio clip is thus represented by a concatenation of binary hash words, one for each frame. To identify a possibly compressed audio signal, a block of hash words derived therefrom is matched by a computer (20) with a large database (21). Such matching strategies are also disclosed. In an advantageous embodiment, the extraction process also provides information (19) as to which of the hash bits are the least reliable. Flipping these bits considerably improves the speed and performance of the matching process.
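The extraction steps in the abstract (overlapping frames, frequency bands, one bit per band) can be sketched as follows. This is a minimal illustration, not the patented implementation. The abstract does not say how a band's energy is turned into a bit, so the rule used here (bit is 1 when a band holds more energy than the next band) is an assumption, as are the function names, the frame size, the hop and the equal-width bands.

```python
import cmath
import math

def split_frames(signal, size, hop):
    """Split the signal into overlapping frames of `size` samples, `hop` apart."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def band_energies(frame, n_bands):
    """Energy in n_bands equal-width bands of the frame's power spectrum."""
    n = len(frame)
    half = n // 2
    power = []
    # Naive O(n^2) DFT over the positive frequencies; fine for a sketch.
    for k in range(half):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))
        power.append(abs(s) ** 2)
    width = half // n_bands
    return [sum(power[b * width:(b + 1) * width]) for b in range(n_bands)]

def hash_word(frame, n_bits):
    """Assumed bit rule: 1 if band b has more energy than band b + 1."""
    e = band_energies(frame, n_bits + 1)
    word = 0
    for b in range(n_bits):
        word = (word << 1) | (1 if e[b] > e[b + 1] else 0)
    return word

def hash_signal(signal, size=64, hop=16, n_bits=8):
    """Concatenate one hash word per frame into the hash signal (a list)."""
    return [hash_word(f, n_bits) for f in split_frames(signal, size, hop)]
```

With a hop smaller than the frame size, consecutive frames share most of their samples, so successive hash words change slowly; that overlap is what the abstract calls preferable.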
431 Citations
25 Claims
-
1. A method of generating a hash signal identifying an information signal, the method comprising the steps of:
-
dividing the information signal into frames, computing a hash word for each frame, and concatenating successive hash words to constitute the hash signal.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 16, 17, 20
-
-
12. A method of generating a hash signal to identify an information signal, comprising the steps of:
-
dividing the information signal into blocks;
extracting for each block a feature of the information signal within said block;
comparing the value of the extracted feature with a threshold;
generating for each block a hash bit indicating whether the value of the extracted feature is larger or smaller than said threshold;
determining for each block reliability information indicating whether the value of the extracted feature differs substantially from said threshold;
combining said hash bits and said reliability information of the blocks into a hash value having reliable hash bits for which the extracted feature differs substantially from said threshold, and unreliable bits for which the extracted feature does not differ substantially from said threshold.
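The steps of claim 12 can be illustrated with a short sketch. The claim leaves open what "differs substantially" means; here it is modelled as an absolute distance from the threshold exceeding a hypothetical `margin` parameter, which is an assumption made only for illustration.

```python
def hash_with_reliability(features, threshold, margin):
    """Return (bits, reliable) for a list of per-block feature values.

    bits[i]     is 1 if features[i] is larger than the threshold, else 0.
    reliable[i] is True if features[i] lies more than `margin` away from
                the threshold (assumed meaning of "differs substantially").
    """
    bits = [1 if f > threshold else 0 for f in features]
    reliable = [abs(f - threshold) > margin for f in features]
    return bits, reliable

# Usage: two features sit close to the threshold and are flagged unreliable.
bits, reliable = hash_with_reliability([0.9, 0.52, 0.1, 0.49],
                                       threshold=0.5, margin=0.1)
```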
-
-
14. A method of matching an input block of hash words representing at least a part of an information signal with hash signals identifying respective information signals stored in a database, the method comprising the steps of:
-
(a) selecting a hash word of said input block of hash words;
(b) searching said hash word in the database;
(c) calculating a difference between the input block of hash words and a stored block of hash words in which the hash word found in step (b) has the same position as the selected hash word in the input block;
(d) repeating steps (a) to (c) for a further selected hash word until said difference is lower than a predetermined threshold.
Dependent claims: 21
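Steps (a) to (d) of claim 14 can be sketched as below. The lookup table mapping each hash word to its positions in the database, the use of the Hamming distance as the "difference", and all names are assumptions chosen for the illustration; the claim itself does not prescribe them.

```python
from collections import defaultdict

def build_index(database):
    """Map each stored hash word to the (name, position) pairs where it occurs."""
    index = defaultdict(list)
    for name, words in database.items():
        for pos, word in enumerate(words):
            index[word].append((name, pos))
    return index

def bit_errors(a, b):
    """Hamming distance between two equally long blocks of hash words."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def match_block(block, database, index, max_errors):
    """Return (name, start, errors) for the first aligned stored block whose
    difference from the input block is below max_errors, else None."""
    for i, word in enumerate(block):                  # step (a): select a word
        for name, pos in index.get(word, []):         # step (b): search it
            start = pos - i                           # align both blocks
            if start < 0:
                continue
            stored = database[name][start:start + len(block)]
            if len(stored) < len(block):
                continue
            errors = bit_errors(block, stored)        # step (c): difference
            if errors < max_errors:                   # step (d): stop criterion
                return name, start, errors
    return None

# Usage: the first input word leads to a poor alignment, the second succeeds.
database = {"a": [1, 2, 3, 4, 5, 6], "b": [9, 8, 7, 6, 5, 4]}
index = build_index(database)
result = match_block([2, 4, 5], database, index, max_errors=2)
```

Only one hash word of the input block has to be error-free for the lookup to land on the right position; the full block comparison then confirms or rejects the candidate.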
-
-
18. A method of matching a hash value representing an unidentified information signal with a plurality of hash values stored in a database and identifying a respective plurality of information signals, the method comprising the steps of:
-
(a) receiving said hash value in the form of a plurality of reliable hash bits and unreliable hash bits;
(b) searching in the database the stored hash values for which holds that the reliable bits of the applied hash value match the corresponding bits of the stored hash value;
(c) for each stored hash value found in step (b), calculating the bit error rate between the reliable bits of the hash value representing the unidentified information signal and the corresponding bits of the stored hash value; and
(d) determining for which stored hash values the bit error rate is minimal and sufficiently small.
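Step (b) of claim 18 asks for stored hash values that agree with the query on its reliable bits. One way to sketch this, in line with the bit flipping mentioned in the abstract, is to enumerate every combination of the unreliable bits and look each candidate up. The representation of a hash value as an integer with a list of unreliable bit positions, and the function names, are assumptions for the example.

```python
from itertools import product

def candidate_words(word, unreliable_positions):
    """All words that agree with `word` on every reliable bit.

    Each unreliable bit is tried both as-is and flipped, giving
    2 ** len(unreliable_positions) candidates.
    """
    out = []
    for flips in product([0, 1], repeat=len(unreliable_positions)):
        candidate = word
        for flip, pos in zip(flips, unreliable_positions):
            if flip:
                candidate ^= 1 << pos
        out.append(candidate)
    return out

def lookup(word, unreliable_positions, stored_words):
    """Candidates that actually occur in the set of stored hash words."""
    return [c for c in candidate_words(word, unreliable_positions)
            if c in stored_words]
```

The number of candidates doubles with every unreliable bit, so this approach is only practical when few bits per hash value are flagged as unreliable.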
-
-
19. A method of matching a hash signal representing an unidentified information signal with a plurality of hash signals stored in a database and identifying a respective plurality of information signals, the method comprising the steps of:
-
(a) receiving said hash signal in the form of a series of hash values, each hash value having reliable hash bits and unreliable hash bits;
(b) applying one of the hash values of said series to the database;
(c) searching in the database the stored hash values for which holds that the reliable bits of the applied hash value match the corresponding bits of the stored hash value;
(d) for each stored hash value found in step (c), selecting in the database the corresponding series of stored hash values;
(e) calculating the bit error rate between the reliable bits of the series of hash values representing the unidentified information signal and the corresponding bits of the selected series of hash values in the database; and
(f) determining for which series of stored hash values the bit error rate is minimal and sufficiently small.
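The series matching of claim 19 can be sketched as follows. Reliability is represented here by one bit mask per hash value (a 1 marks a reliable bit), the database is scanned linearly instead of through an index, and "sufficiently small" is modelled by a hypothetical `max_ber` parameter; all three are assumptions made to keep the example short.

```python
def masked_ber(query, stored, masks):
    """Bit error rate counted only over bits flagged reliable in `masks`."""
    errors = total = 0
    for q, s, m in zip(query, stored, masks):
        errors += bin((q ^ s) & m).count("1")
        total += bin(m).count("1")
    return errors / total if total else 0.0

def best_match(query, masks, database, probe=0, max_ber=0.25):
    """Return (name, start, ber) of the best aligned stored series, or None."""
    best = None
    for name, words in database.items():
        for pos, word in enumerate(words):
            # The reliable bits of the applied hash value must match.
            if (word ^ query[probe]) & masks[probe]:
                continue
            # Select the corresponding series of stored hash values.
            start = pos - probe
            if start < 0 or start + len(query) > len(words):
                continue
            # Bit error rate over the reliable bits of the whole series.
            ber = masked_ber(query, words[start:start + len(query)], masks)
            # Keep the minimal rate, provided it is small enough.
            if ber <= max_ber and (best is None or ber < best[2]):
                best = (name, start, ber)
    return best

# Usage: bit 0 of the first query word is unreliable and is ignored.
database = {"x": [0b1111, 0b0000, 0b1010], "y": [0b1100, 0b0011, 0b0101]}
query = [0b1110, 0b0000, 0b1011]
masks = [0b1110, 0b1111, 0b1111]
best = best_match(query, masks, database)
```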
-
22. A method of redirecting a receiver of an information signal to an Internet website, the method comprising the steps of deriving a hash signal from said information signal, and matching said hash signal with hash signals identifying Internet websites stored in a database.
-
23. A method of measuring the quality of an information signal, the method comprising the steps of deriving a hash signal from said information signal, matching said hash signal with a hash signal identifying said information signal stored in a database, and calculating the difference between the derived hash signal and the stored hash signal.
-
24. A method of identifying a multimedia signal, the method comprising the steps of receiving and/or recording at least a part of said multimedia signal, deriving a hash signal from said multimedia signal, sending said hash signal to a database for matching it with hash signals stored in said database, and receiving from said database an identifier of the multimedia signal.
Specification