Efficient storage of fingerprints
Abstract
A robust fingerprinting system is disclosed. Such a system can recognize unknown multimedia content (U(t)) by extracting a fingerprint (a series of hash words) from said content, and searching for a resembling fingerprint in a database in which fingerprints of a plurality of known contents (K(t)) are stored. In order to store the fingerprints in the database more efficiently and to speed up the search, the hash words (H(n)) of known signals (K(t)) are sub-sampled (13) by a factor M prior to storage in the database (14). The hash words (H(n)) of unknown signals (U(t)) are divided (16) into M interleaved sub-series (H0(n) ... HM−1(n)). The interleaved sub-series are selectively (17) applied to the database (14) under the control of a computer (15). If only one of the sub-series sufficiently matches a stored fingerprint, the signal is identified.
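The sub-sampling and interleaving described in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the function names `subsample`, `interleave_split` and `contains_run`, the integer stand-ins for hash words, and the choice M = 3 are all assumptions made for illustration. The point it shows is that when the unknown excerpt starts at an arbitrary frame, one of its M interleaved sub-series lines up with the stored, sub-sampled sequence.

```python
from typing import List, Sequence


def subsample(hashes: Sequence[int], m: int, offset: int = 0) -> List[int]:
    """Keep every m-th hash word, starting at index `offset`."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return list(hashes[offset::m])


def interleave_split(hashes: Sequence[int], m: int) -> List[List[int]]:
    """Split a series of hash words into m interleaved sub-series H0(n) ... HM-1(n)."""
    return [subsample(hashes, m, k) for k in range(m)]


def contains_run(haystack: Sequence[int], needle: Sequence[int]) -> bool:
    """True if `needle` occurs as a contiguous run inside `haystack`."""
    n = len(needle)
    return any(list(haystack[i:i + n]) == list(needle)
               for i in range(len(haystack) - n + 1))


M = 3                                # assumed sub-sampling factor
known = list(range(100, 112))        # stand-in hash words H(n) of a known signal K(t)
stored = subsample(known, M)         # what the database would keep

unknown = known[2:11]                # excerpt of the same signal, starting at frame 2
subs = interleave_split(unknown, M)  # M interleaved sub-series of the unknown signal

# Indices of the sub-series that align with the stored, sub-sampled sequence.
aligned = [k for k, s in enumerate(subs) if contains_run(stored, s)]
print(stored, subs, aligned)
```

In this toy setup the database holds one third of the hash words, and the excerpt is still found because sub-series 1 coincides with a run of the stored sequence. Real hash words would be noisy, so an exact-run test like `contains_run` would have to be replaced by a distance measure with a threshold.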
8 Claims
1. A method of storing fingerprints identifying audio-visual media signals in a database, the method comprising, for each audio-visual signal:
- dividing said audio-visual media signal into a sequence of frames;
- sub-sampling said sequence of frames by a factor M to obtain a sub-sampled sequence of frames, each frame of the sub-sampled sequence of frames overlapping in time with an adjacent frame of the sub-sampled sequence of frames, and the factor M being a positive integer;
- extracting, for each frame of said sub-sampled sequence of frames, a hash word derived from a perceptually essential property of the signal within said frame, to obtain a respective sub-sampled sequence of hash words, each hash word of the sub-sampled sequence of hash words being nonrandomly positioned within the sub-sampled sequence of hash words; and
- storing said sub-sampled sequence of hash words as a fingerprint in said database, the fingerprint being a digital summary of the audio visual media signal.

2. An arrangement to store fingerprints identifying audio-visual media signals (K(t)) in a database, the arrangement comprising:
- framing means for dividing said audio-visual media signals into a sequence of overlapping frames;
- sub-sampling means for sub-sampling said sequence of frames by a factor M to obtain a sub-sampled sequence of frames, each frame of the sub-sampled sequence of frames overlapping in time with an adjacent frame of the sub-sampled sequence of frames, and, the factor M being a positive integer;
- means for extracting, for each frame of said sub-sampled sequence of frames, a hash word (H(n)) derived from a perceptually essential property of the signal within said frame, to obtain a respective sub-sampled sequence of hash words, each hash word of the sub-sampled sequence of hash words being nonrandomly positioned within the sub-sampled sequence of hash words; and
- a database for storing said sub-sampled sequence of hash words as fingerprint in said database, the fingerprint being a digital summary of the audio visual media signal.

3. A method of identifying an unknown audio-visual media signal, the method comprising:
- dividing at least a part of the unknown audio-visual media signal into a series of frames;
- extracting, for each frame, a hash word representing a perceptually essential property of the signal within said frame, to obtain a respective series of hash words;
- dividing said series of hash words into M interleaved sub-series of hash words, each hash word of each of the M interleaved sub-series of hash words being extracted from a different frame of the series of frames, each frame of the series of frames overlapping in time with an adjacent frame of the series of frames, from which an adjacent hash word of the M interleaved sub-series of hash words is extracted, and each hash word of each of the M interleaved sub-series of hash words being nonrandomly positioned within each of the M interleaved sub-series of hash words;
- successively applying each of said M interleaved sub-series to a database in which, for a plurality of multi-media signals, a sub-sampled sequence of hash words has been stored; and
- identifying the unknown signal as the multi-media signal based on whether a difference between at least a part of the stored sub-sampled sequence of hash words and at least one of the M applied interleaved sub-series of hash words is less than a specific threshold value.

4. An arrangement to identify an unknown audio-visual media signal, the arrangement comprising:
- framing means for dividing at least a part of the unknown audio-visual media signal (U(t)) into a series of frames;
- means for extracting, for each frame, a hash word derived from a perceptually essential property of the signal within said frame, to obtain a respective series of hash words;
- interleaving means for dividing said series of hash words into M interleaved sub-series of hash words, each hash word of each of the M interleaved sub-series of hash words being extracted from a different frame of the series of frames, each frame of the series of frames overlapping in time with an adjacent frame of the series of frames, from which an adjacent hash word of the M interleaved sub-series of hash words is extracted, and each hash word of each of the M interleaved sub-series of hash words being nonrandomly positioned within each of the M interleaved sub-series of hash words;
- selection means for successively applying each of said M interleaved sub-series to a database in which for a plurality of multi-media signals, a sub-sampled sequence of hash words has been stored; and
- computer means for identifying the unknown signal as the multi-media signal based on whether a difference between at least a part of the stored sub-sampled sequence of hash words and at least one of the M applied interleaved sub-series of hash words is less than a specific threshold value.

5. A method of identifying an unknown audio-visual media signal, the method comprising:
- receiving, from a remote station, a series of hash words generated by dividing at least a part of the unknown audio-visual media signal into a series of frames, and extracting, for each frame, a hash word based on a perceptually essential property of the signal within said frame;
- dividing said series of hash words into M interleaved sub-series of hash words, each hash word of each of the M interleaved sub-series of hash words being extracted from a different frame of the series of frames, each frame of the series of frames overlapping in time with an adjacent frame of the series of frames, from which an adjacent hash word of the M interleaved sub-series of hash words is extracted, and each hash word of each of the M interleaved sub-series of hash words being nonrandomly positioned within each of the M interleaved sub-series of hash words;
- successively applying each of said M interleaved sub-series to a database in which, for a plurality of multi-media signals, a sub-sampled sequence of hash words has been stored; and
- identifying the unknown signal as the multi-media signal based on whether a difference between at least a part of the stored sub-sampled sequence of hash words and at least one of the M applied interleaved sub-series of hash words is less than a specific threshold value.

7. A system to store fingerprints identifying audio-visual media signals (K(t)) in a database, the arrangement comprising:
- a framing circuit to divide said audio-visual media signals into a sequence of overlapping frames;
- sub-sampler to sub-sample said sequence of frames by a factor M to obtain a sub-sampled sequence of frames, each frame overlapping in time with an adjacent frame of the sub-sampled sequence of frames, and the factor M being a positive integer;
- a hash extracting circuit to extract for each frame of said sub-sampled sequence of frames, a hash word (H(n)) derived from a perceptually essential property of the signal within said frame, to obtain a respective sub-sampled sequence of hash words, each hash word of the sub-sampled sequence of hash words being nonrandomly positioned within the sub-sampled sequence of hash words; and
- a database for storing said sub-sampled sequence of hash words as a fingerprint in said database, the fingerprint being a digital summary of the audio visual media signal.

8. A system to identify an unknown audio-visual media signal, the arrangement comprising:
- a framing circuit to divide at least a part of the unknown audio-visual media signal (U(t)) into a series of frames;
- a hash extracting circuit, to extract for each frame, a hash word derived from a perceptually essential property of the signal within said frame, to obtain a respective series of hash words;
- an interleaving circuit to divide said series of hash words into M interleaved sub-series of hash words, each hash word of each of the M interleaved sub-series of hash words being extracted from a different frame of the series of frames, each frame of the series of frames overlapping in time with an adjacent frame of the series of frames, from which an adjacent hash word of the M interleaved sub-series of hash words is extracted, and each hash word of each of the M interleaved sub-series of hash words being nonrandomly positioned within each of the M interleaved sub-series of hash words;
- a selection circuit to successively apply each of said M interleaved sub-series to a database in which for a plurality of multi-media signals, a sub-sampled sequence of hash words has been stored; and
- a computer to identify the unknown signal as the multi-media signal based on whether a difference between at least a part of the stored sub-sampled sequence of hash words and at least one of the M applied interleaved sub-series of hash words is less than a specific threshold value.

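The identification steps recited in the claims above (applying each of the M interleaved sub-series in turn and accepting a match when the difference falls below a threshold) might look like the following sketch. Several things here are assumptions rather than content of the claims: the difference is measured as a bit-error count (Hamming distance) over 16-bit stand-in hash words, the database is a plain dictionary, the titles `song_a` and `song_b` are hypothetical, and the stored sequences are scanned linearly instead of through an index.

```python
from typing import Dict, List, Optional, Sequence


def bit_errors(a: Sequence[int], b: Sequence[int]) -> int:
    """Total number of differing bits between two equal-length hash sequences."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def best_difference(stored: Sequence[int], query: Sequence[int]) -> Optional[int]:
    """Smallest bit-error count of `query` against any same-length window of `stored`."""
    n = len(query)
    if n == 0 or n > len(stored):
        return None
    return min(bit_errors(stored[i:i + n], query)
               for i in range(len(stored) - n + 1))


def identify(database: Dict[str, List[int]], hashes: Sequence[int],
             m: int, threshold: int) -> Optional[str]:
    """Apply the m interleaved sub-series one after another.

    Returns the first title whose stored sequence differs from a sub-series
    by fewer than `threshold` bits, or None if no sub-series matches.
    """
    for k in range(m):
        sub = list(hashes[k::m])  # k-th interleaved sub-series
        for title, stored in database.items():
            diff = best_difference(stored, sub)
            if diff is not None and diff < threshold:
                return title
    return None


M = 3  # assumed sub-sampling factor
known_a = [0x0F0F, 0x1234, 0xABCD, 0x00FF, 0x5555, 0x8001,
           0x7E7E, 0x3C3C, 0xF00D, 0xBEEF, 0x1111, 0x2222]
database = {
    "song_a": known_a[::M],                      # stored sub-sampled fingerprint
    "song_b": [0xFFFF, 0x0000, 0xAAAA, 0x4242],  # unrelated stored fingerprint
}

unknown = list(known_a[1:10])  # excerpt starting one frame late
unknown[5] ^= 0x0001           # simulate a single bit error from signal distortion

result = identify(database, unknown, M, threshold=4)
strict = identify(database, unknown, M, threshold=1)
print(result, strict)
```

With a threshold of 4 bits the third sub-series matches `song_a` despite the flipped bit, while a threshold of 1 rejects it, which shows how the threshold trades robustness against false positives. A practical system would likely use a lookup structure instead of the linear scan used here.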
Specification