Efficient storage of fingerprints

US 7,477,739 B2
Filed: 01/21/2003
Issued: 01/13/2009
Est. Priority Date: 02/05/2002
Status: Active Grant


Abstract

A robust fingerprinting system is disclosed. Such a system can recognize unknown multimedia content (U(t)) by extracting a fingerprint (a series of hash words) from said content and searching for a resembling fingerprint in a database in which fingerprints of a plurality of known contents (K(t)) are stored. To store the fingerprints in the database more efficiently and to speed up the search, the hash words (H(n)) of known signals (K(t)) are sub-sampled (13) by a factor M prior to storage in the database (14). The hash words (H(n)) of unknown signals (U(t)) are divided (16) into M interleaved sub-series (H_0(n) ... H_{M−1}(n)). The interleaved sub-series are selectively (17) applied to the database (14) under the control of a computer (15). If at least one of the sub-series sufficiently matches a stored fingerprint, the signal is identified.
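
The sub-sampling and interleaving described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names `subsample` and `interleave` and the integer stand-ins for hash words are assumptions made for illustration. The sketch shows why one of the M interleaved sub-series of an excerpt lines up with the stored, sub-sampled sequence regardless of where the excerpt starts.

```python
def subsample(hashes, m):
    """Keep every m-th hash word (what the database stores for a known signal)."""
    return hashes[::m]


def interleave(hashes, m):
    """Split a series into m interleaved sub-series: sub-series j holds H(n*m + j)."""
    return [hashes[j::m] for j in range(m)]


M = 4
known = list(range(100, 130))      # stand-in hash words H(n) of a known signal
stored = subsample(known, M)       # 100, 104, 108, ...

excerpt = known[6:22]              # unknown excerpt starting at an arbitrary offset
subs = interleave(excerpt, M)

# Idealised check (exact equality, no noise): which sub-series lie on the stored grid?
aligned = [j for j, sub in enumerate(subs) if set(sub) <= set(stored)]
print(aligned)
```

In this example the excerpt starts 6 hash words into the known signal, so only the sub-series with phase 2 falls on the stored grid (6 + 2 is a multiple of 4). The database holds roughly 1/M of the hash words, and the query side compensates by trying up to M sub-series.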

8 Claims

1. A method of storing fingerprints identifying audio-visual media signals in a database, the method comprising, for each audio-visual signal:
- dividing said audio-visual media signal into a sequence of frames;
  
  sub-sampling said sequence of frames by a factor M to obtain a sub-sampled sequence of frames, each frame of the sub-sampled sequence of frames overlapping in time with an adjacent frame of the sub-sampled sequence of frames, and the factor M being a positive integer;
  
  extracting, for each frame of said sub-sampled sequence of frames, a hash word derived from a perceptually essential property of the signal within said frame, to obtain a respective sub-sampled sequence of hash words, each hash word of the sub-sampled sequence of hash words being nonrandomly positioned within the sub-sampled sequence of hash words; and
  
  storing said sub-sampled sequence of hash words as a fingerprint in said database, the fingerprint being a digital summary of the audio visual media signal.
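
The steps of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `toy_hash` is a hypothetical stand-in for a perceptual hash (it compares the energy of consecutive chunks of a frame), the database is a plain dictionary, and the frame length, hop size, and M are arbitrary values chosen so that frames still overlap after sub-sampling (frame length 64 is larger than M × hop = 32).

```python
import math


def split_into_frames(signal, frame_len, hop):
    """Divide the signal into frames of frame_len samples, hop samples apart."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def toy_hash(frame, bits=8):
    """Hypothetical stand-in for a perceptual hash: one bit per pair of
    consecutive chunks, set when the earlier chunk has more energy."""
    n = len(frame) // (bits + 1)
    energy = [sum(x * x for x in frame[k * n:(k + 1) * n]) for k in range(bits + 1)]
    word = 0
    for k in range(bits):
        word = (word << 1) | int(energy[k] > energy[k + 1])
    return word


def store_fingerprint(db, title, signal, m, frame_len=64, hop=8):
    """Sub-sample the frame sequence by m, hash each kept frame, store the result."""
    kept = split_into_frames(signal, frame_len, hop)[::m]
    db[title] = [toy_hash(f) for f in kept]
    return db[title]


database = {}
# Synthetic test signal (an assumption; any list of samples would do).
signal = [math.sin(0.05 * i) * math.sin(0.003 * i * i) for i in range(1024)]
fingerprint = store_fingerprint(database, "known_signal", signal, m=4)
print(len(fingerprint))
```

Because the frames are sub-sampled before hashing, only every M-th frame is hashed at all, so the sketch saves hashing work as well as storage.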

2. An arrangement to store fingerprints identifying audio-visual media signals (K(t)) in a database, the arrangement comprising:
- framing means for dividing said audio-visual media signals into a sequence of overlapping frames;
  
  sub-sampling means for sub-sampling said sequence of frames by a factor M to obtain a sub-sampled sequence of frames, each frame of the sub-sampled sequence of frames overlapping in time with an adjacent frame of the sub-sampled sequence of frames, and, the factor M being a positive integer;
  
  means for extracting, for each frame of said sub-sampled sequence of frames, a hash word (H(n)) derived from a perceptually essential property of the signal within said frame, to obtain a respective sub-sampled sequence of hash words, each hash word of the sub-sampled sequence of hash words being nonrandomly positioned within the sub-sampled sequence of hash words; and
  
  a database for storing said sub-sampled sequence of hash words as fingerprint in said database, the fingerprint being a digital summary of the audio visual media signal.

3. A method of identifying an unknown audio-visual media signal, the method comprising:
- dividing at least a part of the unknown audio-visual media signal into a series of frames;
  
  extracting, for each frame, a hash word representing a perceptually essential property of the signal within said frame, to obtain a respective series of hash words;
  
  dividing said series of hash words into M interleaved sub-series of hash words, each hash word of each of the M interleaved sub-series of hash words being extracted from a different frame of the series of frames, each frame of the series of frames overlapping in time with an adjacent frame of the series of frames, from which an adjacent hash word of the M interleaved sub-series of hash words is extracted, and each hash word of each of the M interleaved sub-series of hash words being nonrandomly positioned within each of the M interleaved sub-series of hash words;
  
  successively applying each of said M interleaved sub-series to a database in which, for a plurality of multi-media signals, a sub-sampled sequence of hash words has been stored; and
  
  identifying the unknown signal as the multi-media signal based on whether a difference between at least a part of the stored sub-sampled sequence of hash words and at least one of the M applied interleaved sub-series of hash words is less than a specific threshold value.
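
The identification steps of claim 3 can be sketched in the same spirit. The code below is an assumption-laden illustration rather than the method of the patent: it uses a brute-force sliding comparison, measures the "difference" as a bit error rate between hash words, and picks a 32-bit word size and a threshold of 0.25 arbitrarily. The names `bit_errors` and `identify` and the random stand-in hash words are hypothetical.

```python
import random


def bit_errors(a, b):
    """Number of differing bits between two equally long hash-word sequences."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def identify(query, db, m, threshold=0.25, bits=32):
    """Try each interleaved sub-series in turn; return (title, phase, position)
    for the first stored window whose bit error rate is below threshold."""
    for phase in range(m):
        sub = query[phase::m]
        if not sub:
            continue
        for title, stored in db.items():
            for pos in range(len(stored) - len(sub) + 1):
                window = stored[pos:pos + len(sub)]
                ber = bit_errors(sub, window) / (len(sub) * bits)
                if ber < threshold:
                    return title, phase, pos
    return None


M = 4
rng = random.Random(0)
# Random 32-bit words stand in for the full hash series of two known signals.
full = {name: [rng.getrandbits(32) for _ in range(200)]
        for name in ("song_a", "song_b")}
db = {name: words[::M] for name, words in full.items()}   # stored, sub-sampled

query = full["song_a"][10:50]                 # 40 hash words from an unknown offset
query = [w ^ 1 if k % 5 == 0 else w for k, w in enumerate(query)]  # a few bit errors
result = identify(query, db, M)
print(result)
```

The query starts 10 hash words into `song_a`, so the sub-series with phase 2 is the one that aligns with the stored sequence, and the small number of flipped bits stays well under the threshold. A practical system would presumably replace the exhaustive scan with an index, which this section does not describe.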

4. An arrangement to identify an unknown audio-visual media signal, the arrangement comprising:
- framing means for dividing at least a part of the unknown audio-visual media signal (U(t)) into a series of frames;
  
  means for extracting, for each frame, a hash word derived from a perceptually essential property of the signal within said frame, to obtain a respective series of hash words;
  
  interleaving means for dividing said series of hash words into M interleaved sub-series of hash words, each hash word of each of the M interleaved sub-series of hash words being extracted from a different frame of the series of frames, each frame of the series of frames overlapping in time with an adjacent frame of the series of frames, from which an adjacent hash word of the M interleaved sub-series of hash words is extracted, and each hash word of each of the M interleaved sub-series of hash words being nonrandomly positioned within each of the M interleaved sub-series of hash words;
  
  selection means for successively applying each of said M interleaved sub-series to a database in which for a plurality of multi-media signals, a sub-sampled sequence of hash words has been stored; and
  
  computer means for identifying the unknown signal as the multi-media signal based on whether a difference between at least a part of the stored sub-sampled sequence of hash words and at least one of the M applied interleaved sub-series of hash words is less than a specific threshold value.

5. A method of identifying an unknown audio-visual media signal, the method comprising:
- receiving, from a remote station, a series of hash words generated by dividing at least a part of the unknown audio-visual media signal into a series of frames, and extracting, for each frame, a hash word based on a perceptually essential property of the signal within said frame;
  
  dividing said series of hash words into M interleaved sub-series of hash words, each hash word of each of the M interleaved sub-series of hash words being extracted from a different frame of the series of frames, each frame of the series of frames overlapping in time with an adjacent frame of the series of frames, from which an adjacent hash word of the M interleaved sub-series of hash words is extracted, and each hash word of each of the M interleaved sub-series of hash words being nonrandomly positioned within each of the M interleaved sub-series of hash words;
  
  successively applying each of said M interleaved sub-series to a database in which, for a plurality of multi-media signals, a sub-sampled sequence of hash words has been stored; and
  
  identifying the unknown signal as the multi-media signal based on whether a difference between at least a part of the stored sub-sampled sequence of hash words and at least one of the M applied interleaved sub-series of hash words is less than a specific threshold value.

6. A method as claimed in claim 5, wherein each frame of said series of frames overlaps in time with an adjacent frame.

7. A system to store fingerprints identifying audio-visual media signals (K(t)) in a database, the arrangement comprising:
- a framing circuit to divide said audio-visual media signals into a sequence of overlapping frames;
  
  sub-sampler to sub-sample said sequence of frames by a factor M to obtain a sub-sampled sequence of frames, each frame overlapping in time with an adjacent frame of the sub-sampled sequence of frames, and the factor M being a positive integer;
  
  a hash extracting circuit to extract for each frame of said sub-sampled sequence of frames, a hash word (H(n)) derived from a perceptually essential property of the signal within said frame, to obtain a respective sub-sampled sequence of hash words, each hash word of the sub-sampled sequence of hash words being nonrandomly positioned within the sub-sampled sequence of hash words; and
  
  a database for storing said sub-sampled sequence of hash words as a fingerprint in said database, the fingerprint being a digital summary of the audio visual media signal.

8. A system to identify an unknown audio-visual media signal, the arrangement comprising:
- a framing circuit to divide at least a part of the unknown audio-visual media signal (U(t)) into a series of frames;
  
  a hash extracting circuit, to extract for each frame, a hash word derived from a perceptually essential property of the signal within said frame, to obtain a respective series of hash words;
  
  an interleaving circuit to divide said series of hash words into M interleaved sub-series of hash words, each hash word of each of the M interleaved sub-series of hash words being extracted from a different frame of the series of frames, each frame of the series of frames overlapping in time with an adjacent frame of the series of frames, from which an adjacent hash word of the M interleaved sub-series of hash words is extracted, and each hash word of each of the M interleaved sub-series of hash words being nonrandomly positioned within each of the M interleaved sub-series of hash words;
  
  a selection circuit to successively apply each of said M interleaved sub-series to a database in which for a plurality of multi-media signals, a sub-sampled sequence of hash words has been stored; and
  
  a computer to identify the unknown signal as the multi-media signal based on whether a difference between at least a part of the stored sub-sampled sequence of hash words and at least one of the M applied interleaved sub-series of hash words is less than a specific threshold value.

Current Assignee: Gracenote, Inc. (RR Donnelley & Sons Company)
Original Assignee: Gracenote, Inc. (RR Donnelley & Sons Company)
Inventors: Schimmel, Steven Marco; Haitsma, Jaap Andre; Kalker, Antonius Adrianus Cornelis Maria
Primary Examiner(s): Barron, Jr.; Gilberto
Assistant Examiner(s): Lemma; Samson B
Application Number: US10/503,245
Publication Number: US 20050141707A1
Time in Patent Office: 2,184 Days
Field of Search: 726/13, 726/5, 713/186, 713/168, 713/193, 713/181, 380/201
US Class Current: 380/201
CPC Class Codes:
G06F 16/41   Indexing; Data structures t...
G06F 16/433   using audio data
G06F 16/634   Query by example, e.g. quer...
G06F 16/683   using metadata automaticall...
