Multi-media content identification using multi-level content signature correlation and fast similarity search
Abstract
A method is presented for querying a large media database and identifying media entries based on multi-level similarity search and reference-query entry correlation. Media content fingerprinting detects unique features and generates discriminative descriptors and signatures, which are used to form a preliminary reference database. The preliminary reference database is processed, and a subset of it is selected to form a final reference database. To identify a media query, a fast similarity search is first performed on the reference database, yielding a preliminary set of likely matching videos. For each likely matching video, a further multi-level correlation is performed, which includes iterative refinement, sub-sequence merging, and final result classification.
23 Claims
1. A method of preprocessing media content for storage and access in a media reference database, the method comprising:
generating a signature term frequency (STF) for a signature of provided media content in a first media reference database, wherein the STF is a frequency of occurrence of the signature in the first media reference database and represents a measure of uniqueness for the signature as compared to existing signatures in the first media reference database; and
entering the signature using a hash index in a second media reference database, wherein the STF of the signature is less than a specified threshold, wherein the specified threshold represents a level of information content and uniqueness for the signature.
Dependent claims: 2, 3, 4, 5, 6
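The preprocessing in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the tuple layout of the database entries, the function name `build_second_reference_db`, and the use of the signature's low bits as the hash index are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_second_reference_db(first_db, stf_threshold, hash_bits=16):
    """Filter a first reference database into a hash-indexed second one.

    first_db is assumed to be a list of (signature, media_id, frame) tuples,
    with integer signatures. This layout is hypothetical.
    """
    # STF: how often each signature occurs in the first reference database.
    stf = Counter(sig for sig, _, _ in first_db)

    # Assumed hash index: the low `hash_bits` bits of the signature.
    mask = (1 << hash_bits) - 1

    second_db = defaultdict(list)
    for sig, media_id, frame in first_db:
        # Only signatures whose STF is below the threshold are entered,
        # i.e. signatures that are rare enough to be informative.
        if stf[sig] < stf_threshold:
            second_db[sig & mask].append((sig, media_id, frame))
    return stf, second_db
```

A signature that occurs in many reference entries carries little discriminating information, so dropping it both shrinks the index and reduces the number of spurious candidates at query time.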
7. A method to detect a query sequence of audio and video signatures in a database of audio and video signatures, the method comprising:
searching the database of audio and video signatures in response to a query sequence of audio and video signatures using a hash index for each query signature;
retrieving a set of database signatures that are similar as determined by a distance measure of the audio and video signatures to the query sequence of audio and video signatures in response to use of the hash index for each query signature to select a database entry;
generating a correlation in time score between corresponding pairs of signatures from the set of database signatures and the query sequence of audio and video signatures, wherein the correlation in time score is based on a first similarity score between a first query and a first reference signature, a second similarity score between a second query and a second reference signature, and a frame correlation between frames for the first query, the second query, and associated reference frames; and
identifying a matching sequence between query and reference if the correlation in time score is above a determined threshold.
Dependent claims: 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
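One way to picture the correlation in time score of claim 7 is the following Python sketch. The claim does not give a formula, so the specific choices here are assumptions: each match is a hypothetical `(query_frame, reference_frame, similarity)` tuple, the frame correlation compares the frame gap in the query with the gap in the reference, and the pair score is the product of the two similarity scores and the frame correlation.

```python
from itertools import combinations


def frame_correlation(q1, r1, q2, r2):
    """Return 1.0 when the query and reference frame gaps agree, falling to 0.0."""
    dq, dr = q2 - q1, r2 - r1
    denom = max(abs(dq), abs(dr), 1)
    return max(0.0, 1.0 - abs(dq - dr) / denom)


def pair_score(m1, m2):
    """Combine two similarity scores with the frame correlation (assumed: product)."""
    (q1, r1, s1), (q2, r2, s2) = m1, m2
    return s1 * s2 * frame_correlation(q1, r1, q2, r2)


def correlation_in_time(matches):
    """Average the pair score over all pairs of matched signatures."""
    pairs = list(combinations(matches, 2))
    if not pairs:
        return 0.0
    return sum(pair_score(a, b) for a, b in pairs) / len(pairs)


def is_matching_sequence(matches, threshold):
    """Declare a match when the correlation in time score is above the threshold."""
    return correlation_in_time(matches) > threshold
```

With this scoring, matches that keep the same temporal spacing in query and reference reinforce each other, while isolated hits at inconsistent positions contribute nothing.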
21. A method of generating a score for confidence of matching a query media sequence with a reference media sequence, the method comprising:
generating a feature correlation score based on a correlation between multiple signatures of matching frames of the query media sequence and the reference media sequence;
generating a sequence correlation score using relative differences in frame numbers of the reference media sequence and the query media sequence;
generating a match confidence score based on a function of the feature correlation score and the sequence correlation score for the reference media sequence and the query media sequence; and
adding the reference media sequence with a generated match confidence score that exceeds a selectable confidence threshold to a list of matching media sequences, wherein the selectable confidence threshold is selected based on an accuracy chart that describes a false positive rate for types of distortion in content quality.
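The scoring steps of claim 21 can be sketched in Python as follows. The claim leaves the actual functions open, so everything concrete here is an assumption: the feature score is a plain mean of per-frame similarities, the sequence score is the share of matched frames whose offset agrees with the median offset, the confidence is their product, and the accuracy chart is modeled as a hypothetical mapping from distortion type to `(threshold, false_positive_rate)` pairs.

```python
from statistics import mean, median


def feature_correlation(similarities):
    """Assumed feature score: mean similarity over the matching frames."""
    return mean(similarities)


def sequence_correlation(frame_pairs, tolerance=1):
    """Assumed sequence score from relative frame-number differences.

    frame_pairs holds (query_frame, reference_frame) tuples. The score is the
    fraction of pairs whose offset lies within `tolerance` of the median offset.
    """
    offsets = [r - q for q, r in frame_pairs]
    mid = median(offsets)
    return sum(abs(o - mid) <= tolerance for o in offsets) / len(offsets)


def match_confidence(similarities, frame_pairs):
    """Assumed combining function: product of the two scores."""
    return feature_correlation(similarities) * sequence_correlation(frame_pairs)


def select_threshold(accuracy_chart, distortion, max_fpr):
    """Pick the lowest threshold whose charted false positive rate is acceptable.

    Raises ValueError if no charted threshold meets max_fpr.
    """
    candidates = [t for t, fpr in accuracy_chart[distortion] if fpr <= max_fpr]
    return min(candidates)
```

A reference sequence would then be added to the list of matches only when `match_confidence(...)` exceeds the threshold returned by `select_threshold(...)` for the expected distortion type.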
22. A method of performing fast sequence correlation comprising:
performing a fast similarity search using a direct hash index of signatures to identify a first plurality of likely matching chapters of a query media sequence and a reference media sequence;
performing a sequence correlation on a reference chapter and a query chapter to identify a second plurality of likely matching chapters of the query media sequence and the reference media sequence;
performing the fast similarity search and the sequence correlation in parallel on separate partitions of a reference database having a plurality of reference media sequences;
thresholding the first plurality of likely matching chapters and the second plurality of likely matching chapters to eliminate reference media sequences that have a low likelihood of matching and to determine a plurality of most likely matching reference media sequences; and
selecting the best matches from among the plurality of the most likely matching reference media sequences.
Dependent claims: 23
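The overall flow of claim 22 (hash lookup, sequence correlation, per-partition parallelism, thresholding, best-match selection) can be outlined with the Python sketch below. It is a simplification under stated assumptions: chapters are not modeled, so each reference sequence is treated as a single chapter; each partition is a hypothetical dict mapping a signature to `(ref_id, ref_frame)` entries; the sequence correlation is the share of query signatures that agree on the most common frame offset; and a thread pool stands in for whatever parallel hardware a real system would use.

```python
from collections import Counter, defaultdict
from concurrent.futures import ThreadPoolExecutor


def search_partition(partition, query, min_hits, min_corr):
    """Search one partition. query is a list of (query_frame, signature)."""
    # Fast similarity search: direct hash lookup of every query signature.
    hits = defaultdict(list)
    for q_frame, sig in query:
        for ref_id, r_frame in partition.get(sig, ()):
            hits[ref_id].append(r_frame - q_frame)

    results = []
    for ref_id, offsets in hits.items():
        # First threshold: drop references with too few signature hits.
        if len(offsets) < min_hits:
            continue
        # Sequence correlation (assumed): hits sharing the most common offset.
        corr = Counter(offsets).most_common(1)[0][1] / len(query)
        # Second threshold: drop references with weak temporal agreement.
        if corr >= min_corr:
            results.append((corr, ref_id))
    return results


def find_best_matches(partitions, query, min_hits=2, min_corr=0.5, top_k=1):
    """Run the search on all partitions in parallel and keep the top_k matches."""
    with ThreadPoolExecutor() as pool:
        per_partition = pool.map(
            lambda p: search_partition(p, query, min_hits, min_corr), partitions
        )
        merged = [r for part in per_partition for r in part]
    return sorted(merged, reverse=True)[:top_k]
```

Because each partition is searched independently, the merge step only has to compare the few candidates that survive both thresholds, which is what keeps the final selection cheap.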
Specification