Multi-media content identification using multi-level content signature correlation and fast similarity search

US 20100306193A1
Filed: 05/27/2010
Published: 12/02/2010
Est. Priority Date: 05/28/2009
Status: Active Grant

Abstract

A method is presented for querying large media databases and identifying media entries based on multi-level similarity search and reference-query entry correlation. Media content fingerprinting detects unique features and generates discriminative descriptors and signatures, which are used to form a preliminary reference database. The preliminary reference database is processed, and a subset of it is selected to form a final reference database. To identify a media query, a fast similarity search is first performed on the reference database, resulting in a preliminary set of likely matching videos. For each likely matching video in the preliminary set, a further multi-level correlation is performed, which includes iterative refinement, sub-sequence merging, and final result classification.

23 Claims

1. A method of preprocessing media content for storage in a media reference database, the method comprising:
- generating a signature term frequency (STF) for each signature, wherein the STF represents a measure of uniqueness for each signature as compared to existing signatures in the media reference database;
  
  entering each signature in the media reference database whose STF is less than a specified threshold, wherein the specified threshold represents a level of information content and uniqueness for a signature.
- Dependent claims (2, 3, 4, 5, 6)
  - 2. The method of claim 1, wherein the STF is generated for each signature by counting the number of times each signature appears in the same reference content within a specified time window.
  - 3. The method of claim 1, wherein the STF is generated for each signature by combining the number of times each signature appears in the same reference content within a specified time window and the number of times the signature occurs in the entire database.
  - 4. The method of claim 1, wherein the STF is generated for each signature by combining the number of times each signature appears in the same reference content.
  - 5. The method of claim 1 wherein the STF is increased by a large value for each signature when the signature is similar to a signature in the dictionary list, wherein a distance measure is used to detect similarity and signatures on the dictionary list are a selected list of frequently occurring signatures observed from current and previous databases.
  - 6. The method of claim 1, further comprising:
    - entering each signature in a dictionary of frequent signatures that is a collection of frequent signatures generated from one or more video reference databases, where each signature has more than a specified number of similar matching signatures within a certain bit error distance.
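
The filtering described in claims 1 to 3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the additive combination of the local and database-wide counts, and the `db_weight` parameter are all assumptions made for the example, and signatures are treated as exact-match integers rather than being compared by a distance measure.

```python
def signature_term_frequency(signatures, window, db_counts=None, db_weight=1.0):
    """Return one STF value per signature of a single reference content.

    signatures: list of (time, signature) pairs.
    window:     time window for counting repeats (claim 2).
    db_counts:  optional dict mapping signature -> occurrences in the whole
                database (claim 3); combined additively here, an assumption.
    """
    stf = []
    for t, sig in signatures:
        # Local count: identical signatures within the time window (includes itself).
        local = sum(1 for u, other in signatures
                    if other == sig and abs(u - t) <= window)
        global_count = db_counts.get(sig, 0) if db_counts else 0
        stf.append(local + db_weight * global_count)
    return stf


def select_for_database(signatures, window, threshold, db_counts=None):
    """Keep only signatures whose STF is below the threshold (claim 1)."""
    stf = signature_term_frequency(signatures, window, db_counts)
    return [entry for entry, f in zip(signatures, stf) if f < threshold]


# Signature 0xA repeats three times early on, so those copies are dropped;
# the isolated copy at t=50 and the single 0xB are kept.
content = [(0, 0xA), (1, 0xA), (2, 0xA), (3, 0xB), (50, 0xA)]
kept = select_for_database(content, window=5, threshold=3)
```

A lower STF means the signature is rarer, so the threshold acts as a cap on how repetitive a stored signature may be.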

7. A method to detect a query sequence of audio and video signatures in a database of audio and video signatures, the method comprising:
- searching the database of audio and video signatures in response to a query sequence of audio and video signatures using a hash index for each query signature;
  
  retrieving a set of database signatures that are similar as determined by a distance measure of the signatures to the query sequence of audio and video signatures in response to use of the hash index for each query signature to select a database entry;
  
  performing a correlation in time between corresponding pairs of signatures from the set of database signatures and the query sequence of audio and video signatures; and
  
  identifying a matching sequence between query and reference if the correlation in time generates a score above a determined threshold.
- Dependent claims (8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
  - 8. The method of claim 7, further comprising:
    - accessing a second database and performing a correlation in time between corresponding pairs of signatures from a second set of database signatures and the query sequence of audio and video signatures, wherein the second database is indexed by video id and time location.
  - 9. The method of claim 8 wherein a correlation in space is performed in a spatial domain using the x,y axes coordinate location and a scale size of matching feature points for pairs of matching reference and query signatures as factors in calculating an individual correlation score.
  - 10. The method of claim 7, further comprising:
    - performing a correlation in space between corresponding pairs of signatures from a second set of database signatures and the query sequence video signatures using the spatial information in the associated data of the signatures, wherein the second database is indexed by video id and time location.
  - 11. The method of claim 7 further comprising:
    - sub-dividing audio and videos into smaller chapters based on time; and
      
      searching the database of audio and video signatures in a cluster search by use of the cluster index to detect chapters of the matching audio and video sequence, wherein the individual reference audio and videos are sub-divided into smaller chapters and the most likely audio and video chapters are returned as a result of the cluster search.
  - 12. The method of claim 11, further comprising:
    - detecting a likely time location of the matching video section, wherein the query consists of multiple overlapping query windows.
  - 13. The method of claim 7 further comprising:
    - binning for multiple trend lines within a selected range of query and original video frames as part of a correlation method to detect the best matching sequence within matching reference audio and video sequences; and
      
      selecting the best bin.
  - 14. The method of claim 7 further comprising:
    - performing frame to frame signature correlation on a detected trend line to generate a correlation score; and
      
      thresholding the correlation score of matching sequences or individual frames to detect a matching sequence.
  - 15. The method of claim 7, further comprising:
    - merging overlapping detected sequences that are overlapping and have similar slopes to generate merged overlapping sequences;
      
      merging non-overlapping detected sequences that have a small gap and have similar slopes to generate merged non-overlapping sequences;
      
      combining the merged overlapping sequences and the merged non-overlapping sequences to generate merged sequences; and
      
      retaining in the database of audio and video signatures the merged sequences that have a relatively better correlation score as compared to the previously detected original or best merged sequence.
  - 16. The method of claim 7, further comprising:
    - identifying an exact start and an end of a matching sequence between query and original videos;
      
      iteratively extending the query and original start or end frame numbers so as to evaluate iteratively longer sequences; and
      
      retaining extended sequences that have a relatively better correlation score to improve the accuracy of detected sequences.
  - 17. The method of claim 7 further comprising:
    - performing in parallel a similarity search and a time correlation on separate partitions of the database of audio and video signatures;
      
      sorting the detected sequences according to a measure of the similarity of signatures; and
      
      selecting the best matches to report to a user.
  - 18. The method of claim 7 further comprising:
    - weighing individual signatures in one or more databases based on the uniqueness of each signature.
  - 19. The method of claim 7 further comprising:
    - accessing a different database generated using orthogonal information to that in the database of audio and video signatures;
      
      and performing a correlation in time between corresponding pairs of signatures from the orthogonal set of database signatures and the query sequence of audio and video signatures.
  - 20. The method of claim 7 and a method of tracking a detected sequence object viewpoint further comprising:
    - performing trend correlation for an extended query and original sequence to enable faster correlation; and
      
      enabling a wider viewpoint detection.
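
The core steps of claim 7, together with the trend-line binning of claim 13, can be sketched as below. This is an illustrative simplification under stated assumptions: signatures are integers, the hash index is formed from the low-order bits selected by a hypothetical `hash_mask`, similarity is Hamming distance, and the time correlation only considers trend lines of slope 1 (a constant frame offset between query and reference). None of these choices is taken from the patent text.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of differing bits between two integer signatures."""
    return bin(a ^ b).count("1")


def build_hash_index(reference, hash_mask=0xFF):
    """Index reference entries (video_id, frame, signature) by hashed bits."""
    index = defaultdict(list)
    for video_id, frame, sig in reference:
        index[sig & hash_mask].append((video_id, frame, sig))
    return index


def find_match(index, query, max_distance=2, bin_width=2,
               score_threshold=3, hash_mask=0xFF):
    """Return (video_id, approximate_offset, score) or None.

    query: list of (frame, signature) pairs.
    Each similar query/reference pair votes for a bin of the frame offset
    (reference frame minus query frame); the fullest bin is the best trend line.
    """
    votes = defaultdict(int)
    for q_frame, q_sig in query:
        # Hash lookup selects candidate database entries for this query signature.
        for video_id, r_frame, r_sig in index.get(q_sig & hash_mask, []):
            if hamming(q_sig, r_sig) <= max_distance:
                offset_bin = (r_frame - q_frame) // bin_width
                votes[(video_id, offset_bin)] += 1
    if not votes:
        return None
    (video_id, offset_bin), score = max(votes.items(), key=lambda kv: kv[1])
    # Report a match only if the correlation score is above the threshold.
    if score > score_threshold:
        return (video_id, offset_bin * bin_width, score)
    return None
```

A production system would also probe neighboring hash buckets so that bit errors inside the hashed bits do not hide a match; the sketch omits this for brevity.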

21. A method of generating a likelihood score for a pair of query media frame content items and correlating between matching frames of the query and reference media content frames, the method comprising:
- generating a correlation score based on an individual frame or view similarity score, wherein the frame correlation score can be generated from a correlation between multiple signatures of different features of the query and original frame;
  
  generating a time correlation using relative differences in frame numbers of the original video and the query video; and
  
  generating a correlation between the original video and the query video by using a correlation of individual frames alone and without using a time sequence in the query media frame content and in the reference media content frames, wherein the reference media content frames is an entry in a reference media database.
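
One way to picture the two scores in claim 21 is the sketch below. It is only an assumption-laden illustration: the per-frame score is taken as the mean bitwise similarity across several feature signatures, and the time correlation as the fraction of matched frame pairs whose frame-number difference stays close to the median difference. The function names, the 16-bit signature width, and the `tolerance` parameter are hypothetical.

```python
from statistics import median


def frame_score(query_sigs, ref_sigs, bits=16):
    """Mean bitwise similarity over paired feature signatures of one frame.

    query_sigs and ref_sigs are equal-length, non-empty lists of integers,
    one signature per feature type.
    """
    sims = [1 - bin(q ^ r).count("1") / bits
            for q, r in zip(query_sigs, ref_sigs)]
    return sum(sims) / len(sims)


def time_correlation(pairs, tolerance=1):
    """Fraction of (query_frame, ref_frame) pairs with a consistent offset.

    The offset is the relative difference in frame numbers; pairs whose
    offset lies within `tolerance` of the median offset count as consistent.
    """
    offsets = [r - q for q, r in pairs]
    mid = median(offsets)
    return sum(abs(o - mid) <= tolerance for o in offsets) / len(offsets)


# Two feature signatures per frame: one identical, one differing in 4 of 16 bits.
score = frame_score([0xFFFF, 0x0000], [0xFFFF, 0x000F])          # 0.875
# Three pairs share offset 100; the fourth is an outlier.
consistency = time_correlation([(0, 100), (1, 101), (2, 102), (3, 250)])  # 0.75
```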

22. A method of performing very fast sequence correlation comprising:
- performing a fast similarity search using a direct hash index of signatures to identify the likely matching chapters of the query and reference;
  
  performing sequence correlation on a reference chapter and query chapter;
  
  performing the fast similarity search and correlation on separate partitions or servers in parallel;
  
  thresholding the detected sequences to eliminate sequences; and
  
  selecting the best matches.
- Dependent claims (23)
  - 23. The method of claim 22 further comprising:
    - using a classifier with input from multiple matching sections, such as frames or sequences of video or viewpoints of an object, to identify partitioned sections of the content databases that are likely to match; and
      
      performing the similarity search and sequence correlation on these selected database partitions.
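
The partition-parallel flow of claim 22 (search each partition, threshold the detections, select the best) might look like the following sketch. The scoring function is a deliberately crude stand-in, namely the fraction of query signatures found in a reference chapter, and the data layout, function names, and use of a thread pool are assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def search_partition(partition, query):
    """Score every chapter in one partition against a non-empty query.

    partition: dict mapping chapter_id -> list of signatures.
    Returns a list of (chapter_id, score) with score in [0, 1].
    """
    query_set = set(query)
    return [(chapter_id, len(query_set & set(sigs)) / len(query_set))
            for chapter_id, sigs in partition.items()]


def parallel_search(partitions, query, threshold=0.5, top_k=3):
    """Search all partitions in parallel, threshold, and keep the best matches."""
    with ThreadPoolExecutor() as pool:
        per_partition = list(pool.map(lambda p: search_partition(p, query),
                                      partitions))
    # Thresholding eliminates weak detections before ranking.
    candidates = [hit for hits in per_partition for hit in hits
                  if hit[1] > threshold]
    return sorted(candidates, key=lambda hit: hit[1], reverse=True)[:top_k]


partitions = [
    {"a1": [1, 2, 3, 4], "a2": [9, 9]},
    {"b1": [1, 2, 7, 8]},
]
best = parallel_search(partitions, query=[1, 2, 3, 4])
```

Threads keep the example self-contained; separate processes or servers, as the claim mentions, would follow the same map, threshold, and select pattern.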

Current Assignee
Roku, Inc.
Original Assignee
Zeitera LLC
Inventors
Kulkarni, Sunil Suresh; Gajjar, Pradipkumar Dineshbhai; Merchant, Shashank; Pereira, Jose Pio; Ramanathan, Prashant

Granted Patent

US 8,335,786 B2
US Class Current

707/728
CPC Class Codes

G06F 16/7328   Query by example, e.g. a co...

G06F 16/783   using metadata automaticall...

G06V 20/48   Matching video sequences
