System and method for identifying substantially similar files

  • US 8,185,507 B1
  • Filed: 04/05/2007
  • Issued: 05/22/2012
  • Est. Priority Date: 04/20/2006
  • Status: Expired due to Fees
First Claim

1. A system, comprising:

  • a database configured to store data associated with a first file and a second file; and

  • a processor configured to determine if binary data associated with a first file and a second file are substantially similar, the processor configured to run a first hashing algorithm against a first portion of a first file to generate a first hash value, the first portion being a first predetermined subset of binary data in the first file, and running a second hashing algorithm against the first portion of the first file to generate a second hash value, to determine whether the first hash value and the second hash value are substantially similar to a third hash value and a fourth hash value associated with a second portion of a second file, the third hash value generated using the first hashing algorithm and the fourth hash value generated using the second hashing algorithm, the second portion being a second predetermined subset of binary data in the second file, the second file further having one or more attributes that are substantially similar to one or more corresponding attributes associated with the first file, the processor further configured to identify a uniform resource locator (URL) of the second file if the first hash value and the second hash value are substantially similar to the third hash value and the fourth hash value associated with the second portion of the second file.
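
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made only for illustration: the claim does not name the hashing algorithms (SHA-1 and SHA-256 are used here), does not fix the predetermined subset (the first 4096 bytes are used), and does not define "substantially similar" (hash values are compared for equality, and file size within a 5% tolerance stands in for the similar attribute). The names `FileRecord`, `portion_hashes`, `make_record`, and `find_similar_url` are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional, Tuple

PORTION_SIZE = 4096  # assumed size of the "predetermined subset" of binary data


@dataclass(frozen=True)
class FileRecord:
    """One database entry for a previously seen file (hypothetical layout)."""
    url: str
    size: int          # attribute used for the attribute-similarity check
    first_hash: str    # produced by the first hashing algorithm
    second_hash: str   # produced by the second hashing algorithm


def portion_hashes(data: bytes, offset: int = 0,
                   length: int = PORTION_SIZE) -> Tuple[str, str]:
    """Run two different hashing algorithms over the same portion of a file."""
    portion = data[offset:offset + length]
    return (hashlib.sha1(portion).hexdigest(),
            hashlib.sha256(portion).hexdigest())


def make_record(url: str, data: bytes) -> FileRecord:
    """Build the stored record for a file located at `url`."""
    first, second = portion_hashes(data)
    return FileRecord(url, len(data), first, second)


def find_similar_url(data: bytes, database: List[FileRecord],
                     size_tolerance: float = 0.05) -> Optional[str]:
    """Return the URL of a stored file whose attribute and both hashes match.

    Equality of hashes is a simplification of "substantially similar".
    """
    first, second = portion_hashes(data)
    for record in database:
        size_close = abs(record.size - len(data)) <= size_tolerance * max(len(data), 1)
        if size_close and record.first_hash == first and record.second_hash == second:
            return record.url
    return None
```

Under these assumptions, a file that shares its first 4096 bytes with a stored file and differs in size by less than 5% resolves to the stored file's URL, while a file with a different leading portion, or a very different size, returns `None`. Using two independent hashes over the same portion reduces the chance that a collision in one algorithm produces a false match.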
