
Systems and methods for efficient data searching, storage and reduction

  • US 8,275,782 B2
  • Filed: 03/19/2009
  • Issued: 09/25/2012
  • Est. Priority Date: 09/15/2004
  • Status: Expired due to Fees
First Claim

1. A system for providing input data to a repository to search repository data in the repository for data that are similar to the input data, the input data being divided into one or more input chunks, the system comprising:

  • a data processor and a memory storing instructions for, for each input chunk, calculating a corresponding set of input distinguishing characteristics (IDCs), each set of IDCs comprising a plurality of distinguishing characteristics, said data processor being configured to partition the respective input chunk into a plurality of seeds, each seed being a smaller part of the respective input chunk and ordered in a seed sequence and to apply a hash function to each of the seeds to generate a plurality of hash values wherein each seed yields one hash value, characterized in that;

    said memory storing instructions configured to cause the data processor to select a subset (k) of the plurality of hash values;

    determine positions of the seeds within the seed sequence corresponding to the selected subset of hash values;

    apply a function to the determined positions to determine corresponding other positions within the seed sequence; and

    define the set of distinguishing characteristics as the hash values of the seeds at the determined other positions.
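
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions that the claim leaves open: seeds are taken as overlapping fixed-size byte windows, the hash function is BLAKE2b truncated to 64 bits, the subset of k hash values is the k largest, and the function applied to the positions is a fixed shift clamped to the last seed. The names `seed_hashes`, `distinguishing_characteristics`, `seed_size`, and `shift` are hypothetical.

```python
import hashlib


def seed_hashes(chunk: bytes, seed_size: int) -> list[int]:
    """Hash every seed of the chunk, in seed-sequence order.

    Assumption: a seed is an overlapping window of seed_size bytes, and
    BLAKE2b (64-bit digest) stands in for the unspecified hash function.
    """
    return [
        int.from_bytes(
            hashlib.blake2b(chunk[i:i + seed_size], digest_size=8).digest(), "big"
        )
        for i in range(len(chunk) - seed_size + 1)
    ]


def distinguishing_characteristics(
    chunk: bytes, seed_size: int = 4, k: int = 3, shift: int = 1
) -> list[int]:
    """Return the set of distinguishing characteristics for one input chunk."""
    hashes = seed_hashes(chunk, seed_size)
    if not hashes:
        return []  # chunk is shorter than one seed

    # Select a subset of k hash values and determine the positions of the
    # corresponding seeds (assumption: the k largest hash values are chosen).
    positions = sorted(range(len(hashes)), key=lambda i: hashes[i], reverse=True)[:k]

    # Apply a function to the determined positions to obtain other positions
    # (assumption: a fixed shift, clamped to the last seed in the sequence).
    last = len(hashes) - 1
    other_positions = [min(p + shift, last) for p in positions]

    # The distinguishing characteristics are the hash values of the seeds
    # at the other positions.
    return [hashes[p] for p in other_positions]


if __name__ == "__main__":
    data = b"the quick brown fox jumps over the lazy dog"
    print(distinguishing_characteristics(data))
```

Under these assumptions, the characteristics are taken from seeds near the extreme-valued seeds rather than from the extreme hash values themselves. Clamping at the end of the sequence can yield duplicate characteristics; a real system would need to decide how to handle that case.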
