Apparatus and methods for scalable object clustering

  • US 8,566,317 B1
  • Filed: 01/06/2010
  • Issued: 10/22/2013
  • Est. Priority Date: 01/06/2010
  • Status: Active Grant
First Claim

1. An apparatus configured to efficiently group a set of strings into clusters of related strings, the apparatus comprising:

    data storage configured to store computer-readable code and data;

    a processor configured to access the data storage and to execute said computer-readable code;

    computer-readable code configured to receive the set of strings;

    computer-readable code configured to determine a binary output of an evaluation function between a pair of strings by steps including (i) generating a hash table based on a first string, (ii) matching sub-strings of a second string against the first string using the hash table, (iii) recording matches in a list, and (iv) applying a threshold based at least in part on a length of common substrings between the first and second strings; and

    computer-readable code configured to group the strings in the set into clusters by a procedure which, for each string that does not already belong to a cluster, determines the binary output of the evaluation function between the string and each other string in the set that do not yet belong to any cluster.
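For illustration only, the evaluation function and grouping procedure recited in claim 1 might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the names `build_hash_table`, `related`, and `cluster`, the fixed sub-string length `k`, and the threshold `min_common` (total matched length) are all hypothetical choices that the claim does not specify.

```python
from collections import defaultdict


def build_hash_table(first, k):
    """Step (i): map each length-k sub-string of `first` to its start positions."""
    table = defaultdict(list)
    for i in range(len(first) - k + 1):
        table[first[i:i + k]].append(i)
    return table


def related(first, second, k=3, min_common=6):
    """Binary evaluation function between two strings (assumed parameters k, min_common)."""
    table = build_hash_table(first, k)
    matches = []  # step (iii): list of (position in second, position in first, length)
    j = 0
    while j <= len(second) - k:
        best_len, best_pos = 0, -1
        # Step (ii): look up the sub-string of `second` starting at j in the hash table.
        for i in table.get(second[j:j + k], ()):
            n = k
            # Extend the match greedily while characters keep agreeing.
            while i + n < len(first) and j + n < len(second) and first[i + n] == second[j + n]:
                n += 1
            if n > best_len:
                best_len, best_pos = n, i
        if best_len:
            matches.append((j, best_pos, best_len))
            j += best_len  # skip past the matched region
        else:
            j += 1
    # Step (iv): threshold on the total length of common sub-strings (an assumed rule).
    common = sum(length for _, _, length in matches)
    return common >= min_common


def cluster(strings, k=3, min_common=6):
    """Group strings: each unclustered string is compared only with other unclustered strings."""
    assigned = [False] * len(strings)
    clusters = []
    for a in range(len(strings)):
        if assigned[a]:
            continue
        assigned[a] = True
        group = [strings[a]]
        for b in range(len(strings)):
            if not assigned[b] and related(strings[a], strings[b], k, min_common):
                assigned[b] = True
                group.append(strings[b])
        clusters.append(group)
    return clusters


groups = cluster(["abcdefgh", "xxabcdefyy", "zzzzzz"])
```

Because strings that have already joined a cluster are skipped in both loops, the number of pairwise evaluations shrinks as clusters form, which is the efficiency the claim's grouping step appears to target.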
