Method of merging large databases in parallel

  • US 5,717,915 A
  • Filed: 03/04/1996
  • Issued: 02/10/1998
  • Est. Priority Date: 03/15/1994
  • Status: Expired due to Term
First Claim

1. A method for identifying duplicate records in a database, each record having at least one field and a plurality of keys, comprising the steps of pre-processing the records in the database using a thesaurus database to indicate relatedness, and:

    (i)(a) sorting the records according to a criteria applied to a first key;

    (b) comparing a number of consecutive sorted records to each other, wherein said number is less than a number of records in said database and identifying a first group of duplicate records;

    (c) storing the identity of said first group;

    (ii)(a) sorting the records according to a criteria applied to a second key;

    (b) comparing a number of consecutive sorted records to each other, wherein said number is less than a number of records in said database and identifying a second group of duplicate records;

    (c) storing the identity of said second group; and

    (iii) subjecting the union of said first and second groups to transitive closure.
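
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`windowed_pairs`, `transitive_closure`, `find_duplicates`), the `window` parameter, and the caller-supplied `is_duplicate` rule are all hypothetical, and the thesaurus pre-processing step is omitted. Each pass sorts the records on one key and compares only records that fall within a small sliding window of the sorted order; the matched pairs from all passes are then united and closed transitively, here with a union-find structure as one possible way to compute the closure.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Record = Dict[str, str]
Pair = Tuple[int, int]


def windowed_pairs(
    records: Sequence[Record],
    key: Callable[[Record], str],
    is_duplicate: Callable[[Record, Record], bool],
    window: int,
) -> Set[Pair]:
    """One pass: sort on `key`, then compare each record only with the
    next `window - 1` records in sorted order (steps (a) and (b))."""
    if not 2 <= window < len(records):
        raise ValueError("window must be >= 2 and smaller than the database")
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs: Set[Pair] = set()
    for pos, i in enumerate(order):
        for j in order[pos + 1 : pos + window]:
            if is_duplicate(records[i], records[j]):
                pairs.add((min(i, j), max(i, j)))
    return pairs


def transitive_closure(n: int, pairs: Set[Pair]) -> List[List[int]]:
    """Merge matched pairs into groups using union-find (step (iii))."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    # Report only groups that actually contain duplicates.
    return sorted(g for g in groups.values() if len(g) > 1)


def find_duplicates(
    records: Sequence[Record],
    keys: Sequence[Callable[[Record], str]],
    is_duplicate: Callable[[Record, Record], bool],
    window: int = 3,
) -> List[List[int]]:
    """Run one windowed pass per key, store the pairs found in each pass
    (step (c)), and close their union transitively."""
    all_pairs: Set[Pair] = set()
    for key in keys:
        all_pairs |= windowed_pairs(records, key, is_duplicate, window)
    return transitive_closure(len(records), all_pairs)
```

Because each pass compares a record with at most `window - 1` neighbors, the comparison work grows with the number of records times the window size instead of with the number of all record pairs. The closure step is what lets two records be grouped even when no single pass placed them close enough to be compared directly.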
