Approach For Application-Specific Duplicate Detection

  • US 20090043767A1
  • Filed: 08/07/2007
  • Published: 02/12/2009
  • Est. Priority Date: 08/07/2007
  • Status: Abandoned Application
First Claim

1. A computer-implemented method for detecting duplicate information, the computer-implemented method comprising:

    extracting, from a certain document, first view data of a view, wherein the view includes a plurality of view components;

    identifying within said first view data, a first view component datum for each of the plurality of view components;

    generating, for the first view data, a first view signature that includes a plurality of first view component signatures;

    wherein each first view component signature of said first view signature is generated based on a first view component datum of at least one view component of said plurality of view components;

    making a determination of whether the first view data matches any other view data extracted from a plurality of other documents by comparing the plurality of first view signatures against other view signatures of said plurality of other documents; and

    establishing the certain document as a duplicate based on the determination.
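
The steps recited in the claim can be illustrated with a minimal sketch. This is not the applicant's implementation: it assumes that a view has already been extracted from each document as a mapping from component name to component datum, that a component signature is a SHA-1 hash of whitespace- and case-normalized text, and that two views match only when every component signature is equal. The names `component_signature`, `view_signature`, `is_duplicate`, and the `title`/`body` components are hypothetical.

```python
import hashlib
from typing import Dict, Iterable, Set, Tuple

ViewData = Dict[str, str]          # component name -> component datum (assumed shape)
ViewSignature = Tuple[str, ...]    # one component signature per view component


def component_signature(datum: str) -> str:
    """Hash one view component datum (assumption: SHA-1 of normalized text)."""
    normalized = " ".join(datum.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def view_signature(view_data: ViewData, components: Iterable[str]) -> ViewSignature:
    """Build a view signature made up of one signature per view component."""
    return tuple(component_signature(view_data.get(name, "")) for name in components)


def is_duplicate(candidate: ViewSignature, others: Set[ViewSignature]) -> bool:
    """Compare a view signature against those of other documents (exact match assumed)."""
    return candidate in others


# Hypothetical view with two components.
COMPONENTS = ("title", "body")

# Signatures of views extracted from previously processed documents.
seen = {view_signature({"title": "Hello World", "body": "Some text."}, COMPONENTS)}

# A new document whose view differs only in case and spacing.
doc = {"title": "hello   world", "body": "some TEXT."}
duplicate = is_duplicate(view_signature(doc, COMPONENTS), seen)

# A document with a different body is not flagged under this exact-match rule.
other = {"title": "Hello World", "body": "Different text."}
not_duplicate = is_duplicate(view_signature(other, COMPONENTS), seen)
```

Keeping one signature per component, rather than a single hash of the whole view, is what would let an application compare only the components it cares about; the exact-match comparison used here is the simplest possible choice and is an assumption of this sketch.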

  • 3 Assignments