Method and system for clustering identified forms

  • US 7,996,390 B2
  • Filed: 02/15/2008
  • Issued: 08/09/2011
  • Est. Priority Date: 02/15/2008
  • Status: Expired due to Fees
First Claim

1. A device for organizing a plurality of documents that include forms, the device comprising:

  • a processor;

    a non-transitory computer-readable medium operably coupled to the processor, the computer-readable medium comprising instructions that, upon execution by the processor, perform operations comprising:

    (a) identifying a first form in a first document selected from a plurality of documents;

    (b) calculating a first similarity value in a first feature space between a first cluster selected from a plurality of clusters defined for the plurality of documents and the first document, the first feature space associated with a content of the first document and a content of a document assigned to the first cluster;

    (c) calculating a second similarity value in a second feature space between the first cluster and the first document, the second feature space associated with a content of the identified first form and a content of a form in the document assigned to the first cluster;

    (d) calculating a similarity value between the first document and the first cluster based on the calculated first similarity value and the calculated second similarity value;

    (e) repeating (b)-(d) with each of the plurality of clusters as the first cluster selected from the plurality of clusters;

    (f) determining a cluster of the plurality of clusters to which to assign the first document based on the calculated similarity value for each of the plurality of clusters; and

    repeat (a)-(f) for each of the plurality of documents as the first document selected from the plurality of documents until the assignments become stable, wherein determining if the assignments become stable includes calculating a number of documents of the plurality of documents that are assigned to a different cluster in (f).
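The loop recited in steps (a)-(f) can be illustrated with a minimal sketch. The claim does not name a similarity measure, a way of combining the two similarity values, a cluster representation, or a stability threshold, so everything below is an assumption made for illustration: cosine similarity, a weighted sum controlled by a hypothetical `weight` parameter, clusters represented by the mean of their members' vectors and refreshed once per pass, and stability defined as at most `max_changed` reassignments in a pass. Form identification in step (a) is assumed to have already produced one form feature vector per document.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cluster_documents(doc_vecs, form_vecs, assignments, num_clusters,
                      weight=0.5, max_changed=0, max_iters=100):
    """Reassign documents to clusters until the assignments become stable.

    doc_vecs[i]  : features of document i's content (first feature space)
    form_vecs[i] : features of the form found in document i (second feature space)
    assignments  : initial cluster index for each document
    """
    assignments = list(assignments)
    for _ in range(max_iters):
        # Assumption: each cluster is summarized by the centroids of its members.
        doc_centroids, form_centroids = {}, {}
        for c in range(num_clusters):
            members = [i for i, a in enumerate(assignments) if a == c]
            if members:  # empty clusters are skipped in this pass
                doc_centroids[c] = centroid([doc_vecs[i] for i in members])
                form_centroids[c] = centroid([form_vecs[i] for i in members])

        new_assignments = []
        changed = 0
        for i in range(len(doc_vecs)):
            best_cluster, best_score = assignments[i], float("-inf")
            for c in doc_centroids:                              # step (e)
                s1 = cosine(doc_vecs[i], doc_centroids[c])       # step (b)
                s2 = cosine(form_vecs[i], form_centroids[c])     # step (c)
                score = weight * s1 + (1.0 - weight) * s2        # step (d), assumed weighted sum
                if score > best_score:
                    best_cluster, best_score = c, score
            new_assignments.append(best_cluster)                 # step (f)
            if best_cluster != assignments[i]:
                changed += 1  # count of documents assigned to a different cluster

        assignments = new_assignments
        if changed <= max_changed:  # assumed stability test
            break
    return assignments


# Toy data: two documents per topic, with one document initially misassigned.
doc_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
form_vecs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
result = cluster_documents(doc_vecs, form_vecs, [0, 0, 1, 0], num_clusters=2)
```

On this toy input the fourth document starts in the wrong cluster and the sketch moves it during the first pass; the second pass changes nothing, so the loop stops. A real system would also need to handle documents without an identified form, which the sketch ignores.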
