METHOD FOR ORGANIZING LARGE NUMBERS OF DOCUMENTS

  • US 20100287466A1
  • Filed: 07/20/2010
  • Published: 11/11/2010
  • Est. Priority Date: 07/02/2007
  • Status: Active Grant
First Claim

1. A method for organizing documents into nodes, in which a node represents a group of substantially equivalent documents, said method comprising:

  • (i) providing a plurality of original documents, each comprising a header and a body, and wherein said header comprises at least one parameter and wherein said body comprises text;

  • (ii) selecting a document from among said documents and associating the document with a node, comparing at least a portion of the body text of said document to at least a portion of the body texts of other documents from amongst said plurality of documents, and in the case of a match, merging the node associated with said document with a node associated with the matching document;

  • (iii) searching the body of said document to locate a first instance of header-type text, wherein said header-type text contains at least one header parameter;

  • (iv) constructing a presumed document comprising a header and a body, wherein said header of said presumed document comprises one or more parameters from said header-type text located within said body of said original document, and wherein said body of said presumed document substantially comprises the text located after said header-type text in said body of said original document, and associating said presumed document with a node;

  • (v) comparing at least a portion of the body text of the presumed document to at least a portion of the body texts of at least one other document from among said plurality of documents and in the case of a match, merging a node associated with said presumed document with a node associated with the matching document;

  • (vi) if the comparison of (v) does not find a match, processing repeatedly the remainder of the body of said document for successive instances of header-type text, as stipulated in stages (iii)-(v), and for each instance, constructing a presumed document, comparing for any matching documents to the presumed document, and if found, merging the nodes associated with the matching documents, until no new instances of header-type text are found.
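Steps (i)-(vi) can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that header-type text looks like e-mail header lines (`From:`, `To:`, `Sent:`, `Date:`, `Subject:`), that a "match" means identical body text after whitespace and case normalization, and that presumed documents are indexed alongside the originals. The names `Document`, `Nodes`, `next_header_block`, and `organize` are hypothetical and do not come from the claim.

```python
import re
from dataclasses import dataclass

# Assumption: header-type text is a run of e-mail style header lines.
HEADER_LINE = re.compile(r"^(From|To|Sent|Date|Subject):[ \t]*(.+)$", re.MULTILINE)

@dataclass
class Document:
    header: dict
    body: str

class Nodes:
    """Union-find: each disjoint set is one node of equivalent documents."""
    def __init__(self):
        self.parent = []

    def add(self):
        self.parent.append(len(self.parent))
        return len(self.parent) - 1

    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def merge(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def normalize(text):
    # Assumption: bodies match when equal after collapsing whitespace and case.
    return " ".join(text.split()).lower()

def next_header_block(body, start):
    """Step (iii): return (params, offset after the block) or None."""
    m = HEADER_LINE.search(body, start)
    if m is None:
        return None
    params, end = {}, m.start()
    while True:  # collect consecutive header lines
        m = HEADER_LINE.match(body, end)
        if m is None:
            break
        params[m.group(1)] = m.group(2).strip()
        end = m.end() + 1  # step over the newline
    return params, min(end, len(body))

def organize(documents):
    nodes, by_body, labels = Nodes(), {}, []

    def place(label, body):
        """Associate a document with a node; merge on a body match."""
        node = nodes.add()
        labels.append((label, node))
        key = normalize(body)
        if key in by_body:
            nodes.merge(node, by_body[key])
            return True
        by_body[key] = node
        return False

    for i, doc in enumerate(documents):
        place(f"doc{i}", doc.body)                            # step (ii)
        pos, count = 0, 0
        while True:
            block = next_header_block(doc.body, pos)          # step (iii)
            if block is None:
                break                                         # no more header-type text
            params, pos = block
            count += 1
            presumed = Document(params, doc.body[pos:])       # step (iv)
            if place(f"doc{i}.presumed{count}", presumed.body):  # step (v)
                break                                         # match found, stop, per (vi)

    groups = {}
    for label, node in labels:
        groups.setdefault(nodes.find(node), []).append(label)
    return groups

docs = [
    Document({"From": "alice", "Subject": "Hi"}, "Hello Bob"),
    Document({"From": "bob", "Subject": "Re: Hi"},
             "Thanks!\nFrom: alice\nSubject: Hi\nHello Bob"),
]
for members in organize(docs).values():
    print(members)
```

In this example the reply (`doc1`) quotes the original message, so the presumed document built from the quoted header lands in the same node as `doc0`, while `doc1` itself keeps its own node. A real system would likely use a more tolerant comparison than exact normalized equality, since the claim only requires comparing "at least a portion" of the body texts.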
