Document classification and characterization

  • US 8,396,871 B2
  • Filed: 01/26/2011
  • Issued: 03/12/2013
  • Est. Priority Date: 01/26/2011
  • Status: Active Grant
First Claim

1. A method comprising:

    receiving, by at least one data processor, data characterizing each of a plurality of documents within a document set;

    grouping, by at least one data processor, the plurality of documents into a plurality of stacks using one or more grouping algorithms, wherein key words are identified in each document and weights specified by a scorecard scoring model are assigned to variables corresponding to each key word, wherein a scoring algorithm, using corresponding variables and weights, provides a score for each document which is used by the grouping algorithm when grouping the documents;

    identifying, by at least one data processor, a prime document for each stack, the prime document including attributes representative of the entire stack;

    providing, by at least one data processor, data characterizing documents for each stack including at least the identified prime document to at least one human reviewer;

    receiving, by at least one data processor, user-generated input from the human reviewer categorizing each provided document;

    providing, by at least one data processor, data characterizing the user-generated input;

    evaluating, by at least one data processor, identified grouping errors using at least one of Z-test techniques and multiple regression techniques;

    determining, by at least one data processor based on the evaluating, a relative contribution of the variables used by the grouping algorithm to the grouping errors; and

    modifying, by at least one data processor, the grouping algorithm so that at least one weight assigned by the grouping algorithm off-sets the relative contribution of the variables used by the grouping algorithm to an error rate, wherein subsequently received documents are grouped using the modified grouping algorithm.
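
The steps recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation, which the claim does not disclose at this level of detail. The scorecard weights, the score-bucket grouping rule, the "closest to the stack mean" rule for choosing a prime document, and the use of a two-proportion Z statistic to compare error rates between documents that do and do not contain a given key word are all assumptions made for illustration. Every function and variable name below is hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical scorecard: key word -> weight (values are illustrative only).
SCORECARD = {"invoice": 2.0, "contract": 3.0, "payment": 1.0}


def score(text, scorecard=SCORECARD):
    """Sum the weights of the scorecard key words present in the text."""
    words = set(text.lower().split())
    return sum(weight for key_word, weight in scorecard.items() if key_word in words)


def group_into_stacks(docs, bucket_width=2.0):
    """Group documents whose scores fall in the same bucket (assumed rule)."""
    stacks = defaultdict(list)
    for doc_id, text in docs.items():
        stacks[int(score(text) // bucket_width)].append(doc_id)
    return dict(stacks)


def prime_document(stack, docs):
    """Pick the document whose score is closest to the stack's mean score."""
    scores = {doc_id: score(docs[doc_id]) for doc_id in stack}
    mean = sum(scores.values()) / len(scores)
    # Ties are broken by document id so the choice is deterministic.
    return min(sorted(stack), key=lambda doc_id: abs(scores[doc_id] - mean))


def two_proportion_z(err_a, n_a, err_b, n_b):
    """Z statistic comparing the error rates of two groups (pooled variance)."""
    p_a, p_b = err_a / n_a, err_b / n_b
    pooled = (err_a + err_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se


# Illustrative data, not taken from the patent.
docs = {
    "d1": "invoice payment due",
    "d2": "invoice attached",
    "d3": "contract and payment terms",
    "d4": "contract renewal",
}
stacks = group_into_stacks(docs)
primes = {key: prime_document(members, docs) for key, members in stacks.items()}

# Assumed reviewer feedback: 30 of 100 documents containing a key word were
# misgrouped, versus 10 of 100 documents without it.
z = two_proportion_z(30, 100, 10, 100)
```

Under these assumptions, a large positive Z statistic for a key word would suggest that its variable contributes disproportionately to grouping errors, and its weight in the scorecard could then be reduced before subsequent documents are grouped. How the claimed method actually apportions that contribution, for example through multiple regression, is not specified in the claim text.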
