
Computer-implemented system and method for generating document training sets

  • US 10,332,007 B2
  • Filed: 11/07/2016
  • Issued: 06/25/2019
  • Est. Priority Date: 08/24/2009
  • Status: Active Grant
First Claim

1. A computer-implemented method for generating document training sets, comprising:

  • providing a set of unclassified documents to each of two or more trained classifiers and receiving a classification code assigned to each unclassified document from each classifier;

    comparing via a server the classification codes assigned to each unclassified document by two or more of the classifiers, wherein the server comprises a central processing unit, memory, an input port to receive the set of unclassified documents, and an output port to provide a training set for a matter;

    determining for at least one of the unclassified documents that a disagreement exists between the classification codes from the two or more classifiers;

    providing via the server for further review the unclassified document with a disagreement in classification codes, wherein results of the further review comprise one of a new classification code and confirmation of one of the assigned classification codes;

    generating the training set for the matter via the server by grouping the unclassified documents for which the disagreement exists; and

    generating a further training set for a same or different matter, comprising:

    training two or more other classifiers by identifying features within one or more coded documents, classifying the features, and utilizing the classified features for training the other classifiers;

    identifying via the other classifiers one or more features within at least one of the unclassified documents;

    assigning by each of the other classifiers, a classification code to each of the identified features;

    comparing the classification codes assigned to each feature;

    determining whether a disagreement exists between the classification codes assigned to at least one of the features via the other classifiers;

    providing the features with a disagreement in classification codes for further review, wherein results of the further review comprise one of a new classification code and confirmation of one of the assigned classification codes; and

    grouping as the further training set the unclassified documents associated with the features for which a disagreement exists.
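
The selection logic recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It only shows the idea of keeping documents on which two or more classifiers disagree, first at the document level and then at the feature level. All names here (`find_disagreements`, `document_training_set`, `feature_training_set`, `by_keyword`) and the keyword-based toy classifiers are hypothetical, the classifiers are assumed to be already trained, and the human review step is represented only by returning the conflicting codes to the caller.

```python
from typing import Callable, Dict, Hashable, List, Sequence

# Hypothetical type: a trained classifier maps a piece of text to a classification code.
Classifier = Callable[[str], Hashable]


def find_disagreements(
    items: Sequence[str], classifiers: Sequence[Classifier]
) -> Dict[str, List[Hashable]]:
    """Return each item whose assigned codes are not unanimous, with those codes."""
    if len(classifiers) < 2:
        raise ValueError("two or more classifiers are required")
    disputed: Dict[str, List[Hashable]] = {}
    for item in items:
        codes = [classify(item) for classify in classifiers]
        if len(set(codes)) > 1:  # at least two classifiers assigned different codes
            disputed[item] = codes  # codes are kept so a reviewer can confirm or replace them
    return disputed


def document_training_set(
    documents: Sequence[str], classifiers: Sequence[Classifier]
) -> List[str]:
    """Group the documents on which the classifiers disagree."""
    return list(find_disagreements(documents, classifiers))


def feature_training_set(
    documents: Sequence[str],
    extract_features: Callable[[str], Sequence[str]],
    classifiers: Sequence[Classifier],
) -> List[str]:
    """Group the documents that contain at least one feature with conflicting codes."""
    selected = []
    for doc in documents:
        if find_disagreements(extract_features(doc), classifiers):
            selected.append(doc)
    return selected


def by_keyword(keyword: str) -> Classifier:
    """Toy stand-in for a trained classifier (assumption, for illustration only)."""
    return lambda text: "responsive" if keyword in text.lower() else "non-responsive"


docs = ["Draft contract attached", "Lunch on Friday?", "Merger contract terms"]
classifiers = [by_keyword("contract"), by_keyword("merger")]

doc_level = document_training_set(docs, classifiers)
feature_level = feature_training_set(docs, str.split, classifiers)
print(doc_level)
print(feature_level)
```

In this toy run the features are simply whitespace-separated words. A real system would substitute its own feature extraction and trained models, and would route the disputed items to a reviewer before using the grouped documents for training.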
