System for enhancing expert-based computerized analysis of a set of digital documents and methods useful in conjunction therewith

  • US 8,527,523 B1
  • Filed: 09/14/2009
  • Issued: 09/03/2013
  • Est. Priority Date: 04/22/2009
  • Status: Active Grant
First Claim

1. An electronic document analysis method receiving N electronic documents pertaining to a case encompassing a set of issues including at least one issue and establishing relevance of at least the N electronic documents to at least one individual issue in the set of issues, the method comprising, for at least one individual issue from among said set of issues:

    receiving an output of a categorization process applied to each document in training and control subsets of said at least N electronic documents, said output including, for each document in said subsets, one of a relevant-to-said-individual issue indication and a non-relevant-to-said-individual issue indication;

    building a text classifier simulating said categorization process using said output for all documents in said training subset of documents;

    evaluating said text classifier's quality using said output for all documents in said control subset;

    running the text classifier on the at least N electronic documents thereby to obtain a ranking of the extent of relevance of each of said at least N electronic documents to said individual issue;

    partitioning said at least N electronic documents into uniformly ranked subsets of documents, said uniformly ranked subsets differing in ranking of their member documents by said text classifier and adding more documents from each of said uniformly ranked subsets to said training subset;

    selecting a cut-off point for binarizing said rankings of said documents in said control subset;

    using said cut-off point, computing and storing at least one quality criterion characterizing said binarizing of said rankings of said documents in said control subset, thereby to define a quality of performance indication of a current iteration I;

    displaying a comparison of the quality of performance indication of the current iteration I to quality of performance indications of previous iterations;

    seeking an input as to whether or not to return to said receiving step thereby to initiate a new iteration I+1 which comprises said receiving, building, running, partitioning, selecting, and computing/storing steps and initiating said new iteration I+1 if and only if so indicated by said input; and

    running the text classifier most recently built on at least said N electronic documents thereby to generate a final output and generating a computer display of said final output.
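
The steps recited above can be illustrated with a minimal sketch of one iteration. It is not the patented implementation: the token log-odds scorer is a naive-Bayes-style stand-in for whatever text classifier is actually used, F1 is assumed as the quality criterion, equal-size rank bands are assumed for the "uniformly ranked subsets", and every function name and sample document below is hypothetical.

```python
import math
import random
from collections import Counter


def build_classifier(labeled):
    """Build token weights from (text, is_relevant) pairs in the training subset."""
    rel, non = Counter(), Counter()
    for text, is_relevant in labeled:
        (rel if is_relevant else non).update(text.lower().split())
    vocab = set(rel) | set(non)
    rel_total = sum(rel.values()) + len(vocab)
    non_total = sum(non.values()) + len(vocab)
    # Laplace-smoothed log-odds per token (assumed stand-in classifier).
    return {t: math.log((rel[t] + 1) / rel_total) - math.log((non[t] + 1) / non_total)
            for t in vocab}


def score(weights, text):
    """Relevance ranking of one document: mean token weight, 0.0 if empty."""
    tokens = text.lower().split()
    return sum(weights.get(t, 0.0) for t in tokens) / len(tokens) if tokens else 0.0


def partition_by_rank(weights, docs, bands):
    """Split documents into contiguous bands of similar rank, highest first."""
    ranked = sorted(docs, key=lambda d: score(weights, d), reverse=True)
    size = math.ceil(len(ranked) / bands)
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]


def sample_from_bands(band_list, per_band, rng):
    """Pick documents from every band to send for expert categorization."""
    return [d for band in band_list for d in rng.sample(band, min(per_band, len(band)))]


def quality_at(weights, control, cutoff):
    """Binarize control-subset rankings at cutoff; return (precision, recall, F1)."""
    tp = fp = fn = 0
    for text, is_relevant in control:
        predicted = score(weights, text) >= cutoff
        tp += predicted and is_relevant
        fp += predicted and not is_relevant
        fn += (not predicted) and is_relevant
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def select_cutoff(weights, control):
    """Choose the control-subset score that maximizes F1 when used as cut-off."""
    candidates = sorted({score(weights, text) for text, _ in control})
    return max(candidates, key=lambda c: quality_at(weights, control, c)[2])


# Hypothetical data for one iteration I.
training = [("contract breach damages", True), ("breach of contract claim", True),
            ("lunch menu friday", False), ("office party friday", False)]
control = [("contract damages claim", True), ("friday lunch party", False)]
corpus = ["contract claim", "friday menu", "damages owed", "office lunch"]

weights = build_classifier(training)
cutoff = select_cutoff(weights, control)
history = [quality_at(weights, control, cutoff)]   # compared across iterations
to_review = sample_from_bands(partition_by_rank(weights, corpus, 2), 1, random.Random(0))
```

In a full loop, the documents in to_review would be categorized by the expert, appended to the training subset, and the classifier rebuilt; the history list holds one quality tuple per iteration so the current iteration can be compared with earlier ones before deciding whether to continue.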
