
System for enhancing expert-based computerized analysis of a set of digital documents and methods useful in conjunction therewith

  • US 8,533,194 B1
  • Filed: 06/15/2011
  • Issued: 09/10/2013
  • Est. Priority Date: 04/22/2009
  • Status: Active Grant
First Claim

1. An electronic document analysis method using a processor for analyzing N electronic documents, the method comprising:

    i. providing an initial set of control electronic documents, randomly selected from the N documents, to initially use as a current set of control electronic documents;

    ii. for at least one current set of control electronic documents, performing a current iteration comprising the following steps a-c;

    a. running a computerized text-classifier on the current set of control electronic documents;

    b. using the current set of control electronic documents and the processor to evaluate at least one aspect of said running of said computerized text-classifier on said current set of control electronic documents; and

    c. using the processor for computing a current estimated validation level based on said at least one aspect and on at least one estimated validation level computed in an iteration, if any, previous to said current iteration, and

    iii. if the current estimated validation level is below a desired validation level, repeating said performing of a current iteration, using a set of training electronic documents larger than that used previously to generate a classifier to be run on a current set of control electronic documents,

    wherein said at least one aspect comprises an F-measure, and

    if the current estimated validation level is below a desired validation level, said performing of a current iteration is repeated using a set of control electronic documents larger than that used previously and wherein said using a set of control electronic documents larger than that used previously comprises;

    estimating a number of electronic documents to be added to the initial set of control documents if the desired validation level is to be achieved; and

    randomly selecting said number of electronic documents from among the N electronic documents.
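
For orientation only, the iteration recited in claim 1 might be sketched as follows. This is an illustrative reading, not the patented implementation. The claim does not say how the estimated validation level is combined with earlier estimates, how much the training set grows, or how the number of additional control documents is estimated, so the averaging rule, the doubling of the training set, and the shortfall-based estimate below are all assumptions. The names `run_validation`, `classify`, `is_relevant`, and `toy_classify` are hypothetical, and the balanced F1 score stands in for the claim's unspecified "F-measure".

```python
import math
import random
from typing import Callable, Optional, Set, Tuple


def f_measure(predicted: Set[int], relevant: Set[int], beta: float = 1.0) -> float:
    """F-measure of the predicted-relevant set against the expert-labelled set."""
    true_positives = len(predicted & relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(relevant)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def run_validation(
    n_docs: int,
    classify: Callable[[int, Set[int]], Set[int]],  # hypothetical callback
    is_relevant: Callable[[int], bool],             # hypothetical expert labels
    desired_level: float,
    control_size: int = 50,
    training_size: int = 20,
    max_iterations: int = 10,
    seed: int = 0,
) -> Tuple[float, int, int]:
    rng = random.Random(seed)
    all_ids = list(range(n_docs))
    # Step i: initial control set, randomly selected from the N documents.
    control = set(rng.sample(all_ids, control_size))
    previous: Optional[float] = None
    level = 0.0
    for _ in range(max_iterations):
        # Step a: run the classifier on the current control set.
        predicted = classify(training_size, control)
        # Step b: evaluate the run with an F-measure.
        relevant = {d for d in control if is_relevant(d)}
        score = f_measure(predicted, relevant)
        # Step c: combine with the previous estimate, if any.
        # ASSUMPTION: equal-weight average; the claim gives no formula.
        level = score if previous is None else 0.5 * score + 0.5 * previous
        previous = level
        if level >= desired_level:
            break
        # Step iii: larger training set (ASSUMPTION: doubling).
        training_size *= 2
        # Estimate how many control documents to add (ASSUMPTION:
        # proportional to the relative shortfall), then sample them randomly.
        shortfall = (desired_level - level) / desired_level
        extra = max(1, math.ceil(len(control) * shortfall))
        pool = [d for d in all_ids if d not in control]
        control |= set(rng.sample(pool, min(extra, len(pool))))
    return level, len(control), training_size


# Toy usage: even-numbered documents are relevant, and the stand-in
# classifier only becomes accurate once it has 80 training documents.
def is_relevant(doc_id: int) -> bool:
    return doc_id % 2 == 0


def toy_classify(training_size: int, control: Set[int]) -> Set[int]:
    if training_size >= 80:
        return {d for d in control if is_relevant(d)}
    return set(control)  # under-trained: marks everything relevant


level, n_control, n_training = run_validation(
    1000, toy_classify, is_relevant, desired_level=0.9
)
```

In this toy run the loop keeps enlarging both the training set and the control set until the smoothed estimate reaches the desired level. A real system would replace the two callbacks with an actual text classifier and expert review decisions.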
