Systems and methods for conducting and terminating a technology-assisted review
First Claim
1. A system for terminating a classification process, the system comprising:
- at least one computing device having a processor and physical memory, the physical memory storing instructions that cause the processor to:
execute the classification process, wherein the classification process utilizes an iterative search strategy to classify documents in a document collection, wherein the documents in the document collection are stored on a non-transitory storage medium;
determine an upper bound for an expected review effort using an estimate of a number of relevant documents identified as part of the classification process;
select a gain curve slope ratio threshold;
compute points on a gain curve using a selected set of documents in the document collection and results from the classification process;
detect an inflection point in the gain curve;
determine a slope ratio for the detected inflection point using a slope of the gain curve before the detected inflection point, and a slope of the gain curve after the detected inflection point; and
terminate the classification process based upon a determination that the slope ratio for the detected inflection point exceeds the selected slope ratio threshold, and based upon a determination that the upper bound for the expected review effort has been exceeded.
Abstract
Systems and methods are provided for classifying electronic information and terminating a classification process which utilizes Technology-Assisted Review (“TAR”) techniques. In certain embodiments, the TAR process, which is an iterative process, is terminated based upon one or more stopping criteria. In certain embodiments, use of the stopping criteria ensures that the TAR process will reliably achieve a level of quality (e.g., recall) with a certain probability. In certain embodiments, the TAR process is terminated when it independently identifies a target set of documents. In certain embodiments, the TAR process is terminated based upon whether the ratio of the slope of the TAR process's gain curve before an inflection point to the slope of the TAR process's gain curve after the inflection point exceeds a threshold. In certain embodiments, the TAR process is terminated when a review budget and slope ratio of the gain curve each exceed a respective threshold.
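The slope-ratio and review-budget stopping test summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that the gain curve is the cumulative count of relevant documents by review rank, that the inflection point is the point farthest above the chord from the origin to the latest point, and that the post-inflection slope is smoothed by adding 1 to its numerator. All function names, and the `ratio_threshold` and `effort_bound` parameters, are hypothetical. How the effort bound is derived from an estimate of the number of relevant documents is left to the caller.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]  # (documents reviewed, relevant documents found)


def gain_curve(labels: List[int]) -> List[Point]:
    """Cumulative relevant count after each reviewed document (1 = relevant)."""
    points, found = [], 0
    for rank, label in enumerate(labels, start=1):
        found += label
        points.append((rank, found))
    return points


def detect_knee(points: List[Point]) -> Optional[int]:
    """Index of the point farthest above the chord from (0, 0) to the last point.

    Assumed inflection heuristic; returns None when no point lies above the chord.
    """
    last_x, last_y = points[-1]
    best_idx, best_gap = None, 0.0
    for idx, (x, y) in enumerate(points[:-1]):
        gap = y - last_y * x / last_x  # vertical distance above the chord
        if gap > best_gap:
            best_idx, best_gap = idx, gap
    return best_idx


def slope_ratio(points: List[Point], knee_idx: int) -> float:
    """Slope before the knee divided by the (smoothed) slope after it."""
    knee_x, knee_y = points[knee_idx]
    last_x, last_y = points[-1]
    before = knee_y / knee_x
    # The +1 is an assumed smoothing term that keeps the denominator nonzero.
    after = (last_y - knee_y + 1) / (last_x - knee_x)
    return before / after


def should_stop(labels: List[int], ratio_threshold: float, effort_bound: int) -> bool:
    """Stop only when both the slope ratio and the review effort exceed their limits."""
    if not labels:
        return False
    points = gain_curve(labels)
    knee = detect_knee(points)
    if knee is None:
        return False
    ratio_exceeded = slope_ratio(points, knee) > ratio_threshold
    effort_exceeded = len(labels) > effort_bound
    return ratio_exceeded and effort_exceeded
```

For example, a review that finds 10 relevant documents in a row followed by 40 non-relevant ones has its knee at rank 10, so `should_stop` returns true for a ratio threshold of 6 once more than `effort_bound` documents have been reviewed.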
163 Citations
14 Claims
1. A system for terminating a classification process, the system comprising:
- at least one computing device having a processor and physical memory, the physical memory storing instructions that cause the processor to:
execute the classification process, wherein the classification process utilizes an iterative search strategy to classify documents in a document collection, wherein the documents in the document collection are stored on a non-transitory storage medium;
determine an upper bound for an expected review effort using an estimate of a number of relevant documents identified as part of the classification process;
select a gain curve slope ratio threshold;
compute points on a gain curve using a selected set of documents in the document collection and results from the classification process;
detect an inflection point in the gain curve;
determine a slope ratio for the detected inflection point using a slope of the gain curve before the detected inflection point, and a slope of the gain curve after the detected inflection point; and
terminate the classification process based upon a determination that the slope ratio for the detected inflection point exceeds the selected slope ratio threshold, and based upon a determination that the upper bound for the expected review effort has been exceeded.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computerized method for terminating a classification process, the method comprising:
executing the classification process, wherein the classification process utilizes an iterative search strategy to classify documents in a document collection, wherein the documents in the document collection are stored on a non-transitory storage medium;
determining an upper bound for an expected review effort using an estimate of a number of relevant documents identified as part of the classification process;
selecting a gain curve slope ratio threshold;
computing points on a gain curve using a selected set of documents in the document collection and results from the classification process;
detecting an inflection point in the gain curve;
determining a slope ratio for the detected inflection point using a slope of the gain curve before the detected inflection point, and a slope of the gain curve after the detected inflection point; and
terminating the classification process based upon a determination that the slope ratio for the detected inflection point exceeds the selected slope ratio threshold, and based upon a determination that the upper bound for the expected review effort has been exceeded.
Dependent claims: 9, 10, 11, 12, 13, 14
Specification