QUALITY CONTROL CALCULATOR FOR DOCUMENT REVIEW

US 20150254791A1
Filed: 03/10/2014
Published: 09/10/2015
Est. Priority Date: 03/10/2014
Status: Abandoned Application

Abstract

Described are methods and apparatuses, including computer program products, for automatically managing quality of human document review in a review process. The method includes receiving tagging decisions for multiple documents made by a first reviewer during a first time period and sampling a subset of these documents based on a first confidence level and first confidence interval. The method further includes receiving tagging decisions made by a second reviewer related to the subset of the documents, from which values of multiple quality-control metrics are determined. The method further includes calculating a risk-accuracy value based in part on the values of the quality-control metrics and recommending a second confidence level and a second confidence interval for sampling a second set of documents reviewed by the first reviewer during a second time period.
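
The abstract ties the size of the quality-control sample to a confidence level and a confidence interval but does not state a formula here. As a minimal sketch, assuming the common Cochran sample-size formula with a finite-population correction (an assumption, not taken from the application), the subset size could be computed as follows. The function name `sample_size` and the default proportion `p = 0.5` are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(population: int, confidence_level: float,
                confidence_interval: float, p: float = 0.5) -> int:
    """Number of documents to sample for second-level review.

    Assumption: Cochran's formula with finite-population correction.
    confidence_level is a fraction (0.95 for 95%), confidence_interval is
    the margin of error as a fraction (0.05 for +/-5%).
    """
    # Two-sided z-score for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / (confidence_interval ** 2)
    # Correct for the finite number of documents actually reviewed.
    n = n0 / (1 + (n0 - 1) / population)
    return min(population, math.ceil(n))


if __name__ == "__main__":
    # Documents tagged by the first reviewer during the first time period.
    print(sample_size(1000, 0.95, 0.05))
```

Under this assumption, raising the confidence level or narrowing the confidence interval increases the number of documents routed to the second reviewer.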

30 Claims

1. A computerized method for automatically managing quality of human document review in a review process, the method comprising:
- receiving, by an extraction hardware module of a computing device, tagging decisions for a plurality of documents made by a first reviewer during a first time period;
  
  determining, by a sampling hardware module of the computing device, a subset of the plurality of documents based on a first confidence level and first confidence interval;
  
  receiving, by the sampling hardware module of the computing device, tagging decisions made by a second reviewer related to the subset of the plurality of documents;
  
  determining, by a quality-control review hardware module of the computing device, values of a plurality of quality-control metrics based on the tagging decisions of the first and second reviewers with respect to the subset of the plurality of documents, wherein the values of the plurality of quality-control metrics reflect a level of identity between the first and second reviewers in relation to a plurality of tagging criteria;
  
  displaying, by a graphical user interface (GUI) hardware module of the computing device, a graphical user interface on a display device coupled to the computing device, the graphical user interface comprising a first section having a user input field configured to enable selection of one or more days of the first time period that defines a date range of tagging decisions made by the first reviewer to include in the determining values step, a second section having a plurality of user input fields configured to enable entry of data relating to the tagging decisions made by the second reviewer, and a third section having a visual comparison of the plurality of quality-control metrics between the first and second reviewers in relation to the plurality of tagging criteria;
  
  calculating, by a quality-control calculator hardware module of the computing device, a risk-accuracy value as a weighted combination of a plurality of factors including (1) an accuracy factor determined based on the values of the plurality of quality-control metrics;
  
  (2) a review rate factor indicating the rate of review of the first reviewer during the first time period; and
  
  (3) one or more user-selectable factors reflecting the complexity or difficulty associated with reviewing the plurality of documents; and
  
  recommending, by a recommendation hardware module of the computing device, a second confidence level and a second confidence interval for sampling a second plurality of documents reviewed during a second time period, wherein the second confidence level and the second confidence interval are determined based on the risk-accuracy value.
- - 2. The method of claim 1, wherein the tagging criteria comprise responsiveness, significance, privileged and redaction requirement.
  - 3. The method of claim 1, wherein each tagging decision comprises a decision regarding whether a family of one or more related documents satisfies at least one of the tagging criteria.
  - 4. The method of claim 1, further comprising calculating, by the computing device, values of a plurality of first-level review metrics corresponding to the tagging decisions made by the first reviewer.
  - 5. The method of claim 4, wherein the value of at least one of the first-level review metrics indicates a percentage of the tagging decisions that satisfies a tagging criterion.
  - 6. The method of claim 4, further comprising computing, by the computing device, the value of each of the first-level review metrics as an average over a user-selectable time period.
  - 7. The method of claim 1, wherein the plurality of quality control metrics comprise a recall rate, a precision rate and an F-measure for each of the plurality of tagging criteria.
  - 8. The method of claim 7, further comprising:
    - computing, by the computing device, the recall rate and the precision rate corresponding to each of the plurality of tagging criteria based on a percentage of agreement of tagging decisions between the first and second reviewers with respect to the corresponding tagging criterion; and
      
      computing, by the computing device, the F-measure corresponding to each of the plurality of tagging criteria based on the corresponding recall rate and precision rate.
  - 9. The method of claim 8, wherein the accuracy factor comprises a weighted average of the F-measures for the plurality of tagging criteria.
  - 10. The method of claim 1, wherein the one or more user-selectable factors comprise a difficulty protocol factor, a deadline factor, a sensitivity factor and a type of data factor.
  - 11. The method of claim 1, further comprising receiving, by the computing device, a plurality of weights corresponding to the plurality of factors for customizing the calculation of the risk-accuracy value.
  - 12. The method of claim 1, wherein the second confidence level is inversely related to the risk-accuracy value.
  - 13. The method of claim 12, wherein an increase in the risk-accuracy value is indicative of a decrease in accuracy of the first reviewer, an increase in difficulty or complexity of the plurality of documents reviewed, or an abnormal review rate of the first reviewer.
  - 14. The method of claim 1, wherein the first time period is a current day and the second time period is the following day.
  - 15. The method of claim 1, further comprising calculating, by the computing device, a plurality of cumulative metrics for a duration of the review process, the plurality of cumulative metrics comprising at least one of the total number of documents reviewed, the total number of hours spent by the first reviewer, an average review rate of the first reviewer, a percentage of completion, an overall accuracy value of the first reviewer, an average confidence level, or an average confidence interval.
  - 16. The method of claim 15, further comprising:
    - receiving data related to a second review process similar to the review process, the data including an accuracy threshold to be achieved by the second review process;
      
      gathering a plurality of historical cumulative metrics data, including the plurality of cumulative metrics for the review process and one or more cumulative metrics associated with other review processes similar to the second review process;
      
      determining, based on the historical cumulative metrics data, a cost model illustrating average costs for similar review processes of various durations to achieve the accuracy threshold; and
      
      determining, based on the cost model, an optimal duration for the second review process that minimizes costs while satisfying the accuracy threshold.
  - 17. The method of claim 16, further comprising recommending, based on the optimal duration for the second review process, at least one of a number of first-level reviewers or a number of quality-control reviewers to staff to the second review process to realize the optimal duration.
  - 18. The method of claim 16, further comprising estimating a cost associated with completing the second review process in the optimal duration.
  - 19. The method of claim 16, further comprising determining a degree of similarity between the second review process and the other review processes based on a complexity score for each of the review processes.
  - 20. The method of claim 16, wherein the optimal duration corresponds to a point in the cost model with the lowest average cost.
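
Claims 7 through 9 and 11 through 13 can be illustrated with a short sketch. It assumes the second (quality-control) reviewer's tags serve as the reference when counting agreements, that the review-rate and complexity factors are already normalized to the range 0 to 1, and that the accuracy term enters as `1 - accuracy` so that the risk-accuracy value rises as accuracy falls (claim 13). The names `Counts`, `prf`, `accuracy_factor`, and `risk_accuracy`, along with the default weights, are hypothetical and not taken from the application.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Agreement counts for one tagging criterion on the sampled subset."""
    tp: int  # both reviewers applied the tag
    fp: int  # first reviewer applied the tag, second reviewer did not
    fn: int  # second reviewer applied the tag, first reviewer did not


def prf(c: Counts) -> tuple[float, float, float]:
    """Precision, recall, and F-measure (harmonic mean) for one criterion."""
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def accuracy_factor(counts: dict[str, Counts],
                    weights: dict[str, float]) -> float:
    """Weighted average of per-criterion F-measures (claim 9)."""
    total_weight = sum(weights[name] for name in counts)
    weighted = sum(weights[name] * prf(c)[2] for name, c in counts.items())
    return weighted / total_weight


def risk_accuracy(accuracy: float, rate_risk: float, complexity: float,
                  weights: tuple[float, float, float] = (0.5, 0.2, 0.3)) -> float:
    """Weighted combination of the three factor groups (assumed form)."""
    w_acc, w_rate, w_cplx = weights
    return w_acc * (1 - accuracy) + w_rate * rate_risk + w_cplx * complexity


if __name__ == "__main__":
    counts = {
        "responsiveness": Counts(tp=40, fp=5, fn=5),
        "privileged": Counts(tp=8, fp=2, fn=2),
    }
    acc = accuracy_factor(counts, {"responsiveness": 2.0, "privileged": 1.0})
    print(acc, risk_accuracy(acc, rate_risk=0.1, complexity=0.4))
```

The weights are passed in rather than fixed, which mirrors claim 11, where user-supplied weights customize the calculation.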

21. A computer-implemented system for automatically managing quality of human document review in a review process, the computer-implemented system comprising a plurality of hardware modules each coupled to a processor and a memory of a computing device, the hardware modules including an extraction module, a sampling module, a graphical user interface (GUI) module, a quality-control review module, a quality-control calculator module, and a recommendation module:
- the extraction module comprising registers and instructions for extracting tagging decisions for a plurality of documents made by a first reviewer during a first time period;
  
  the sampling module comprising registers and instructions for (i) determining a subset of the plurality of documents based on a first confidence level and first confidence interval and (ii) receiving tagging decisions made by a second reviewer related to the subset of the plurality of documents;
  
  the quality-control review module comprising registers and instructions for determining values of a plurality of quality-control metrics based on the tagging decisions of the first and second reviewers with respect to the subset of the plurality of documents, wherein the values of the plurality of quality-control metrics reflect levels of identity between the first and second reviewers in relation to a plurality of tagging criteria;
  
  the graphical user interface (GUI) module comprising registers and instructions for displaying a graphical user interface on a display device coupled to the computing device, the graphical user interface comprising a first section having a user input field configured to enable selection of one or more days of the first time period that defines a date range of tagging decisions made by the first reviewer to include in the determining values step, a second section having a plurality of user input fields configured to enable entry of data relating to the tagging decisions made by the second reviewer, and a third section having a visual comparison of the plurality of quality-control metrics between the first and second reviewers in relation to the plurality of tagging criteria;
  
  the quality-control calculator module comprising registers and instructions for calculating a risk-accuracy value as a weighted combination of a plurality of factors including (1) an accuracy factor determined based on the values of the plurality of quality-control metrics;
  
  (2) a review rate factor indicating the rate of review of the first reviewer during the first time period; and
  
  (3) one or more user-selectable factors reflecting the complexity associated with reviewing the plurality of documents; and
  
  a recommendation module comprising registers and instructions for recommending a second confidence level and a second confidence interval for sampling a second plurality of documents reviewed by the first reviewer during a second time period, wherein the second confidence level and the second confidence interval are determined based on the risk-accuracy value.
- - 22. The computer-implemented system of claim 21, wherein the tagging criteria comprise responsiveness, significance, privileged and redaction requirement.
  - 23. The computer-implemented system of claim 21, further comprising a first level review module configured to calculate values of a plurality of first-level review metrics corresponding to the tagging decisions made by the first reviewer.
  - 24. The computer-implemented system of claim 21, wherein the plurality of quality-control metrics comprise a recall rate, a precision rate and an F-measure computed with respect to each of the plurality of tagging criteria.
  - 25. The computer-implemented system of claim 21, wherein the recommendation module is further configured to:
    - receive data related to a second review process similar to the review process, the data including an accuracy threshold to be achieved by the second review process;
      
      determine a plurality of historical cumulative metrics data for the review process and other review processes similar to the second review process;
      
      determine, based on the historical cumulative metrics data, a cost model illustrating average costs for similar review processes of various durations to achieve the accuracy threshold; and
      
      determine, based on the cost model, an optimal duration for the second review process that minimizes costs while satisfying the accuracy threshold.
  - 26. The computer-implemented system of claim 21, wherein the recommendation module is further configured to recommend, based on the optimal duration for the second review process, at least one of a number of first-level reviewers or a number of quality-control reviewers to staff to the second review process to realize the optimal duration.
  - 27. The computer-implemented system of claim 21, wherein the recommendation module is further configured to recommend a cost associated with completing the second review process in the optimal duration.
  - 28. The computer-implemented system of claim 21, wherein the optimal duration corresponds to a point in the cost model with the lowest average cost.
  - 29. The computer-implemented system of claim 21, wherein the recommendation module is further configured to determine a degree of similarity between the second review process and the other review processes based on a complexity score for each of the review processes.
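
The cost-model steps of claims 16, 17, 20, 25, 26, and 28 can be sketched as below. The sketch assumes each historical review process is summarized by a duration in days, a total cost, and an overall accuracy value, and that the cost model is simply the average cost per duration among processes that met the accuracy threshold. The record layout and the names `optimal_duration` and `reviewers_needed` are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean


def optimal_duration(history, accuracy_threshold):
    """Pick the duration with the lowest average cost.

    history: iterable of (duration_days, total_cost, overall_accuracy)
    tuples for similar review processes (assumed record layout).
    Returns (duration_days, average_cost), or None if no process in the
    history satisfied the accuracy threshold.
    """
    costs_by_duration = defaultdict(list)
    for duration, cost, accuracy in history:
        if accuracy >= accuracy_threshold:
            costs_by_duration[duration].append(cost)
    if not costs_by_duration:
        return None
    # The cost model: average cost for each observed duration.
    model = {d: mean(costs) for d, costs in costs_by_duration.items()}
    # Lowest average cost wins; ties go to the shorter duration.
    best = min(model, key=lambda d: (model[d], d))
    return best, model[best]


def reviewers_needed(total_documents, docs_per_reviewer_per_day, duration_days):
    """Staffing needed to finish within the chosen duration (rounded up)."""
    return math.ceil(total_documents / (docs_per_reviewer_per_day * duration_days))


if __name__ == "__main__":
    history = [
        (10, 100.0, 0.95), (10, 120.0, 0.92),
        (20, 90.0, 0.85), (20, 105.0, 0.93),
        (30, 130.0, 0.97),
    ]
    print(optimal_duration(history, accuracy_threshold=0.9))
    print(reviewers_needed(10000, 400, 20))
```

Filtering by the threshold before averaging means a cheap but inaccurate historical process cannot pull the recommendation toward a duration that failed to reach the required accuracy.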

30. A computer program product, tangibly embodied in a non-transitory computer readable medium, for automatically managing quality of human document review in a review process, the computer program product including instructions being configured to cause a plurality of hardware modules each coupled to a processor and a memory of a computing device, the hardware modules including an extraction module, a sampling module, a graphical user interface (GUI) module, a quality-control review module, a quality-control calculator module, and a recommendation module to:
- receive, by the extraction module, tagging decisions for a plurality of documents made by a first reviewer during a first time period;
  
  determine, by the sampling module, a subset of the plurality of documents based on a first confidence level and first confidence interval;
  
  receive, by the sampling module, tagging decisions made by a second reviewer related to the subset of the plurality of documents;
  
  determine, by the quality-control review module, values of a plurality of quality-control metrics based on the tagging decisions of the first and second reviewers with respect to the subset of the plurality of documents, wherein the values of the plurality of quality-control metrics reflect levels of identity between the first and second reviewers in relation to a plurality of tagging criteria;
  
  display, by the graphical user interface (GUI) module, a graphical user interface on a display device coupled to the computing device, the graphical user interface comprising a first section having a user input field configured to enable selection of one or more days of the first time period that defines a date range of tagging decisions made by the first reviewer to include in the determining values step, a second section having a plurality of user input fields configured to enable entry of data relating to the tagging decisions made by the second reviewer, and a third section having a visual comparison of the plurality of quality-control metrics between the first and second reviewers in relation to the plurality of tagging criteria;
  
  calculate, by the quality-control calculator module, a risk-accuracy value as a weighted combination of a plurality of factors including (1) an accuracy factor determined based on the values of the plurality of quality-control metrics;
  
  (2) a review rate factor indicating the rate of review of the first reviewer during the first time period; and
  
  (3) one or more user-selectable factors reflecting the complexity associated with reviewing the plurality of documents; and
  
  recommend, by the recommendation module, a second confidence level and a second confidence interval for sampling a second plurality of documents reviewed by the first reviewer during a second time period, wherein the second confidence level and the second confidence interval are determined based on the risk-accuracy value.
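
The claims state that the second confidence level and confidence interval are determined from the risk-accuracy value, and claim 12 adds that the second confidence level is inversely related to that value, but no concrete mapping is given. One possible shape for the recommendation step is a tier lookup, sketched below. The tier boundaries, the confidence levels, and the confidence intervals in `TIERS` are placeholders chosen only for illustration, and the names `TIERS` and `recommend` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical tiers: (upper bound on risk-accuracy value,
#                      confidence level, confidence interval).
# The confidence level falls as the risk-accuracy value rises, following
# the wording of claim 12; every number here is a placeholder.
TIERS = [
    (0.25, 0.99, 0.02),
    (0.50, 0.95, 0.03),
    (0.75, 0.90, 0.05),
    (1.00, 0.85, 0.05),
]


def recommend(risk_accuracy_value, tiers=TIERS):
    """Return (confidence_level, confidence_interval) for the next period."""
    upper_bounds = [bound for bound, _, _ in tiers]
    if not 0.0 <= risk_accuracy_value <= upper_bounds[-1]:
        raise ValueError("risk-accuracy value outside the configured range")
    # First tier whose upper bound is >= the value.
    _, level, interval = tiers[bisect_left(upper_bounds, risk_accuracy_value)]
    return level, interval


if __name__ == "__main__":
    print(recommend(0.6))
```

Keeping the tiers in a table supplied by the caller leaves the actual mapping open, since the claims fix only its dependence on the risk-accuracy value.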

Current Assignee: FMR LLC
Original Assignee: FMR LLC
Inventors: Stockton, Jamal Odin; Lisi, Michael Perry; Rhodin, Erica Louise
Application Number: US14/202,401
Publication Number: US 20150254791A1
US Class Current: 1/1
CPC Class Codes:
G06Q 10/0635 Risk analysis of enterprise...
G06Q 50/18 Legal services
