Method of tuning a computer system
First Claim
1. A method comprising:
obtaining a target of accuracy of a computer system configured to classify documents or to locate texts satisfying a criterion in documents;
determining classification of a set of documents or location of the texts satisfying the criterion in the set of documents thereby labeling the set of documents, using the computer system, wherein the set of documents are previously unlabeled;
performing validation of the classification or the location of the texts, in which the classification or the location is either upheld or overturned; and
tuning the accuracy of the computer system by adjusting an amount of the validation based on the target;
wherein adjusting the amount of the validation comprises:
for each of the set of documents and each of a plurality of hypotheses on a largest value and a smallest value of scores for the set of documents, the scores respectively representing probabilities of the classification or location of the texts of the set of documents, generating a hypothesized score as a function of the largest value and the smallest value;
initializing credibilities respectively for the hypotheses as an equal value, the credibilities respectively representing probabilities of the hypotheses correctly estimating their respective largest value and smallest value; and
adjusting the credibilities using Bayesian inference, based on the hypothesized scores for the set of documents and the hypotheses and results of the validation.
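The credibility update recited in the last three elements can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that each hypothesis is a (smallest, largest) pair, that the hypothesized score is a linear rescaling of a raw score normalized to [0, 1], and that the likelihood of an upheld validation equals the hypothesized score. The claim requires only "a function of the largest value and the smallest value", so the linear form and the names `hypothesized_score` and `update_credibilities` are hypothetical.

```python
def hypothesized_score(raw, smallest, largest):
    """Map a raw score in [0, 1] onto [smallest, largest].

    Assumption: a linear rescaling; the claim does not fix the function.
    """
    return smallest + (largest - smallest) * raw


def update_credibilities(raw_scores, outcomes, hypotheses):
    """Return one posterior credibility per (smallest, largest) hypothesis.

    raw_scores: normalized scores in [0, 1] of the validated documents
    outcomes:   True if the label was upheld, False if it was overturned
    hypotheses: list of (smallest, largest) pairs
    """
    # Every hypothesis starts with the same credibility.
    cred = [1.0 / len(hypotheses)] * len(hypotheses)
    for raw, upheld in zip(raw_scores, outcomes):
        for k, (smallest, largest) in enumerate(hypotheses):
            p = hypothesized_score(raw, smallest, largest)
            # Bayes' rule: multiply the prior by the likelihood of the outcome.
            cred[k] *= p if upheld else 1.0 - p
        total = sum(cred)
        if total == 0.0:
            raise ValueError("no hypothesis is compatible with the outcomes")
        # Renormalize so the credibilities remain a probability distribution.
        cred = [c / total for c in cred]
    return cred


hypotheses = [(0.5, 0.9), (0.1, 0.5)]
posterior = update_credibilities([1.0, 1.0, 0.0], [True, True, True], hypotheses)
```

In this example all three validated labels are upheld, so credibility shifts toward the hypothesis with the higher score range, `(0.5, 0.9)`.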
Abstract
Disclosed herein is a method for tuning a computer system suitable for classifying documents or locating texts satisfying a criterion in documents. Examples of the computer system may be used for electronic discovery or for exercising due diligence in a transaction. The method includes obtaining a target of accuracy of the computer system and tuning the accuracy of the computer system by adjusting a characteristic of the computer system based on the target.
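One way to picture tuning accuracy against a target is the following Python sketch, in which the adjusted characteristic is the amount of validation. It is an illustration under stated assumptions, not the disclosed procedure. It assumes that each score is the probability that a document's label is correct, that a validated label is correct afterwards, and that the least confident documents are validated first. The function name `documents_to_validate` is hypothetical.

```python
def documents_to_validate(scores, target):
    """Choose documents to validate until expected accuracy meets the target.

    scores: estimated probability that each document's label is correct
    target: required expected accuracy, between 0 and 1
    Returns the indices of the documents to send for validation.
    """
    n = len(scores)
    if n == 0:
        return []
    # Least confident documents first: validating them gains the most.
    order = sorted(range(n), key=lambda i: scores[i])
    expected_correct = sum(scores)
    chosen = []
    for i in order:
        if expected_correct / n >= target:
            break
        # Assumption: after validation the label is certainly correct.
        expected_correct += 1.0 - scores[i]
        chosen.append(i)
    return chosen


to_review = documents_to_validate([0.9, 0.6, 0.8, 0.5], target=0.85)
```

Raising the target lengthens the returned list, and lowering it shortens the list, so the amount of validation follows the accuracy target.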
26 Claims
1. A method comprising:
obtaining a target of accuracy of a computer system configured to classify documents or to locate texts satisfying a criterion in documents;
determining classification of a set of documents or location of the texts satisfying the criterion in the set of documents thereby labeling the set of documents, using the computer system, wherein the set of documents are previously unlabeled;
performing validation of the classification or the location of the texts, in which the classification or the location is either upheld or overturned; and
tuning the accuracy of the computer system by adjusting an amount of the validation based on the target;
wherein adjusting the amount of the validation comprises:
for each of the set of documents and each of a plurality of hypotheses on a largest value and a smallest value of scores for the set of documents, the scores respectively representing probabilities of the classification or location of the texts of the set of documents, generating a hypothesized score as a function of the largest value and the smallest value;
initializing credibilities respectively for the hypotheses as an equal value, the credibilities respectively representing probabilities of the hypotheses correctly estimating their respective largest value and smallest value; and
adjusting the credibilities using Bayesian inference, based on the hypothesized scores for the set of documents and the hypotheses and results of the validation.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A computer program product comprising a non-transitory computer readable medium having instructions recorded thereon, the instructions when executed by a computer implementing a method comprising:
obtaining a target of accuracy of a computer system configured to classify documents or to locate texts satisfying a criterion in documents;
determining classification of a set of documents or location of the texts satisfying the criterion in the set of documents thereby labeling the set of documents, using the computer system, wherein the set of documents are previously unlabeled;
performing validation of the classification or the location of the texts, in which the classification or the location is either upheld or overturned; and
tuning the accuracy of the computer system by adjusting an amount of the validation based on the target;
wherein adjusting the amount of the validation comprises:
for each of the set of documents and each of a plurality of hypotheses on a largest value and a smallest value of scores for the set of documents, the scores respectively representing probabilities of the classification or location of the texts of the set of documents, generating a hypothesized score as a function of the largest value and the smallest value;
initializing credibilities respectively for the hypotheses as an equal value, the credibilities respectively representing probabilities of the hypotheses correctly estimating their respective largest value and smallest value; and
adjusting the credibilities using Bayesian inference, based on the hypothesized scores for the set of documents and the hypotheses and results of the validation.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26
Specification