Classifying documents based on automatically detected rules
Abstract
Systems and methods for classifying a set of documents are provided. In some aspects, a method includes receiving a subset of the set of documents. The method also includes automatically determining at least one classification rule for the subset of documents based on the documents in the subset. At least a true-positive threshold proportion of documents in the subset of documents follows the at least one classification rule. At most a false-positive threshold proportion of documents in the set of documents and not in the subset of documents follow the at least one classification rule. The method also includes storing the at least one classification rule in association with the subset of documents.
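The two threshold conditions in the abstract can be sketched as a simple acceptance test for a candidate rule. This is only an illustration: the document representation (a dict of string fields), the function names, and the threshold values are assumptions, not anything the source specifies.

```python
from typing import Callable, Dict, Iterable

# Hypothetical representation: a document is a mapping of field name -> value,
# and a rule is any predicate over a document.
Document = Dict[str, str]
Rule = Callable[[Document], bool]


def proportion_following(rule: Rule, docs: Iterable[Document]) -> float:
    """Fraction of docs for which the rule holds (0.0 for an empty collection)."""
    docs = list(docs)
    if not docs:
        return 0.0
    return sum(1 for doc in docs if rule(doc)) / len(docs)


def meets_thresholds(rule: Rule, subset, others,
                     tp_threshold: float, fp_threshold: float) -> bool:
    """True if enough documents in the subset follow the rule and
    few enough documents outside the subset follow it."""
    return (proportion_following(rule, subset) >= tp_threshold
            and proportion_following(rule, others) <= fp_threshold)


# Illustrative data: 9 of 10 subset documents and 1 of 20 other documents
# come from the same sender.
subset = [{"sender": "news@example.com"}] * 9 + [{"sender": "friend@example.org"}]
others = [{"sender": "friend@example.org"}] * 19 + [{"sender": "news@example.com"}]


def from_news(doc: Document) -> bool:
    return doc["sender"] == "news@example.com"


def matches_everything(doc: Document) -> bool:
    return True


accepted = meets_thresholds(from_news, subset, others, 0.8, 0.1)
rejected = meets_thresholds(matches_everything, subset, others, 0.8, 0.1)
```

Here the sender rule passes both conditions, while a rule that matches every document satisfies the true-positive condition but fails the false-positive one.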
22 Claims
1. A computer-implemented method for classifying a set of documents, the method comprising:
receiving an identified subset of documents within a set of documents;
automatically creating at least one classification rule for the subset of documents based on the documents in the identified subset, wherein at least a true-positive threshold proportion of documents in the subset of documents follows the at least one classification rule, further wherein at most a false-positive threshold proportion of documents in the set of documents and outside the identified subset of documents follow the at least one classification rule;
storing the at least one classification rule in association with the identified subset of documents;
receiving an input indicating removing a target document from the identified subset of documents; and
modifying the at least one classification rule based on the input indicating removing the target document from the identified subset of documents.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
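The steps of claim 1 (create rules, store them with the subset, then modify them when a document is removed) could look roughly like the following. Everything here is an assumption made for illustration: rules are restricted to simple (field, value) matches, the `RuleStore` class and `create_rules` function are hypothetical names, the threshold defaults are arbitrary, and "modifying" is implemented as recomputing the rules from scratch, which is only one possible reading.

```python
from collections import Counter


def create_rules(subset, others, tp_threshold=0.8, fp_threshold=0.1):
    """Return the (field, value) pairs that satisfy both thresholds."""
    if not subset:
        return []
    counts = Counter(pair for doc in subset for pair in doc.items())
    rules = []
    for (field, value), hits in counts.items():
        tp = hits / len(subset)
        fp_hits = sum(1 for doc in others if doc.get(field) == value)
        fp = fp_hits / len(others) if others else 0.0
        if tp >= tp_threshold and fp <= fp_threshold:
            rules.append((field, value))
    return sorted(rules)


class RuleStore:
    """Hypothetical store keeping rules in association with labeled subsets."""

    def __init__(self, documents):
        self.documents = dict(documents)  # doc id -> {field: value}
        self.members = {}                 # subset label -> set of doc ids
        self.rules = {}                   # subset label -> list of rules

    def _recompute(self, label):
        ids = self.members[label]
        subset = [self.documents[i] for i in ids]
        others = [d for i, d in self.documents.items() if i not in ids]
        self.rules[label] = create_rules(subset, others)

    def identify_subset(self, label, doc_ids):
        # Receive the identified subset, then create and store its rules.
        self.members[label] = set(doc_ids)
        self._recompute(label)

    def remove_document(self, label, doc_id):
        # The removed document now counts as outside the subset.
        self.members[label].discard(doc_id)
        self._recompute(label)


documents = {
    1: {"sender": "billing@example.com", "has_attachment": "yes"},
    2: {"sender": "billing@example.com", "has_attachment": "yes"},
    3: {"sender": "friend@example.org", "has_attachment": "yes"},
    4: {"sender": "friend@example.org", "has_attachment": "no"},
    5: {"sender": "shop@example.net", "has_attachment": "no"},
}

store = RuleStore(documents)
store.identify_subset("invoices", [1, 2, 3])
rules_before = list(store.rules["invoices"])  # [("has_attachment", "yes")]

store.remove_document("invoices", 3)
rules_after = list(store.rules["invoices"])   # [("sender", "billing@example.com")]
```

In this toy data, removing document 3 turns the attachment rule into a false-positive risk (document 3 now matches it from outside the subset), so the sender rule replaces it.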
13. A non-transitory computer-readable medium for classifying a set of documents, the computer-readable medium comprising instructions that, when executed by a computer, cause the computer to:
receive a plurality of identified subsets of documents within a set of documents;
automatically create at least one classification rule corresponding to each identified subset in at least a portion of the plurality of identified subsets of documents, the at least one classification rule being based on the documents in the corresponding identified subset, wherein at least one document in each identified subset in the at least a portion of the plurality of identified subsets follows the at least one corresponding classification rule;
store the at least one corresponding classification rule in association with each identified subset of documents in the at least the portion of the plurality of identified subsets of documents;
receive an input indicating removing a target document from the identified subset of documents; and
modify the at least one classification rule based on the input indicating removing the target document from the identified subset of documents.
Dependent claims: 14, 15, 16, 17, 18, 19
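Claim 13 differs from claim 1 in that it handles several identified subsets at once and only requires rules for "at least a portion" of them. A rough sketch of that aspect follows. The selection heuristic is an assumption, not taken from the claim: it picks the most common (field, value) pair that appears in no other identified subset. The function name and sample data are likewise hypothetical.

```python
from collections import Counter


def rules_for_subsets(subsets):
    """Map each subset label to one (field, value) rule, where one can be found.

    Labels for which no distinguishing pair exists are left out of the result,
    so rules may cover only a portion of the identified subsets.
    """
    rules = {}
    for label, docs in subsets.items():
        # Assumption: pairs already seen in other identified subsets are unusable.
        taken = {
            pair
            for other_label, other_docs in subsets.items()
            if other_label != label
            for doc in other_docs
            for pair in doc.items()
        }
        counts = Counter(
            pair for doc in docs for pair in doc.items() if pair not in taken
        )
        if counts:
            # Highest count first; ties broken alphabetically for determinism.
            rules[label] = min(counts, key=lambda pair: (-counts[pair], pair))
    return rules


subsets = {
    "receipts": [
        {"sender": "shop@example.net", "lang": "en"},
        {"sender": "shop@example.net", "lang": "de"},
    ],
    "travel": [{"sender": "air@example.org", "lang": "en"}],
    "misc": [{"lang": "en"}],
}

stored_rules = rules_for_subsets(subsets)
```

Every rule returned is followed by at least one document in its own subset, because it is drawn from that subset's documents. The "misc" subset gets no rule here, since its only pair also occurs in other subsets.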
20. A system for classifying a set of electronic messages, the system comprising:
one or more processors; and
a memory comprising instructions that, when executed by the one or more processors, cause the one or more processors to:
receive an identified subset of electronic messages within a set of electronic messages;
automatically create at least one classification rule for the subset of electronic messages based on the electronic messages in the identified subset, wherein at least a true-positive threshold proportion of electronic messages in the identified subset of electronic messages follows the at least one classification rule, further wherein at most a false-positive threshold proportion of electronic messages in the set of electronic messages and outside the identified subset of electronic messages follow the at least one classification rule;
storing the at least one classification rule in association with the identified subset of electronic messages;
receive an input indicating removing a target document from the identified subset of documents; and
modify the at least one classification rule based on the input indicating removing the target document from the identified subset of documents.
Dependent claims: 21, 22
Specification