System and method for high precision and high recall relevancy searching
First Claim
1. A computer-implemented method comprising:
generating, by a processing device, a filter for identifying a relevant document based on an initial relevance rule related to a set of documents;
applying, by the processing device, the filter to the set of documents thereby identifying a subset of relevant documents;
receiving, by the processing device from an assessor, the subset of relevant documents comprising an identification of key information;
generating, by the processing device, an updated relevance rule based on the key information and the initial relevance rule, wherein the generating comprises identifying, by the computer, a conflict between the key information and the initial relevance rule, wherein one of the key information and the initial relevance rule identifies at least one document within the set of documents to be relevant and the other of the key information and the initial relevance rule identifies the at least one document within the set of documents to be non-relevant;
generating, by the processing device, a query based on the updated relevance rule for identifying the relevant documents within the set of documents; and
outputting, by the processing device, the set of documents within which the relevant documents have been identified.
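The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Rule` and `Judgment` classes, the keyword-matching filter, the Boolean query string format, and the sample documents are all hypothetical, chosen only to show how a conflict between an assessor's key information and an initial rule might drive an updated rule.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Hypothetical relevance rule: include terms and exclude terms."""
    include: set = field(default_factory=set)
    exclude: set = field(default_factory=set)

    def matches(self, text: str) -> bool:
        # Relevant if any include term appears and no exclude term appears.
        words = set(text.lower().split())
        return bool(words & self.include) and not (words & self.exclude)


@dataclass
class Judgment:
    """Hypothetical 'key information' supplied by the assessor for one document."""
    relevant: bool
    key_terms: set


def apply_filter(rule, docs):
    """Apply the rule as a filter and return the ids of documents it marks relevant."""
    return {doc_id for doc_id, text in docs.items() if rule.matches(text)}


def find_conflicts(rule, docs, judgments):
    """Ids of documents where the rule and the assessor disagree on relevance."""
    return [doc_id for doc_id, j in judgments.items()
            if rule.matches(docs[doc_id]) != j.relevant]


def update_rule(rule, docs, judgments):
    """Build an updated rule from the initial rule plus the conflicting judgments."""
    updated = Rule(set(rule.include), set(rule.exclude))
    for doc_id in find_conflicts(rule, docs, judgments):
        j = judgments[doc_id]
        if j.relevant:
            updated.include |= j.key_terms   # rule missed a relevant document
        else:
            updated.exclude |= j.key_terms   # rule admitted a non-relevant document
    return updated


def to_query(rule):
    """Render the rule as a simple Boolean query string (assumed format)."""
    inc = " OR ".join(sorted(rule.include))
    exc = " OR ".join(sorted(rule.exclude))
    return f"({inc})" + (f" AND NOT ({exc})" if exc else "")


docs = {
    "d1": "merger agreement draft with pricing terms",
    "d2": "lunch menu pricing for the cafeteria",
    "d3": "board minutes about the acquisition",
}
initial = Rule(include={"pricing"})
subset = apply_filter(initial, docs)            # subset sent to the assessor
judgments = {
    "d1": Judgment(True, {"merger"}),
    "d2": Judgment(False, {"cafeteria"}),       # conflicts with the initial rule
}
updated = update_rule(initial, docs, judgments)
query = to_query(updated)
# Output: every document in the set, flagged relevant or not by the updated rule.
results = {doc_id: updated.matches(text) for doc_id, text in docs.items()}
```

In this sketch the initial rule treats `d2` as relevant while the assessor marks it non-relevant, so the assessor's key term is added as an exclusion and the regenerated query no longer retrieves that document.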
Abstract
A method and system for performing high precision and high recall relevancy searching is provided. According to embodiments of the present invention, a relevance rule is generated based on a user model and language from within one or more relevant and non-relevant documents. A query is created based on the relevance rule wherein the query may be applied to a corpus to identify relevant and non-relevant documents. The relevance rule may be iteratively refined in order to increase the accuracy of the query. The resulting query may be used by a litigator during the discovery phase of a litigation to respond to a request for production.
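As a brief illustration of the two measures named in the title, the following Python sketch computes precision and recall for a query's results against a set of documents judged relevant. The function name and the sample document ids are assumptions for illustration only; the patent text shown here does not specify how these measures are computed.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved ids vs. truly relevant ids."""
    true_positives = len(retrieved & relevant)
    # Precision: fraction of retrieved documents that are relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant documents that were retrieved.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


retrieved = {"d1", "d2", "d4"}          # documents returned by a query
relevant = {"d1", "d3", "d4", "d5"}     # documents an assessor judged relevant
p, r = precision_recall(retrieved, relevant)
```

Here two of the three retrieved documents are relevant (precision 2/3) and two of the four relevant documents were retrieved (recall 1/2). Iteratively refining a relevance rule, as the abstract describes, aims to raise both figures together.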
20 Claims
1. A computer-implemented method comprising:
generating, by a processing device, a filter for identifying a relevant document based on an initial relevance rule related to a set of documents;
applying, by the processing device, the filter to the set of documents thereby identifying a subset of relevant documents;
receiving, by the processing device from an assessor, the subset of relevant documents comprising an identification of key information;
generating, by the processing device, an updated relevance rule based on the key information and the initial relevance rule, wherein the generating comprises identifying, by the computer, a conflict between the key information and the initial relevance rule, wherein one of the key information and the initial relevance rule identifies at least one document within the set of documents to be relevant and the other of the key information and the initial relevance rule identifies the at least one document within the set of documents to be non-relevant;
generating, by the processing device, a query based on the updated relevance rule for identifying the relevant documents within the set of documents; and
outputting, by the processing device, the set of documents within which the relevant documents have been identified.
Dependent claims: 2, 3, 4, 5, 6, 13, 14, 15, 16
7. A system comprising:
a processing device;
a database coupled to the processing device;
an assessment module coupled to the processing device and the database, wherein the assessment module configured to;
generate a filter for identifying a relevant document based on an initial relevance rule related to a set of documents,
apply the filter to the set of documents thereby identifying a subset of relevant documents,
receive from an assessor, the subset of relevant documents comprising an identification of key information,
generate an updated relevance rule based on the key information and the initial relevance rule, wherein the generate comprising identify a conflict between the key information and the initial relevance rule, wherein one of the key information and the initial relevance rule identifies at least one document within the set of documents to be relevant and the other of the key information and the initial relevance rule identifies the at least one document within the set of documents to be non-relevant;
generate a query based on the updated relevance rule for identifying relevant documents within the set of documents, and
a classification module coupled to the processing device and the database, the classification module configured to output the set of documents wherein the relevant documents have been identified.
Dependent claims: 8, 9, 10, 11, 12, 17, 18, 19, 20
Specification