System and method for generating a reference set for use during document review
First Claim
1. A method for generating a reference set for use during document review, using one or more processors, comprising:
obtaining a collection of unclassified documents;
identifying one or more features within each of the unclassified documents;
generating clusters of the features and selecting at least one feature from one or more of the clusters as reference set candidates;
assigning a classification code to each reference set candidate; and
refining the reference set candidates, comprising:
grouping the reference set candidates into further clusters;
selecting at least one of the reference set candidates in one or more of the further clusters as the further reference set candidates;
assigning a classification code to the further reference set candidates; and
forming a reference set from the unclassified documents associated with the further classified reference set candidates; and
wherein all the steps are performed by a suitably programmed computer.
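The steps of claim 1 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: features are treated as lowercase words, clusters are formed by a greedy pass over Jaccard similarity of the document sets in which features occur, the first member of each cluster is taken as its candidate, and `classify` is a hypothetical callback standing in for whatever reviewer or classifier assigns the classification code.

```python
from collections import defaultdict


def identify_features(documents):
    """Map each feature (assumed here: a lowercase word) to the ids of documents containing it."""
    occurrences = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            occurrences[token].add(doc_id)
    return dict(occurrences)


def jaccard(a, b):
    """Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def cluster(features, occurrences, threshold):
    """Greedy clustering: a feature joins the first cluster whose seed is similar enough."""
    clusters = []
    for feature in sorted(features):
        for members in clusters:
            if jaccard(occurrences[feature], occurrences[members[0]]) >= threshold:
                members.append(feature)
                break
        else:
            clusters.append([feature])
    return clusters


def select_candidates(clusters):
    """Select one feature per cluster (assumed here: the cluster seed)."""
    return [members[0] for members in clusters]


def generate_reference_set(documents, classify, threshold=0.5, refine_threshold=0.25):
    """Return (first-pass codes, reference set) following the order of the claimed steps."""
    occurrences = identify_features(documents)
    candidates = select_candidates(cluster(occurrences, occurrences, threshold))
    first_pass_codes = {feature: classify(feature) for feature in candidates}

    # Refinement: group the candidates into further clusters, select again, classify again.
    further = select_candidates(cluster(candidates, occurrences, refine_threshold))
    further_codes = {feature: classify(feature) for feature in further}

    # Form the reference set from the documents associated with the coded candidates.
    reference_set = {}
    for feature, code in further_codes.items():
        for doc_id in occurrences[feature]:
            reference_set.setdefault(doc_id, code)
    return first_pass_codes, reference_set


documents = {"d1": "merger contract", "d2": "merger contract", "d3": "lunch menu"}
classify = lambda f: "responsive" if f in {"contract", "merger"} else "non-responsive"
first_pass_codes, reference_set = generate_reference_set(documents, classify)
```

In this toy run the words "contract" and "merger" fall into one cluster and "lunch" and "menu" into another, so the reference set ends up holding all three documents, each carrying the code of the candidate feature it is associated with.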
7 Assignments
0 Petitions
Abstract
A system and method for generating reference sets for use during document review is provided. A collection of unclassified documents is obtained. Selection criteria are applied to the document collection and those unclassified documents that satisfy the selection criteria are selected as reference set candidates. A classification code is assigned to each reference set candidate. A reference set is formed from the classified reference set candidates. The reference set is quality controlled and shared between one or more users.
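The flow described in the abstract can be sketched roughly as follows. The abstract does not say what the selection criteria or the quality control consist of, so both are assumptions here: criteria are modeled as plain predicates over document text, `classify` is a hypothetical callback, and the quality-control step simply drops any candidate whose code is not in a hypothetical list of allowed codes.

```python
# Hypothetical set of valid classification codes, used only for the quality-control check.
ALLOWED_CODES = {"responsive", "non-responsive", "privileged"}


def apply_selection_criteria(documents, criteria):
    """Keep the unclassified documents that satisfy every selection criterion."""
    return {
        doc_id: text
        for doc_id, text in documents.items()
        if all(rule(text) for rule in criteria)
    }


def build_reference_set(documents, criteria, classify):
    """Select candidates, assign a code to each, and keep only validly coded entries."""
    candidates = apply_selection_criteria(documents, criteria)
    coded = {doc_id: classify(text) for doc_id, text in candidates.items()}
    # Assumed quality control: discard entries whose code is not recognised.
    return {doc_id: code for doc_id, code in coded.items() if code in ALLOWED_CODES}


docs = {
    "a": "Merger agreement draft",
    "b": "ok",
    "c": "Privileged attorney memo about merger",
}
criteria = [lambda text: len(text.split()) >= 3]  # assumed criterion: at least three words
classify = lambda text: "privileged" if "attorney" in text.lower() else "responsive"
result = build_reference_set(docs, criteria, classify)
```

Document "b" fails the length criterion and never becomes a candidate, while "a" and "c" are coded and pass the assumed quality check.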
302 Citations
6 Claims
1. A method for generating a reference set for use during document review, using one or more processors, comprising:
obtaining a collection of unclassified documents;
identifying one or more features within each of the unclassified documents;
generating clusters of the features and selecting at least one feature from one or more of the clusters as reference set candidates;
assigning a classification code to each reference set candidate; and
refining the reference set candidates, comprising:
grouping the reference set candidates into further clusters;
selecting at least one of the reference set candidates in one or more of the further clusters as the further reference set candidates;
assigning a classification code to the further reference set candidates; and
forming a reference set from the unclassified documents associated with the further classified reference set candidates; and
wherein all the steps are performed by a suitably programmed computer.
Dependent claims: 2, 3
4. A system for generating a reference set for use during document review, comprising:
a collection of unclassified documents;
a clustering module to identify one or more features within each of the unclassified documents, to generate clusters of the features, and to select at least one feature from one or more of the clusters as reference set candidates;
a classification module to assign a classification code to each reference set candidate;
an identification module to refine the reference set candidates by grouping the reference set candidates into further clusters, selecting at least one of the reference set candidates in one or more of the further clusters, and assigning a classification code to the further reference set candidates;
a data module to form a reference set from the unclassified documents associated with the further classified reference set candidates; and
a processor to execute the modules.
Dependent claims: 5, 6
Specification