Computer-Implemented System and Method For Generating A Reference Set Via Clustering
Abstract
A computer-implemented system and method for generating a reference set via clustering is provided. A collection of unclassified documents is obtained and grouped into clusters. N-documents are selected from each cluster and are combined as reference set candidates. One of the n-documents from each cluster is located closest to a center of that cluster. A classification code is assigned to each of the reference set candidates. Two or more of the reference set candidates are grouped as a reference set of classified documents.
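The steps in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation. It assumes that documents are already represented as numeric vectors, that clustering is done with a plain k-means loop seeded from the first k documents, and that the remaining n-1 documents in each cluster are simply the next nearest to the center. The function names and the classification codes "responsive" and "non-responsive" are hypothetical; the text above only requires that one selected document per cluster is the one closest to the cluster center.

```python
from math import dist
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def centroid(points: List[Vector]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of vectors."""
    return [sum(coords) / len(points) for coords in zip(*points)]


def cluster_documents(vectors: List[Vector], k: int, iterations: int = 10) -> Dict[int, List[int]]:
    """Group document indices into clusters (assumption: plain k-means)."""
    centers = [list(v) for v in vectors[:k]]  # deterministic seed: first k documents
    clusters: Dict[int, List[int]] = {}
    for _ in range(iterations):
        clusters = {c: [] for c in range(k)}
        for i, v in enumerate(vectors):
            nearest = min(range(k), key=lambda c: dist(v, centers[c]))
            clusters[nearest].append(i)
        for c, members in clusters.items():
            if members:  # leave the center of an empty cluster unchanged
                centers[c] = centroid([vectors[i] for i in members])
    return {c: m for c, m in clusters.items() if m}


def select_candidates(vectors: List[Vector], clusters: Dict[int, List[int]], n: int) -> List[int]:
    """Pick n documents per cluster; the first pick is the one closest to the center."""
    candidates: List[int] = []
    for members in clusters.values():
        center = centroid([vectors[i] for i in members])
        ranked = sorted(members, key=lambda i: dist(vectors[i], center))
        candidates.extend(ranked[:n])  # assumption: the rest are the next nearest
    return candidates


def build_reference_set(candidates: List[int], assign_code: Callable[[int], str]) -> Dict[int, str]:
    """Assign a classification code to each candidate and group them as the reference set."""
    coded = {i: assign_code(i) for i in candidates}
    if len(coded) < 2:
        raise ValueError("a reference set needs two or more classified candidates")
    return coded


# Toy collection: two well-separated groups of 2-D document vectors.
vectors = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 11.0), (11.0, 10.0), (0.4, 0.4), (10.4, 10.4)]

clusters = cluster_documents(vectors, k=2)
candidates = select_candidates(vectors, clusters, n=2)
# Hypothetical stand-in for a human reviewer or automated classifier.
reference_set = build_reference_set(
    candidates,
    assign_code=lambda i: "responsive" if vectors[i][0] < 5 else "non-responsive",
)
print(reference_set)
```

In this toy run, documents 6 and 7 are the ones nearest their cluster centers, so they lead the candidates drawn from their respective clusters. A real system would likely substitute its own clustering method and review workflow for the stand-ins used here.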
20 Claims
1. A computer-implemented method for generating a reference set via clustering, comprising:
obtaining a collection of unclassified documents;
grouping the unclassified documents into clusters;
selecting n-documents from each cluster and combining the selected n-documents as reference set candidates, wherein one of the n-documents from each cluster is located closest to a center of that cluster;
assigning a classification code to each of the reference set candidates; and
grouping two or more of the reference set candidates as a reference set of classified documents.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A computer-implemented system for generating a reference set via clustering, comprising:
a collection module to obtain a collection of unclassified documents;
a clustering module to group the unclassified documents into clusters;
a candidate selection module to select n-documents from each cluster and to combine the selected n-documents as reference set candidates, wherein one of the n-documents from each cluster is located closest to a center of that cluster;
a classification module to assign a classification code to each of the reference set candidates; and
a reference set module to group two or more of the reference set candidates as a reference set of classified documents.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
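The five modules recited in the system claim can be read as stages of a pipeline. The sketch below shows one possible way to wire them together in Python; it is an assumption made for illustration, not the architecture described in the specification. The class name `ReferenceSetSystem`, its field names, and the stub modules in the usage example are all hypothetical, and the stub candidate selector simply pretends that the first document of each cluster is the one closest to the center.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Document = str
Cluster = List[Document]


@dataclass
class ReferenceSetSystem:
    """Hypothetical wiring of the five claimed modules as pluggable callables."""
    collect: Callable[[], List[Document]]                      # collection module
    cluster: Callable[[List[Document]], List[Cluster]]         # clustering module
    select_candidates: Callable[[List[Cluster]], List[Document]]  # candidate selection module
    classify: Callable[[Document], str]                        # classification module
    build_reference_set: Callable[[Dict[Document, str]], Dict[Document, str]]  # reference set module

    def run(self) -> Dict[Document, str]:
        documents = self.collect()
        clusters = self.cluster(documents)
        candidates = self.select_candidates(clusters)
        coded = {doc: self.classify(doc) for doc in candidates}
        return self.build_reference_set(coded)


# Stub modules for illustration only; real modules would do actual clustering
# and center-distance selection.
system = ReferenceSetSystem(
    collect=lambda: ["memo-a", "memo-b", "invoice-a", "invoice-b"],
    cluster=lambda docs: [
        [d for d in docs if d.startswith("memo")],
        [d for d in docs if d.startswith("invoice")],
    ],
    select_candidates=lambda clusters: [c[0] for c in clusters],
    classify=lambda doc: "responsive" if doc.startswith("memo") else "non-responsive",
    build_reference_set=lambda coded: dict(coded),
)

result = system.run()
print(result)
```

Keeping each module behind a plain callable means any one stage, for example the clustering method, can be swapped without touching the others.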
Specification