System and method for clustering unstructured documents

  • US 7,809,727 B2
  • Filed: 12/24/2007
  • Issued: 10/05/2010
  • Est. Priority Date: 08/31/2001
  • Status: Active Grant
First Claim

1. A system for clustering unstructured documents, comprising:

    a selection module that selects documents having terms with frequencies of occurrence of the terms that satisfy upper edge conditions less than 100% and lower edge conditions greater than 0% from a set of documents;

    a concept module that generates concepts based on one or more of the terms for the selected documents; and

    a cluster module that groups the selected documents into clusters, comprising:

        an evaluation module that evaluates a weight for each of the clusters;

        a determination module that determines, for each of the selected documents, inner products of that selected document and each cluster from the frequencies of occurrence for at least one of the terms from the concepts and the cluster weights; and

        an assignment module that assigns each selected document into one such cluster based on the inner products of the selected document; and

    a processor to execute each of the modules, which are stored on a computer-readable storage medium.
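
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the 30% and 90% edge thresholds, the one-term-per-concept mapping, and the use of a normalized seed-document mean as the cluster weight are all assumptions made for illustration; the claim itself fixes none of them.

```python
from collections import Counter
from math import sqrt

def select_terms(docs, lower=0.3, upper=0.9):
    """Keep terms whose document frequency lies strictly between the edges.

    The edges are hypothetical values inside the claimed (0%, 100%) range.
    """
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return sorted(t for t, c in doc_freq.items() if lower < c / n < upper)

def select_documents(docs, terms):
    """Keep documents containing at least one retained term."""
    keep = set(terms)
    return [doc for doc in docs if keep.intersection(doc)]

def to_vector(doc, concepts):
    """Term-frequency vector; assumes each retained term is its own concept."""
    counts = Counter(doc)
    return [counts[c] for c in concepts]

def cluster_weight(seed_vectors):
    """Assumed weighting: unit-normalized mean of a cluster's seed vectors."""
    dims = len(seed_vectors[0])
    mean = [sum(v[i] for v in seed_vectors) / len(seed_vectors) for i in range(dims)]
    norm = sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean] if norm else mean

def inner_product(u, v):
    return sum(a * b for a, b in zip(u, v))

def assign(docs, concepts, weights):
    """Assign each document to the cluster with the largest inner product."""
    labels = []
    for doc in docs:
        vec = to_vector(doc, concepts)
        scores = [inner_product(vec, w) for w in weights]
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels

# Toy corpus: "the" occurs in every document, so the upper edge removes it.
docs = [
    ["the", "cat", "purrs"],
    ["the", "cat", "sleeps", "cat"],
    ["the", "dog", "barks"],
    ["the", "dog", "runs", "dog"],
]
concepts = select_terms(docs)
selected = select_documents(docs, concepts)
# Seed each cluster with one document (an arbitrary choice for this sketch).
weights = [
    cluster_weight([to_vector(selected[0], concepts)]),
    cluster_weight([to_vector(selected[2], concepts)]),
]
labels = assign(selected, concepts, weights)
```

On this toy corpus the edge conditions retain only "cat" and "dog", and the two "cat" documents land in one cluster while the two "dog" documents land in the other. A real system would derive concepts and cluster weights from the data rather than from hand-picked seeds.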
