Unsupervised document clustering using latent semantic density analysis

  • US 8,713,021 B2
  • Filed: 07/07/2010
  • Issued: 04/29/2014
  • Est. Priority Date: 07/07/2010
  • Status: Active Grant
First Claim

1. A computer-implemented method for clustering documents, comprising:

  • at a device comprising one or more processors and memory:

    generating a latent semantic mapping (LSM) space from a collection of a plurality of documents, the LSM space includes a plurality of document vectors, each representing one of the documents in the collection;

    identifying a plurality of centroid document vectors from the plurality of document vectors;

    forming a plurality of document groups each including a respective group of document vectors in the LSM space that are within a predetermined hypersphere diameter from a respective one of the plurality of centroid document vectors, wherein the predetermined hypersphere diameter represents a predetermined closeness measure among the document vectors in the LSM space; and

    selectively designating a particular document group from the plurality of document groups as a document cluster based on the particular document group containing a maximum number of document vectors among the plurality of document groups.
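The grouping and selection steps recited above can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the document vectors in the LSM space have already been computed (for example by a truncated singular value decomposition of a word-document matrix), that every document vector is treated as a candidate centroid, and that closeness is measured by Euclidean distance. The function names `form_groups` and `densest_group` and the sample vectors are hypothetical, chosen only for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length (assumed closeness measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def form_groups(vectors, centroid_ids, diameter):
    """For each candidate centroid, collect the indices of all vectors
    lying within `diameter` of it (the centroid itself is included)."""
    return {
        c: [i for i, v in enumerate(vectors) if euclidean(vectors[c], v) <= diameter]
        for c in centroid_ids
    }


def densest_group(groups):
    """Return (centroid index, member indices) for the group with the most members.
    Ties are broken in favor of the first centroid encountered."""
    centroid = max(groups, key=lambda c: len(groups[c]))
    return centroid, groups[centroid]


# Hypothetical 2-D document vectors standing in for an LSM space.
doc_vectors = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),  # dense region
    (1.0, 1.0), (1.1, 1.0),                          # sparser region
    (3.0, 3.0),                                      # isolated document
]

# Assumption: every document vector is a candidate centroid.
groups = form_groups(doc_vectors, range(len(doc_vectors)), diameter=0.2)
centroid, cluster = densest_group(groups)
```

With these sample vectors, the four documents in the dense region form the largest group, so that group is the one designated as a cluster. A different distance measure or centroid-selection rule would change which groups are formed without changing the overall structure of the sketch.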
