System and method for good nearest neighbor clustering of text

  • US 7,747,083 B2
  • Filed: 03/27/2006
  • Issued: 06/29/2010
  • Est. Priority Date: 03/27/2006
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method for clustering text, comprising:

    representing at least one text in a set of texts as a dimensional vector of words;

    representing an other text in the set of texts as a dimensional vector of words;

    determining a dot-product of the dimensional vector of the other text and the dimensional vector of the at least one text;

    comparing the dot-product to a threshold, wherein the threshold comprises an upper bound of a value in a range from zero to one that represents a cosine similarity between the other text and the at least one text;

    if the dot-product exceeds the threshold, determining the at least one text to be the good nearest neighbor of the other text;

    clustering the other text in a cluster; and

    clustering the at least one text determined to be the good nearest neighbor of the other text in the cluster.
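
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made only for illustration: texts are split on whitespace into word counts, each vector is scaled to unit length so that the dot-product falls in the range zero to one and equals the cosine similarity, the threshold value of 0.5 is arbitrary, and clusters are merged by simple relabeling. The function names `to_unit_vector`, `dot_product`, and `cluster_texts` are hypothetical.

```python
import math
from collections import Counter


def to_unit_vector(text):
    """Represent a text as a sparse word-count vector scaled to unit length.

    Assumption: whitespace tokenization and L2 normalization, so that the
    dot-product of two vectors equals their cosine similarity.
    """
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {word: c / norm for word, c in counts.items()}


def dot_product(u, v):
    """Dot-product of two sparse vectors stored as dicts."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the smaller vector
    return sum(weight * v.get(word, 0.0) for word, weight in u.items())


def cluster_texts(texts, threshold=0.5):
    """Group texts whose pairwise dot-product exceeds the threshold.

    The threshold (hypothetical default 0.5) plays the role of the
    cosine-similarity bound in the claim.
    """
    vectors = [to_unit_vector(t) for t in texts]
    cluster_of = list(range(len(texts)))  # each text starts in its own cluster
    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            if dot_product(vectors[i], vectors[j]) > threshold:
                # Text i is treated as a good nearest neighbor of text j:
                # merge their clusters by relabeling.
                old, new = cluster_of[j], cluster_of[i]
                cluster_of = [new if c == old else c for c in cluster_of]
    clusters = {}
    for idx, label in enumerate(cluster_of):
        clusters.setdefault(label, []).append(texts[idx])
    return list(clusters.values())
```

Under these assumptions, two texts that share most of their words end up in the same cluster, while a text with no words in common with the others stays in a cluster of its own. The pairwise loop is quadratic in the number of texts and is meant only to make the comparison step concrete.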

  • 9 Assignments