Suffix tree similarity measure for document clustering

  • US 10,565,233 B2
  • Filed: 03/17/2014
  • Issued: 02/18/2020
  • Est. Priority Date: 05/07/2008
  • Status: Active Grant
First Claim

1. A system, comprising:

    a memory having stored therein executable instructions; and

    a processor, coupled to the memory, configured to execute or facilitate execution of the executable instructions to at least:

    create a suffix tree document model that is a first representation of documents in at least one knowledge source on a computerized network;

    convert the suffix tree document model to a vector document model that is a second representation of the documents, wherein the vector document model comprises respective weighted vectors for the documents, where each weighted vector of the respective weighted vectors consists of M elements and M is a total number of nodes in the suffix tree document model not including a root node of the suffix tree document model;

    determine at least one similarity between two or more of the documents based upon the respective weighted vectors;

    generate clusters of the documents based on the at least one similarity; and

    in response to a search query of the at least one knowledge source, providing a search result based on the clusters of the documents.
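
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not fix a weighting scheme, a similarity measure, or a clustering algorithm, so the tf-idf weights, cosine similarity, and threshold-based single-link grouping used here are assumptions chosen for illustration. For brevity the sketch builds an uncompressed word-level suffix trie rather than a compact suffix tree, so its node count M will generally be larger than that of a compact tree. All function names are hypothetical, and the search-query step is omitted.

```python
import math
from collections import defaultdict


def build_suffix_trie(docs):
    """Map each non-root trie node (a tuple of words) to {doc_id: frequency}."""
    nodes = defaultdict(lambda: defaultdict(int))
    for doc_id, text in enumerate(docs):
        words = text.lower().split()
        for start in range(len(words)):
            # Every prefix of every suffix corresponds to one trie node.
            for end in range(start + 1, len(words) + 1):
                nodes[tuple(words[start:end])][doc_id] += 1
    return nodes


def to_vectors(nodes, num_docs):
    """Return one weighted vector of M elements per document (tf-idf assumed)."""
    ordered = sorted(nodes)  # fixed node order; M = len(ordered), root excluded
    vectors = [[0.0] * len(ordered) for _ in range(num_docs)]
    for col, node in enumerate(ordered):
        df = len(nodes[node])  # number of documents passing through this node
        idf = math.log(1 + num_docs / df)
        for doc_id, tf in nodes[node].items():
            vectors[doc_id][col] = (1 + math.log(tf)) * idf
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold=0.15):
    """Single-link grouping: similarity >= threshold merges two clusters."""
    labels = list(range(len(vectors)))
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                old, new = labels[j], labels[i]
                labels = [new if label == old else label for label in labels]
    return labels


# Illustrative documents (made up for this sketch).
docs = ["cat ate cheese", "mouse ate cheese too", "stock market fell"]
nodes = build_suffix_trie(docs)
vectors = to_vectors(nodes, len(docs))
labels = cluster(vectors)
print(len(vectors[0]), labels)
```

Here the first two documents share the nodes "ate", "ate cheese", and "cheese", so their vectors overlap and they fall into the same cluster, while the third document shares no nodes with either and stays on its own. The threshold value is arbitrary and would need tuning on real data.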
