Document clustering method and system

  • US 6,728,932 B1
  • Filed: 03/22/2000
  • Issued: 04/27/2004
  • Est. Priority Date: 03/22/2000
  • Status: Expired due to Term
First Claim

1. A method for clustering documents comprising:

  • generating a hybrid matrix of vectors comprising a first vector representing a first document and a second vector representing a log-based document cluster; and

    clustering the documents using the hybrid matrix, wherein the step of generating the hybrid matrix comprises:

    accessing retrieval session logs;

    clustering retrieval sessions into session clusters;

    generating, for each session cluster, a log-based document cluster by combining all documents opened during any retrieval session of the session cluster;

    generating a log-based document cluster vector for each of the log-based document clusters;

    replacing each document in the log-based document cluster with the log-based document cluster vector;

    generating an individual document vector for each document not opened during any retrieval session; and

    combining the log-based document cluster vector and the individual document cluster vector, wherein the log-based document cluster is formed by concatenating the documents to be combined.
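The hybrid-matrix steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not specify how retrieval sessions are clustered or how vectors are weighted, so the sketch assumes that sessions sharing at least one opened document belong to the same session cluster and that vectors are plain term counts. The function names (`cluster_sessions`, `term_vector`, `build_hybrid_matrix`) and the data layout (a dict of document texts and a list of per-session document ids) are hypothetical.

```python
from collections import Counter


def cluster_sessions(sessions):
    """Group retrieval sessions into session clusters.

    Assumption: two sessions fall in the same cluster when they share an
    opened document (directly or through a chain of other sessions).
    Returns a list of (session_indices, opened_doc_ids) pairs.
    """
    clusters = []
    for idx, opened in enumerate(sessions):
        opened = set(opened)
        merged_sessions, merged_docs = [idx], set(opened)
        remaining = []
        for sess_ids, docs in clusters:
            if docs & opened:
                # Overlapping cluster: fold it into the new one.
                merged_sessions += sess_ids
                merged_docs |= docs
            else:
                remaining.append((sess_ids, docs))
        remaining.append((sorted(merged_sessions), merged_docs))
        clusters = remaining
    return clusters


def term_vector(text):
    """Assumed vector model: raw term counts over whitespace tokens."""
    return Counter(text.lower().split())


def build_hybrid_matrix(documents, sessions):
    """Return a mapping of doc id to vector (the rows of the hybrid matrix)."""
    rows = {}
    for _, doc_ids in cluster_sessions(sessions):
        ordered = sorted(d for d in doc_ids if d in documents)
        # Log-based document cluster: concatenate all opened documents.
        combined = " ".join(documents[d] for d in ordered)
        cluster_vec = term_vector(combined)
        # Replace each opened document with the shared cluster vector.
        for d in ordered:
            rows[d] = cluster_vec
    # Documents never opened in any session keep an individual vector.
    for d, text in documents.items():
        if d not in rows:
            rows[d] = term_vector(text)
    return rows


if __name__ == "__main__":
    docs = {"d1": "apple pie", "d2": "apple tart", "d3": "car engine", "d4": "boat"}
    logs = [["d1", "d2"], ["d2"], ["d3"]]
    for doc_id, vec in sorted(build_hybrid_matrix(docs, logs).items()):
        print(doc_id, dict(vec))
```

In this reading, every document opened within one session cluster ends up with the same row, so a subsequent clustering step over the rows would tend to keep those documents together, while unopened documents are placed by their own content alone.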
