Method and apparatus for clustering a collection of linked documents using co-citation analysis

  • US 6,038,574 A
  • Filed: 03/18/1998
  • Issued: 03/14/2000
  • Est. Priority Date: 03/18/1998
  • Status: Expired due to Term
First Claim

1. A method for clustering documents contained in a collection of linked documents, said method comprising the steps of:

    a) specifying a link frequency threshold, said link frequency threshold indicating a number of times a document is linked to from another document in said collection;

    b) for each document in said collection, determining an associated link frequency;

    c) discarding each document in said collection whose link frequency is lower than said link frequency threshold;

    d) creating a co-citation list, said co-citation list comprised of pairs of documents that are linked to by the same document in said collection; and

    e) performing a suitable clustering operation on said co-citation list to generate document clusters.
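
Steps a) through e) can be sketched in Python as follows. This is only an illustrative reading of the claim, not the patented implementation. The function name `cluster_by_cocitation`, the `links` dictionary layout, and the choice of connected components (via union-find) as the "suitable clustering operation" are all assumptions made for this example; the claim does not name a specific clustering algorithm. Self-links are ignored here because the claim speaks of links "from another document".

```python
from collections import Counter
from itertools import combinations


def cluster_by_cocitation(links, threshold):
    """Cluster documents by co-citation.

    links: dict mapping each document id to the set of document ids it links to
           (hypothetical input layout, assumed for this sketch).
    threshold: minimum link frequency a document needs to be kept (step a).
    """
    # Drop self-links so only links from *another* document are counted.
    outgoing = {src: set(targets) - {src} for src, targets in links.items()}

    # Step b: link frequency = number of distinct documents linking to each document.
    frequency = Counter(t for targets in outgoing.values() for t in targets)

    # Step c: discard documents whose link frequency is below the threshold.
    kept = {doc for doc, count in frequency.items() if count >= threshold}

    # Step d: co-citation list = pairs of kept documents linked to by the same document.
    pairs = set()
    for targets in outgoing.values():
        pairs.update(combinations(sorted(targets & kept), 2))

    # Step e (assumed clustering operation): connected components over the
    # co-citation pairs, computed with a small union-find structure.
    parent = {}

    def find(doc):
        parent.setdefault(doc, doc)
        while parent[doc] != doc:
            parent[doc] = parent[parent[doc]]  # path halving
            doc = parent[doc]
        return doc

    for a, b in pairs:
        parent[find(a)] = find(b)

    clusters = {}
    for doc in list(parent):
        clusters.setdefault(find(doc), set()).add(doc)
    return sorted(sorted(members) for members in clusters.values())


# Example collection: p1..p5 are citing pages, a..e are the pages they link to.
example_links = {
    "p1": {"a", "b"},
    "p2": {"a", "b"},
    "p3": {"c", "d"},
    "p4": {"c", "d"},
    "p5": {"a", "e"},
}
print(cluster_by_cocitation(example_links, threshold=2))
```

In this sketch, a document that survives the threshold but never appears in a co-citation pair ends up in no cluster. Lowering the threshold to 1 in the example keeps `e`, which is then co-cited with `a` by `p5` and joins that cluster.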
