Method and system for calculating document importance using document classifications

  • US 7,774,340 B2
  • Filed: 06/30/2004
  • Issued: 08/10/2010
  • Est. Priority Date: 06/30/2004
  • Status: Active Grant
First Claim

1. A method in a computer system with a processor and a memory for calculating importance of documents, the documents having inter-document links, the method comprising:

  • providing an organization of the documents into collections, each collection including a plurality of documents;

    for each collection, identifying inter-collection links of the documents within the collection, an inter-collection link being a link between a document in one collection and a document in another collection;

    calculating by the processor importance of each collection by applying a page ranking algorithm to the collections wherein the page ranking algorithm operates on nodes and links, each collection represented by a node and the inter-collection links represented by links between the nodes, to calculate importance for each collection of documents based on the inter-collection links from a document in one collection to a document in another collection;

    dividing by the processor the collections into a high importance set of collections and a low importance set of collections, wherein the collections of the high importance set of collections have a calculated importance that is above the calculated importance of the collections in the low importance set of collections;

    after dividing the collection into the high importance set of collections and the low importance set of collections, calculating by the processor importance of the documents in the high importance set of collections by applying a page ranking algorithm to the documents in the high importance set of collections wherein each document is represented by a node and the inter-document links between documents within the high importance set of collections are represented by links between the nodes to calculate the importance of each document in the high importance set of collections based on the inter-document links between the documents of the high importance set of collections wherein the documents of the low importance set of collections are not factored into calculating importance of the documents of the high importance set of collections; and

    calculating importance of the documents in the low importance set of collections by applying a ranking algorithm to the documents in the low importance set of collections to calculate the importance of each document in the low importance set of collections by, for each collection of the low importance set of collections, calculating a local importance of each document in the collection of the low importance set of collections by applying a page ranking algorithm to the documents in the collection wherein each document in the collection is represented by a node and the intra-collection links between documents of the collection are represented by links between the nodes; and

    for each document in the collection of the low importance set of collections, calculating a combined importance for that document based on the calculated importance of the collection and the calculated local importance of the document; and

    presenting a combined ranking of the documents in the collections based on the calculated importance of the documents of the high importance set of collections and the calculated importance of the documents of the low importance set of collections.
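
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions the claim does not specify: the helper names `pagerank` and `rank_documents`, the damping factor and iteration count, the top-`num_high` rule for dividing the collections, and the product used to combine a collection's importance with a document's local importance. The sketch also assumes, without support from the claim, that scores from the high and low sets can be compared directly in one sorted list.

```python
from collections import defaultdict


def pagerank(nodes, links, damping=0.85, iterations=50):
    """Power-iteration PageRank; links with an endpoint outside `nodes` are ignored."""
    nodes = list(nodes)
    if not nodes:
        return {}
    n = len(nodes)
    node_set = set(nodes)
    out = defaultdict(list)
    for src, dst in links:
        if src in node_set and dst in node_set:
            out[src].append(dst)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            for dst in out[src]:
                new[dst] += damping * rank[src] / len(out[src])
        rank = new
    return rank


def rank_documents(collection_of, links, num_high):
    """collection_of maps document -> collection; links are (src_doc, dst_doc) pairs.

    num_high (assumed parameter) is how many top collections form the high set.
    Returns (documents sorted by score, score per document).
    """
    collections = sorted(set(collection_of.values()))

    # Inter-collection links: the two endpoints lie in different collections.
    inter = [(collection_of[s], collection_of[d]) for s, d in links
             if collection_of[s] != collection_of[d]]
    coll_rank = pagerank(collections, inter)

    # Divide the collections (assumed rule: the top num_high are "high").
    ordered = sorted(collections, key=coll_rank.get, reverse=True)
    high, low = set(ordered[:num_high]), ordered[num_high:]

    # High set: one PageRank over all its documents; low-set documents are
    # excluded because pagerank() drops links that leave the node set.
    high_docs = [d for d, c in collection_of.items() if c in high]
    scores = pagerank(high_docs, links)

    # Low set: local PageRank per collection over intra-collection links,
    # combined with the collection's importance (assumed: a simple product).
    for c in low:
        docs = [d for d, cc in collection_of.items() if cc == c]
        local = pagerank(docs, links)
        for d in docs:
            scores[d] = coll_rank[c] * local[d]

    return sorted(scores, key=scores.get, reverse=True), scores
```

Calling `rank_documents` with a document-to-collection mapping (for example, pages keyed to their web sites) and a list of hyperlink pairs returns one ordering that covers both sets. Because only the high set is ranked at full document granularity, the expensive computation runs on a smaller graph, while each low-importance collection needs just a small local computation.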

  • 2 Assignments