Methods and apparatus for user-centered web crawling

  • US 20040205049A1
  • Filed: 04/10/2003
  • Published: 10/14/2004
  • Est. Priority Date: 04/10/2003
  • Status: Abandoned Application
First Claim

1. A computer-based method of performing document retrieval in accordance with an information network, the method comprising the steps of:

    obtaining a query comprising at least a user-defined predicate;

    determining a group of one or more users for a set of one or more documents that satisfy the predicate, the user group comprising one or more users who have previously accessed at least one of the one or more documents in the set, wherein a determination of whether a user has previously accessed a document is obtained from a log that maintains data representing user document access behavior;

    determining a topical inclination value for each user in the user group, the topical inclination value for each user being indicative of a level of interest the user has in the one or more documents in the set;

    determining a topical affinity value for each document accessed by the user group based on the topical inclination value determined for each user, the topical affinity value for each document being indicative of the likelihood that each document satisfies the predicate based on the access behavior associated with the one or more users in the user group; and

    outputting the one or more documents ranked in accordance with their respective topical affinity values.
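The steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify how the topical inclination or topical affinity values are computed, so the formulas here are assumptions chosen only for illustration: a user's inclination is taken to be the fraction of that user's logged documents that satisfy the predicate, and a document's affinity is taken to be the mean inclination of the group members who accessed it. The function name `rank_by_topical_affinity` and the shape of the access log, a list of (user, document) pairs, are likewise hypothetical.

```python
from collections import defaultdict


def rank_by_topical_affinity(access_log, predicate):
    """Rank documents by an assumed topical affinity score.

    access_log: iterable of (user, document) pairs (hypothetical log format).
    predicate:  callable returning True if a document satisfies the query.
    """
    # Read the log: which documents has each user accessed?
    docs_by_user = defaultdict(set)
    for user, doc in access_log:
        docs_by_user[user].add(doc)

    # User group: users who accessed at least one satisfying document.
    # Assumed inclination: share of the user's documents that satisfy
    # the predicate.
    inclination = {}
    for user, docs in docs_by_user.items():
        hits = sum(1 for d in docs if predicate(d))
        if hits:
            inclination[user] = hits / len(docs)

    # Assumed affinity: mean inclination of the group members who
    # accessed each document.
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user, score in inclination.items():
        for doc in docs_by_user[user]:
            totals[doc] += score
            counts[doc] += 1
    affinity = {doc: totals[doc] / counts[doc] for doc in totals}

    # Output documents ranked by affinity, ties broken by document name.
    return sorted(affinity.items(), key=lambda kv: (-kv[1], kv[0]))


log = [
    ("alice", "jazz-1"), ("alice", "jazz-2"),
    ("bob", "jazz-1"), ("bob", "news-1"),
    ("carol", "news-1"),
]
ranked = rank_by_topical_affinity(log, lambda d: d.startswith("jazz"))
```

In this toy log, carol never accessed a satisfying document and so falls outside the user group, while "news-1" still receives a score because bob, a group member, accessed it. This reflects the claim language that affinity is determined for each document accessed by the user group, not only for documents already known to satisfy the predicate.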
