
Techniques for web site integration

  • US 8,015,173 B2
  • Filed: 05/26/2005
  • Issued: 09/06/2011
  • Est. Priority Date: 05/08/2000
  • Status: Active Grant
First Claim

1. A processor-implemented method, comprising:

  • receiving an input from a user identifying an initial document in a collection of documents;

    identifying other documents in the collection of documents that are related to the initial document, comprising:

    assigning scores to a plurality of compressed document surrogates corresponding to the collection of documents, the scores depending on an occurrence in the compressed document surrogates of at least one term in the initial document, wherein a score $S_D$ of a compressed document surrogate $D$ in the plurality of compressed document surrogates is determined by crediting the compressed document surrogate $D$, for each term $T$ in the initial document which occurs in the compressed document surrogate $D$, with an amount proportional to Robertson's term frequency $TF_{TD}$ and to $IDF_T$ where

    $TF_{TD} = N_{TD} / (N_{TD} + K_1 + K_2 \cdot (L_D / L_0))$, and
      $N_{TD}$ is the number of times the term $T$ occurs in compressed document surrogate $D$,
      $L_D$ is the length of compressed document surrogate $D$,
      $L_0$ is the average length of a document in the collection of documents,
      $K_1$ and $K_2$ are constants, and

    $IDF_T = \log((N + K_3) / N_T) / \log(N + K_4)$, and
      $N$ is the number of documents in the collection of documents,
      $N_T$ is the number of documents containing the term $T$ in the collection of documents, and
      $K_3$ and $K_4$ are constants;

    selecting a set of documents from the collection of documents, the set of documents comprising documents corresponding to those of the plurality of compressed document surrogates assigned the highest scores; and

    presenting information identifying the set of documents to the user.
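
The scoring and selection steps recited in claim 1 can be illustrated with a short Python sketch. It is not the patented implementation, only one possible reading of the formulas under stated assumptions: each compressed document surrogate is modeled as a term-count mapping, the length of a surrogate is taken to be the sum of its term counts, the average document length and the document frequencies are computed from the surrogates themselves, and the "amount proportional to" the two factors is taken to be their plain product. The function names `score_surrogates` and `top_documents` and the default constant values are hypothetical; the claim does not fix values for K1 through K4.

```python
import math
from collections import Counter


def score_surrogates(initial_terms, surrogates, k1=0.5, k2=1.5, k3=0.5, k4=1.0):
    """Score each surrogate against the terms of the initial document.

    surrogates maps a document id to a Counter of positive term counts.
    The default constants are illustrative assumptions, not claim values.
    """
    if not surrogates:
        return {}

    n_docs = len(surrogates)  # N
    # Assumption: L_D is the total number of term occurrences in surrogate D.
    lengths = {doc_id: sum(counts.values()) for doc_id, counts in surrogates.items()}
    # Assumption: L_0 is approximated by the mean surrogate length.
    avg_len = sum(lengths.values()) / n_docs

    # N_T: number of documents whose surrogate contains term T.
    doc_freq = Counter()
    for counts in surrogates.values():
        doc_freq.update(counts.keys())

    scores = {}
    for doc_id, counts in surrogates.items():
        total = 0.0
        for term in set(initial_terms):
            n_td = counts.get(term, 0)
            if n_td == 0:
                continue  # only terms that occur in the surrogate earn credit
            tf = n_td / (n_td + k1 + k2 * (lengths[doc_id] / avg_len))
            idf = math.log((n_docs + k3) / doc_freq[term]) / math.log(n_docs + k4)
            total += tf * idf  # assumption: proportionality constant of 1
        scores[doc_id] = total
    return scores


def top_documents(scores, k):
    """Return the ids of the k highest-scoring documents, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    example = {
        "a": Counter({"cat": 2, "dog": 1}),
        "b": Counter({"cat": 1}),
        "c": Counter({"fish": 3}),
    }
    ranked = top_documents(score_surrogates(["cat", "dog"], example), 2)
    print(ranked)
```

In this sketch a surrogate that shares no term with the initial document receives a score of zero, and a caller wanting only the "other" documents would drop the initial document's own id from the ranking before presenting it.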

  • 5 Assignments