Generalized term frequency scores in information retrieval systems

  • US 6,507,839 B1
  • Filed: 06/19/2000
  • Issued: 01/14/2003
  • Est. Priority Date: 03/31/1999
  • Status: Expired due to Term
First Claim

1. A method for selecting documents which may be of interest from among documents in a collection, comprising:

  • (a) choosing terms to be used in selecting documents which may be of interest,
  • (b) dividing a plurality of documents D in the collection into $S_0$ segments,
  • (c) determining, for the plurality of documents D in the collection, which of the terms chosen to be used in selecting documents are found in each segment $S_i$ of the document D,
  • (d) calculating, for the plurality of documents D in the collection, a generalized term frequency score $S_D$:

    $$S_D = \sum_{T=1}^{T_0} \sum_{S_i=1}^{S_0} TF_{STD}$$

    where:

    $S_D$ is the total score for the document D, $T_0$ is the number of terms selected to be used in the search, $S_0$ is the number of segments in the document D, and $TF_{STD}$ is the score for document D based on the occurrence of term T in segment $S_i$ of document D, and

  • (e) selecting documents from among the documents in the collection based upon the scores $S_D$ achieved by the documents.
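
The steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how documents are segmented or how the per-segment score is computed, so the equal-size token split (`split_into_segments`) and the binary presence score (`segment_term_score`) are assumptions chosen for illustration, and all function names are hypothetical.

```python
def split_into_segments(text: str, num_segments: int) -> list[list[str]]:
    """Step (b): divide a document into num_segments token segments.

    Assumption: whitespace tokens, split into roughly equal-size runs.
    """
    tokens = text.lower().split()
    size = -(-len(tokens) // num_segments)  # ceiling division; 0 for empty text
    return [tokens[i * size:(i + 1) * size] for i in range(num_segments)]


def segment_term_score(term: str, segment: list[str], segment_index: int) -> float:
    """Per-segment score for one term (the summand in the formula).

    Assumption: 1.0 if the term occurs in the segment, else 0.0.
    segment_index is unused here but lets a caller weight segments by position.
    """
    return 1.0 if term in segment else 0.0


def generalized_tf_score(terms, segments, tf=segment_term_score) -> float:
    """Step (d): sum the per-segment score over all terms and all segments."""
    return sum(
        tf(term, segment, i)
        for term in terms
        for i, segment in enumerate(segments)
    )


def select_documents(terms, collection: dict[str, str], num_segments: int, top_k: int):
    """Steps (b) through (e): score every document, return the top_k ids and all scores."""
    scores = {
        doc_id: generalized_tf_score(terms, split_into_segments(text, num_segments))
        for doc_id, text in collection.items()
    }
    # Highest score first; ties broken by document id for a stable order.
    ranked = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ranked[:top_k], scores


if __name__ == "__main__":
    collection = {
        "a": "cat sat on the mat the cat slept",
        "b": "dog ran far away today ok fine yes",
    }
    selected, scores = select_documents(["cat", "mat"], collection, num_segments=2, top_k=1)
    print(selected, scores)
```

Passing a different `tf` callable to `generalized_tf_score` changes how each segment contributes (for example, weighting early segments more heavily) without touching the double sum.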

  • 5 Assignments