Method for characterizing a document set using evaluation surrogates

  • US 5,774,888 A
  • Filed: 12/30/1996
  • Issued: 06/30/1998
  • Est. Priority Date: 12/30/1996
  • Status: Expired due to Fees
First Claim

1. A method for determining measures of relevance of a document to selected topics, wherein the document is represented as a stream of tokens and the selected topics are represented by topic profiles, each of which includes one or more compound term templates that specify the precise forms of terms characteristic of the topic, the method comprising the steps of:

  • applying the topic profiles to the token stream to identify compound terms in the document;

  • augmenting the token stream with a compound term token for each compound term identified;

  • eliminating from the augmented token stream tokens representing common terms, redundant tokens that correspond to repeated instances of a term, and selected tokens representing components of compound terms to provide a compact representation of the document;

  • calculating a similarity function between the compact document representation of the document and the topic profiles to form an evaluation surrogate of the document for the topic profiles.
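
The four steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made only for illustration: the stop list `COMMON_TERMS`, the profile layout (a dict with hypothetical `"terms"` and `"compounds"` keys), the choice to drop every component token of a matched compound (the claim says only "selected" tokens), and the similarity function itself, which the claim leaves unspecified and which is taken here to be the fraction of a profile's terms found in the compact representation.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Assumed stop list; the claim only speaks of "common terms".
COMMON_TERMS: Set[str] = {"the", "a", "an", "of", "and", "in", "to", "is"}

Template = Tuple[str, ...]          # exact token sequence of a compound term
Profile = Dict[str, list]           # hypothetical layout: {"terms": [...], "compounds": [...]}


def find_compounds(tokens: Sequence[str], templates: Set[Template]) -> List[Tuple[int, Template]]:
    """Step 1: return (start index, template) for each exact template match."""
    hits = []
    for i in range(len(tokens)):
        for tpl in templates:
            if tuple(tokens[i:i + len(tpl)]) == tpl:
                hits.append((i, tpl))
    return hits


def compact_representation(tokens: Sequence[str], profiles: Dict[str, Profile]) -> List[str]:
    templates = {tuple(t) for p in profiles.values() for t in p["compounds"]}
    hits = find_compounds(tokens, templates)

    # Step 2: augment the stream with one compound token per match.
    augmented = list(tokens) + ["_".join(tpl) for _, tpl in hits]

    # Positions of the original tokens that are components of a matched compound.
    component_positions = {start + k for start, tpl in hits for k in range(len(tpl))}

    # Step 3: drop common terms, compound components (assumption: all of them),
    # and repeated instances of a term.
    compact: List[str] = []
    seen: Set[str] = set()
    for pos, tok in enumerate(augmented):
        if tok in COMMON_TERMS or pos in component_positions or tok in seen:
            continue
        seen.add(tok)
        compact.append(tok)
    return compact


def evaluation_surrogate(tokens: Sequence[str], profiles: Dict[str, Profile]) -> Dict[str, float]:
    """Step 4: one similarity score per topic profile (assumed overlap ratio)."""
    compact = set(compact_representation(tokens, profiles))
    surrogate = {}
    for name, profile in profiles.items():
        terms = set(profile["terms"]) | {"_".join(t) for t in profile["compounds"]}
        surrogate[name] = len(compact & terms) / len(terms) if terms else 0.0
    return surrogate


# Illustrative input, invented for this sketch.
tokens = "the neural network learns a neural network model of the stock market".split()
profiles = {
    "machine_learning": {"terms": ["model", "learns"], "compounds": [("neural", "network")]},
    "finance": {"terms": ["bond", "dividend"], "compounds": [("stock", "market")]},
}
print(compact_representation(tokens, profiles))
print(evaluation_surrogate(tokens, profiles))
```

In this sketch the evaluation surrogate is simply a mapping from topic name to score, so a document set could be characterized by collecting one such mapping per document. A weighted or vector-based similarity could be substituted without changing the first three steps.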
