Information retrieval system and method that generates weighted comparison results to analyze the degree of dissimilarity between a reference corpus and a candidate document

  • US 6,167,398 A
  • Filed: 05/13/1998
  • Issued: 12/26/2000
  • Est. Priority Date: 01/30/1997
  • Status: Expired due to Term
First Claim

1. A method of information retrieval comprising:

    (a) receiving from a user, data identifying a stored reference corpus;

    (b) retrieving the identified reference corpus from storage;

    (c) generating initial values of respective weights corresponding to a plurality of analysis algorithms by processing the retrieved reference corpus in accordance with a predetermined algorithm;

    (d) retrieving from storage another text document as a candidate document;

    (e) performing respective comparisons between the candidate document and the reference corpus in accordance with each of said analysis algorithms and producing respective comparison results;

    (f) generating corresponding weighted comparison results by multiplying each said comparison result by its respective weight;

    (g) summing the weighted comparison results to produce a dissimilarity measure that is indicative of the degree of dissimilarity between the retrieved reference corpus and the retrieved candidate document; and

    (h) storing the candidate document in a retained text store if said sum is indicative of a degree of dissimilarity less than a predetermined degree of dissimilarity.
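
Steps (c) through (h) can be illustrated with a minimal Python sketch. The claim does not name the analysis algorithms or the "predetermined algorithm" used to derive the initial weights, so everything specific here is an assumption made for illustration: the two measures (`vocabulary_dissimilarity`, `word_length_dissimilarity`), the leave-one-out weighting rule in `initial_weights`, the 0.1 smoothing constant, and the use of in-memory lists in place of the storage and user input of steps (a), (b), and (d).

```python
import re
from typing import Callable, Dict, List

# Hypothetical signature: an analysis algorithm maps (candidate, corpus)
# to a dissimilarity score in [0, 1]. The claim does not define this.
Algorithm = Callable[[str, List[str]], float]


def _words(text: str) -> List[str]:
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def vocabulary_dissimilarity(candidate: str, corpus: List[str]) -> float:
    """Assumed measure: Jaccard distance between the two vocabularies."""
    cand = set(_words(candidate))
    ref = {w for doc in corpus for w in _words(doc)}
    union = cand | ref
    return 1.0 - len(cand & ref) / len(union) if union else 0.0


def word_length_dissimilarity(candidate: str, corpus: List[str]) -> float:
    """Assumed measure: relative difference in mean word length."""
    def mean_len(words: List[str]) -> float:
        return sum(map(len, words)) / len(words) if words else 0.0

    cand_mean = mean_len(_words(candidate))
    ref_mean = mean_len([w for doc in corpus for w in _words(doc)])
    longest = max(cand_mean, ref_mean)
    return abs(cand_mean - ref_mean) / longest if longest else 0.0


ALGORITHMS: Dict[str, Algorithm] = {
    "vocabulary": vocabulary_dissimilarity,
    "word_length": word_length_dissimilarity,
}


def initial_weights(corpus: List[str]) -> Dict[str, float]:
    """Step (c), under an assumed rule: score each corpus document against
    the rest of the corpus, then give more weight to algorithms that find
    the corpus internally consistent. Weights are normalized to sum to 1."""
    raw = {}
    for name, algo in ALGORITHMS.items():
        if len(corpus) > 1:
            scores = [algo(doc, corpus[:i] + corpus[i + 1:])
                      for i, doc in enumerate(corpus)]
        else:
            scores = [0.0]
        raw[name] = 1.0 / (0.1 + sum(scores) / len(scores))
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def dissimilarity(candidate: str, corpus: List[str],
                  weights: Dict[str, float]) -> float:
    """Steps (e) to (g): compare, multiply by weight, and sum."""
    return sum(weights[name] * algo(candidate, corpus)
               for name, algo in ALGORITHMS.items())


def retain_candidates(candidates: List[str], corpus: List[str],
                      threshold: float) -> List[str]:
    """Step (h): keep candidates whose dissimilarity is below the threshold."""
    weights = initial_weights(corpus)
    return [c for c in candidates
            if dissimilarity(c, corpus, weights) < threshold]


corpus = ["the cat sat on the mat", "the dog sat on the rug"]
candidates = [
    "the cat sat on the rug",
    "quantum chromodynamics describes gluon interactions",
]
retained = retain_candidates(candidates, corpus, threshold=0.5)
```

With this toy corpus and a threshold of 0.5, only the first candidate ends up in `retained`, since the second shares no vocabulary with the corpus and uses much longer words. Normalizing the weights keeps the summed measure in the same 0 to 1 range as the individual comparison results, which makes a single fixed threshold easier to interpret.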
