Document extraction and comparison method with applications to automatic personalized database searching

  • US 5,926,812 A
  • Filed: 03/28/1997
  • Issued: 07/20/1999
  • Est. Priority Date: 06/20/1996
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method for determining the relevance of the content of a first set of documents to the content of a second set of documents, the method comprising:

  • extracting from the first set of documents a corresponding first set of document extract entries and from the second set of documents a corresponding second set of document extract entries, wherein each entry in the first and second sets of document extract entries comprises a weighted word histogram for a corresponding document;

  • generating from the first set of document extract entries a first set of word clusters, and generating from the second set of document extract entries a second set of word clusters, wherein each word cluster in the first and second sets of word clusters comprises a cluster word list, a total distance matrix, and a number of connections matrix; and

  • determining a degree of similarity between clusters from the first set of word clusters and clusters from the second set of word clusters.
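
The three steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not state how words are weighted, how clusters are formed, or how similarity is scored, so every such choice below is an assumption made for illustration. Specifically, weights are relative word frequencies, a cluster is built from the heaviest words of a single text, the total distance matrix sums positional gaps between cluster words that occur within a small window, the number of connections matrix counts those co-occurrences, and similarity is the Jaccard overlap of the cluster word lists. The names `weighted_histogram`, `WordCluster`, `build_cluster`, and `cluster_similarity` are hypothetical.

```python
import re
from collections import Counter
from dataclasses import dataclass


def weighted_histogram(text: str) -> dict[str, float]:
    """Map each word to its relative frequency (an assumed weighting scheme)."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


@dataclass
class WordCluster:
    words: list[str]                 # cluster word list
    total_distance: list[list[int]]  # summed positional gap per word pair
    connections: list[list[int]]     # number of co-occurrences per word pair


def build_cluster(text: str, top_n: int = 5, window: int = 3) -> WordCluster:
    """Build one cluster from the top_n heaviest words (assumed construction)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    hist = weighted_histogram(text)
    # Heaviest words first; ties are broken alphabetically for determinism.
    words = sorted(sorted(hist), key=lambda w: -hist[w])[:top_n]
    index = {w: i for i, w in enumerate(words)}
    n = len(words)
    dist = [[0] * n for _ in range(n)]
    conn = [[0] * n for _ in range(n)]
    for i, a in enumerate(tokens):
        if a not in index:
            continue
        # Look ahead at most `window` tokens for other cluster words.
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            b = tokens[j]
            if b in index and b != a:
                ia, ib = index[a], index[b]
                dist[ia][ib] += j - i
                dist[ib][ia] += j - i
                conn[ia][ib] += 1
                conn[ib][ia] += 1
    return WordCluster(words, dist, conn)


def cluster_similarity(a: WordCluster, b: WordCluster) -> float:
    """Jaccard overlap of the two cluster word lists (assumed similarity measure)."""
    sa, sb = set(a.words), set(b.words)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0
```

For example, `cluster_similarity(build_cluster(text_a), build_cluster(text_b))` returns a score between 0 and 1 for two input strings. A fuller version would presumably also use the two matrices when scoring, since the claim lists them as part of each cluster.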
