
Category processing of query topics and electronic document content topics

  • US 6,182,066 B1
  • Filed: 11/26/1997
  • Issued: 01/30/2001
  • Est. Priority Date: 11/26/1997
  • Status: Expired due to Fees
First Claim

1. A method for categorizing electronic document content of a plurality of documents for matching to user requests comprising the steps of:

  • parsing said document content into a plurality of items, each of said items comprising a contiguous phrase of more than two words located within said document;

  • assigning each of said plurality of items at least one of a plurality of token IDs;

  • vectorizing said plurality of token IDs into a plurality of document vectors;

  • calculating the cosine measure of each of said document vectors against each other of said document vectors to provide a plurality of similarity measures, one similarity measure for each document against each other of said plurality of documents.
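
The four steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes whitespace tokenization, fixed-length phrases of three words (the smallest length satisfying "more than two words"), sequential integer token IDs, and raw phrase counts as vector components. The function names `parse_items`, `assign_token_ids`, `vectorize`, `cosine`, and `pairwise_similarities` are hypothetical and do not appear in the patent.

```python
import math
from collections import Counter
from itertools import combinations


def parse_items(text, n=3):
    """Split text into contiguous n-word phrases (assumes n > 2)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def assign_token_ids(docs_items):
    """Map each distinct phrase to an integer token ID (assumed scheme)."""
    token_ids = {}
    for items in docs_items:
        for item in items:
            token_ids.setdefault(item, len(token_ids))
    return token_ids


def vectorize(items, token_ids):
    """Build a sparse document vector: token ID -> occurrence count."""
    return Counter(token_ids[item] for item in items)


def cosine(u, v):
    """Cosine measure of two sparse vectors; 0.0 if either is empty."""
    dot = sum(count * v[tid] for tid, count in u.items() if tid in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pairwise_similarities(texts, n=3):
    """Return one similarity measure per pair of documents."""
    docs_items = [parse_items(t, n) for t in texts]
    token_ids = assign_token_ids(docs_items)
    vectors = [vectorize(items, token_ids) for items in docs_items]
    return {
        (i, j): cosine(vectors[i], vectors[j])
        for i, j in combinations(range(len(vectors)), 2)
    }


if __name__ == "__main__":
    texts = [
        "the quick brown fox jumps",
        "the quick brown fox sleeps",
        "completely different words appear here",
    ]
    for pair, sim in pairwise_similarities(texts).items():
        print(pair, round(sim, 3))
```

Because the cosine measure is symmetric, the sketch computes each unordered pair once. A weighting scheme other than raw counts, or phrases of varying length, would fit the same structure.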
