Document analysis and retrieval

  • US 8,015,206 B2
  • Filed: 07/11/2008
  • Issued: 09/06/2011
  • Est. Priority Date: 12/30/2002
  • Status: Expired due to Fees
First Claim

1. A computer program product, comprising a computer readable storage device having a computer readable program code embodied therein, said program code configured to be executed by a processor of a computer system to implement a method for document analysis and retrieval, said method comprising the steps of:

    receiving a document having text therein from a host of a first computing system;

    generating document keys associated with said text from analysis of said text, each said document key selected from the group consisting of a keyword of said text and a keyphrase of said text;

    providing a document taxonomy having categories, each category having category keys, each said category key selected from the group consisting of a keyword of said category and a keyphrase of said category;

    comparing the category keys of each category with said document keys to make a determination of a distance between the document and each category as a measure of how close the document is to each category, wherein said comparing comprises computing said distance for each category as a dot product of a vector of said document keys and a vector of said category keys for each category; and

    returning a subset of said categories to said host, wherein said subset of said categories reflects said determination.
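
The comparing and returning steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: the names `extract_keys`, `dot_product`, and `closest_categories`, the word-count key extraction, the sample taxonomy, and the choice to treat a larger dot product as "closer" are all assumptions made for illustration.

```python
import re
from collections import Counter


def extract_keys(text):
    """Hypothetical key generator: counts lowercase words as keywords.

    The claim also allows keyphrases; this sketch handles keywords only.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def dot_product(doc_keys, category_keys):
    """Dot product of two sparse key vectors stored as {key: weight} mappings."""
    return sum(weight * category_keys.get(key, 0)
               for key, weight in doc_keys.items())


def closest_categories(text, taxonomy, top_n=2):
    """Score every category against the document and return the top_n names.

    Assumption: a higher dot product means the document is closer to
    the category, so categories are ranked in descending score order.
    """
    doc_keys = extract_keys(text)
    scores = {name: dot_product(doc_keys, keys)
              for name, keys in taxonomy.items()}
    return sorted(scores, key=lambda name: scores[name], reverse=True)[:top_n]


# Illustrative taxonomy: each category maps its keys to a weight.
taxonomy = {
    "finance": {"bank": 1, "loan": 1},
    "sports": {"ball": 1, "team": 1},
    "cooking": {"oven": 1},
}

subset = closest_categories("The bank approved the loan for the team.", taxonomy)
print(subset)
```

In this sketch each key vector is a sparse mapping, so keys absent from a category contribute zero to the sum and only shared keys affect the score.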
