System and method for interpreting document contents

  • US 6,772,170 B2
  • Filed: 11/16/2002
  • Issued: 08/03/2004
  • Est. Priority Date: 09/13/1996
  • Status: Expired due to Fees
First Claim

1. A method, comprising the steps of:

    a) semantically filtering a set of documents in a database to extract a set of semantic concepts, to improve an efficiency of a predictive relationship to its content, based on at least one of word frequency, overlap and topicality;

    b) defining a topic set, said topic set being characterized as the set of semantic concepts which best discriminate the content of the documents containing them, said topic set being defined based on at least one of word frequency, overlap and topicality;

    c) forming a matrix with the semantic concepts contained within the topic set defining one dimension of said matrix and the semantic concepts contained within the filtered set of documents comprising another dimension of said matrix;

    d) calculating matrix entries as the conditional probability that a document in the database will contain each semantic concept in the topic set given that it contains each semantic concept in the filtered set of documents; and

    e) providing said matrix entries as vectors to interpret the document contents of said database.
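
Steps c) through e) can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that each document is already reduced to a list of concept strings, that the filter of step a) is a simple document-frequency cutoff, and that the topic set of step b) is picked by hand. The names `filter_concepts`, `conditional_matrix`, `min_df` and `max_df_ratio` are hypothetical and do not come from the claim.

```python
from collections import Counter


def filter_concepts(docs, min_df=2, max_df_ratio=0.9):
    """Assumed stand-in for step a): keep concepts whose document
    frequency is at least min_df and at most max_df_ratio * len(docs)."""
    df = Counter(c for doc in docs for c in set(doc))
    limit = max_df_ratio * len(docs)
    return {c for c, n in df.items() if min_df <= n <= limit}


def conditional_matrix(docs, topics, concepts):
    """Steps c) and d): return {concept: {topic: P(topic | concept)}},
    where P is estimated as
    (documents containing both) / (documents containing the concept)."""
    doc_sets = [set(d) for d in docs]
    matrix = {}
    for c in sorted(concepts):
        containing = [d for d in doc_sets if c in d]
        if not containing:
            continue  # concept never occurs, so the probability is undefined
        matrix[c] = {
            t: sum(t in d for d in containing) / len(containing)
            for t in sorted(topics)
        }
    return matrix


# Toy corpus (invented for illustration only).
docs = [
    ["engine", "car", "wheel"],
    ["engine", "car", "fuel"],
    ["engine", "plane", "wing"],
    ["plane", "wing", "fuel"],
]
concepts = filter_concepts(docs)   # drops "wheel", which occurs only once
topics = {"car", "plane"}          # hand-picked topic set (assumption)
matrix = conditional_matrix(docs, topics, concepts)

# Step e): each row is a vector over the topic set.
for concept, row in matrix.items():
    print(concept, row)
```

In this toy corpus every document containing "wing" also contains "plane", so the "wing" row assigns probability 1.0 to "plane" and 0.0 to "car", while "fuel" is split evenly between the two topics.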
