
Distributed method for integrating data mining and text categorization techniques

  • US 20080097937A1
  • Filed: 09/28/2007
  • Published: 04/24/2008
  • Est. Priority Date: 07/10/2003
  • Status: Abandoned Application
First Claim

1. A method for prediction analysis using text categorization, the method comprising the steps of:

    grouping a plurality of text documents into a plurality of classes;

    selecting a top m most discriminatory terms for each class of documents using statistical based measures;

    determining for each document the presence or absence of each of the discriminatory terms;

    learning rule-based models of each class of documents using a rule learning algorithm;

    determining, for at least a portion of the plurality of documents, if a given learned rule has been satisfied by each respective document;

    creating a database of the rules associated with documents satisfying the rules; and

    performing distributed data mining to form a predictive result based on at least a portion of the plurality of documents.
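
The steps of claim 1 can be illustrated with a minimal sketch. The claim does not name a particular statistical measure, rule learning algorithm, or distribution scheme, so everything below is an assumption chosen for illustration: a chi-square score stands in for the "statistical based measures", single-term rules with a precision threshold stand in for the rule learning algorithm, and summing per-shard counts inside one process stands in for distributed data mining. All function names and the sample documents are hypothetical.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Reduce a document to the set of terms it contains (presence/absence only)."""
    return set(text.lower().split())


def top_terms(docs, labels, m):
    """Per class, pick the m terms with the highest chi-square score (assumed measure)."""
    n = len(docs)
    term_sets = [tokenize(d) for d in docs]
    vocab = set().union(*term_sets)
    result = {}
    for cls in set(labels):
        scores = {}
        for t in vocab:
            a = sum(1 for s, y in zip(term_sets, labels) if t in s and y == cls)
            b = sum(1 for s, y in zip(term_sets, labels) if t in s and y != cls)
            c = sum(1 for s, y in zip(term_sets, labels) if t not in s and y == cls)
            d = n - a - b - c
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            chi2 = n * (a * d - b * c) ** 2 / denom if denom else 0.0
            # Keep only terms positively associated with the class.
            scores[t] = chi2 if a * d > b * c else 0.0
        # Ties are broken alphabetically so the output is deterministic.
        result[cls] = sorted(scores, key=lambda t: (-scores[t], t))[:m]
    return result


def learn_rules(term_sets, labels, terms_by_class, min_precision=1.0):
    """Toy rule learner: 'term present => class' if precision meets the threshold."""
    rules = []
    for cls, terms in terms_by_class.items():
        for t in terms:
            covered = [y for s, y in zip(term_sets, labels) if t in s]
            if covered and covered.count(cls) / len(covered) >= min_precision:
                rules.append((t, cls))
    return sorted(rules)


def build_rule_db(term_sets, rules):
    """Map each rule to the indices of the documents that satisfy it."""
    db = defaultdict(list)
    for i, s in enumerate(term_sets):
        for rule in rules:
            if rule[0] in s:
                db[rule].append(i)
    return dict(db)


def sharded_rule_counts(rule_db, shards):
    """Each shard counts locally how often each rule fired; partial counts are summed."""
    total = Counter()
    for shard in shards:
        members = set(shard)
        total += Counter({r: len(members & set(ids)) for r, ids in rule_db.items()})
    return total


def predict(text, rules):
    """Classify a new document by majority vote over the rules it satisfies."""
    votes = Counter(cls for term, cls in rules if term in tokenize(text))
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical sample data: four documents already grouped into two classes.
docs = ["engine failure oil leak", "engine noise oil",
        "screen cracked battery", "battery drain screen"]
labels = ["mechanical", "mechanical", "electrical", "electrical"]

terms = top_terms(docs, labels, m=2)
term_sets = [tokenize(d) for d in docs]
rules = learn_rules(term_sets, labels, terms)
rule_db = build_rule_db(term_sets, rules)
counts = sharded_rule_counts(rule_db, shards=[[0, 1], [2, 3]])
```

In a real deployment each shard would presumably live on a separate node and only the partial counts would be exchanged; the single-process loop here only shows that the merged result does not depend on where the counting happens.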
