Using rule induction to identify emerging trends in unstructured text streams

  • US 8,712,926 B2
  • Filed: 05/23/2008
  • Issued: 04/29/2014
  • Est. Priority Date: 05/23/2008
  • Status: Active Grant
First Claim

1. A system including a computer processor configured to operate a plurality of modules, said modules comprising:

  • a decision module configured to use a decision tree to classify documents from a set U of documents into categories based on a subset V of U, wherein the subset V comprises documents of U that were written within a specific time period, and the subset V provides an indication of emerging trends in the set U of documents that occur at a higher frequency during the specific time period than outside the specific time period, wherein the decision module utilizes an entropy function that favors splitting the set U into categories, and wherein the decision module creates a separate category for the documents in V and also the documents in U that are not in V;

  • a conversion module configured to convert the decision tree into a logically equivalent rule set, wherein each document of U is guaranteed to only be classified by one rule of the rule set, wherein the rule set is configured as a sortable table;

  • a labeling module configured to label, for each one of the categories based on the subset V, a text event, wherein the labeling module is configured to label the text event with each of a plurality of antecedents including positive and negative antecedents on a path from a leaf node to the root node of the decision tree, wherein each antecedent corresponds to a particular leaf node on the path; and

  • a display module configured to display a list of results based on the text event labels to a user.
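
The modules recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes documents are reduced to sets of terms, uses plain Shannon information gain in place of the claim's unspecified entropy function, and all names (`build_tree`, `tree_to_rules`, `matching_rules`, the `"V"` and `"not_V"` labels, the sample documents) are hypothetical. A `+term` antecedent means the term is present and a `-term` antecedent means it is absent.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def build_tree(docs, labels, terms, min_gain=1e-9):
    """Grow a binary tree; each internal node tests the presence of one term."""
    base, best = entropy(labels), None
    for t in terms:
        yes = [l for d, l in zip(docs, labels) if t in d]
        no = [l for d, l in zip(docs, labels) if t not in d]
        if not yes or not no:
            continue  # term does not split this node
        gain = base - (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(labels)
        if best is None or gain > best[0]:
            best = (gain, t)
    if best is None or best[0] < min_gain:
        # Leaf: majority category plus the number of documents it covers
        return {"label": Counter(labels).most_common(1)[0][0], "n": len(labels)}
    t = best[1]
    rest = [x for x in terms if x != t]
    yes_i = [i for i, d in enumerate(docs) if t in d]
    no_i = [i for i, d in enumerate(docs) if t not in d]
    return {
        "term": t,
        "yes": build_tree([docs[i] for i in yes_i], [labels[i] for i in yes_i], rest, min_gain),
        "no": build_tree([docs[i] for i in no_i], [labels[i] for i in no_i], rest, min_gain),
    }

def tree_to_rules(node, path=()):
    """One rule per leaf; its antecedents are the tests on the root-to-leaf path."""
    if "label" in node:
        return [{"antecedents": list(path), "category": node["label"], "support": node["n"]}]
    return (tree_to_rules(node["yes"], path + ("+" + node["term"],))
            + tree_to_rules(node["no"], path + ("-" + node["term"],)))

def matching_rules(doc, rules):
    """Rules whose every antecedent holds for the document."""
    return [r for r in rules
            if all((a[1:] in doc) == (a[0] == "+") for a in r["antecedents"])]

# Hypothetical data: the first two documents fall inside the time window (subset V).
docs = [
    {"battery", "swelling"},
    {"battery", "swelling", "heat"},
    {"battery", "swelling", "case"},
    {"battery", "charger"},
    {"screen", "crack"},
    {"screen", "case"},
]
labels = ["V", "V", "not_V", "not_V", "not_V", "not_V"]
terms = sorted(set().union(*docs))

rules = tree_to_rules(build_tree(docs, labels, terms))

# "Sortable table": order the rules by category, then by descending support.
for r in sorted(rules, key=lambda r: (r["category"] != "V", -r["support"])):
    print(r["category"], r["support"], " AND ".join(r["antecedents"]))
```

Because the rules are read off the leaves of a single tree, their antecedent sets are mutually exclusive and exhaustive, so each document matches exactly one rule. In this toy data the only rule for the V category is `+swelling AND -case`, which combines a positive and a negative antecedent in the way the labeling module describes.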
