
Systems and methods for automatically categorizing unstructured text

  • US 7,853,544 B2
  • Filed: 11/23/2005
  • Issued: 12/14/2010
  • Est. Priority Date: 11/24/2004
  • Status: Active Grant
First Claim

1. A computer implemented method for identifying a set of categories for unstructured text messages and training an automated classifier therefor, the method comprising:

    from a stream of the unstructured text messages captured in computer readable form, selecting a subset thereof for presentation to a user as an exploration set, the subset selected from the stream by a programmed computer, wherein the selection of the exploration set is in a generally random manner though in accord with one or more set delimiting criteria provided by the user;

    via a display of the programmed computer, providing the user with both (i) a reviewable presentation of each unstructured text message selected for presentation as part of the exploration set and (ii) a flag definition and assignment interface, whereby the user defines categories for the unstructured text messages and flags at least one message of the exploration set as associated with each of the categories so defined;

    via the display of the programmed computer, providing the user with a reviewable presentation of a training subset of the unstructured text messages, wherein each of the unstructured text messages of the training subset is presented together with a category selection interface whereby the user accumulates, for each of at least a subset of the categories, a respective pool of training instances from the training subset for use in training an automated classifier; and

    training an automated classifier to classify individual ones of the unstructured text messages using the training subset.
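
The steps recited in claim 1 can be illustrated with a short sketch. This is not the patented implementation: the function and class names (`select_exploration_set`, `NaiveBayesSketch`), the whitespace tokenizer, and the choice of a multinomial naive Bayes classifier with add-one smoothing are all assumptions made for illustration, since the claim does not name a particular classifier. The interactive display steps are stood in for by a plain dictionary that maps each user-defined category to its pool of training messages.

```python
import math
import random
from collections import Counter, defaultdict


def select_exploration_set(stream, criteria, k, seed=None):
    """Randomly pick up to k messages that satisfy every user-supplied criterion.

    `criteria` is a list of predicates (hypothetical stand-ins for the
    "set delimiting criteria" of the claim).
    """
    rng = random.Random(seed)
    eligible = [m for m in stream if all(check(m) for check in criteria)]
    return rng.sample(eligible, min(k, len(eligible)))


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is good enough for a sketch.
    return text.lower().split()


class NaiveBayesSketch:
    """Multinomial naive Bayes with add-one smoothing (an assumed classifier)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> token counts
        self.doc_counts = Counter()              # category -> number of messages
        self.vocab = set()

    def train(self, pools):
        # `pools` maps each category to the messages the user assigned to it.
        for category, messages in pools.items():
            for message in messages:
                self.doc_counts[category] += 1
                for token in tokenize(message):
                    self.word_counts[category][token] += 1
                    self.vocab.add(token)

    def classify(self, message):
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            denominator = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for token in tokenize(message):
                score += math.log((counts[token] + 1) / denominator)
            if score > best_score:
                best_category, best_score = category, score
        return best_category
```

In use, a caller would first sample an exploration set from the message stream, let the user define categories while reviewing it, collect per-category pools from a training subset, and then pass those pools to `train` before calling `classify` on new messages.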
