
Classification rule generation device, classification rule generation method, classification rule generation program, and recording medium

  • US 9,323,839 B2
  • Filed: 01/13/2011
  • Issued: 04/26/2016
  • Est. Priority Date: 01/13/2011
  • Status: Active Grant
First Claim

1. A classification rule generation device comprising:

    an input circuit that inputs a document as a sample target document;

    a storage circuit that stores extraction conditions for extracting partial text which is a portion of the sample target document and which is used for generating classification rules for classifying a classification target document to be classified into one of classification categories, the partial text being extracted from the sample target document according to the classification categories, the extraction conditions being set for each of the classification categories;

    a matching circuit that matches the sample target document input by the input circuit against the extraction conditions stored in the storage circuit;

    an extraction circuit that performs partial text extraction to extract the partial text from the sample target document according to the classification categories, based on a result of matching by the matching circuit; and

    a learning circuit that, when the partial text corresponding to one of the classification categories is extracted by the partial text extraction by the extraction circuit, performs predetermined machine learning using the partial text extracted, and generates the classification rules,

    wherein the extraction conditions set for each of the classification categories include a keyword that corresponds to each of the classification categories,

    the matching circuit includes a position identification circuit that identifies an existing position of the keyword for each of the classification categories in the sample target document,

    the extraction circuit extracts a portion around and including the keyword as the partial text from the sample target document, based on the existing position of the keyword identified by the position identification circuit,

    the extraction conditions set for each of the classification categories are set such that type information indicating a type of the keyword is set for at least one of the keywords, and

    the extraction circuit, when extracting the partial text corresponding to each of the classification categories from the sample target document, performs the partial text extraction based on the type information indicated by the keyword identified by the position identification circuit.
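The flow recited in claim 1 (stored per-category extraction conditions, keyword position identification, type-dependent extraction of text around the keyword, then learning from the extracted text) can be illustrated with a minimal software sketch. Everything concrete here is an assumption made for illustration and does not come from the patent: the category names, the keyword types "inline" and "heading", the character windows in WINDOWS, and the token-count learner that stands in for the claim's unspecified "predetermined machine learning".

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Keyword:
    text: str
    kind: str = "inline"  # hypothetical "type information" for the keyword


# Hypothetical extraction conditions, set for each classification category.
EXTRACTION_CONDITIONS = {
    "billing": [Keyword("invoice", "inline"), Keyword("Payment:", "heading")],
    "support": [Keyword("error", "inline")],
}

# Hypothetical rule: keyword type -> (characters kept before, characters kept after).
WINDOWS = {"inline": (20, 20), "heading": (0, 40)}


def find_positions(document, keyword):
    """Position identification: return every start offset of the keyword."""
    positions = []
    start = document.find(keyword.text)
    while start != -1:
        positions.append(start)
        start = document.find(keyword.text, start + 1)
    return positions


def extract_partial_texts(document, conditions=EXTRACTION_CONDITIONS):
    """Matching and extraction: map each matched category to its snippets."""
    result = defaultdict(list)
    for category, keywords in conditions.items():
        for kw in keywords:
            before, after = WINDOWS[kw.kind]  # extraction depends on keyword type
            for pos in find_positions(document, kw):
                lo = max(0, pos - before)
                hi = min(len(document), pos + len(kw.text) + after)
                result[category].append(document[lo:hi])
    return dict(result)


def learn_rules(sample_documents):
    """Stand-in learner: per-category token counts built only from partial text."""
    rules = defaultdict(Counter)
    for document in sample_documents:
        for category, snippets in extract_partial_texts(document).items():
            for snippet in snippets:
                rules[category].update(snippet.lower().split())
    return dict(rules)


def classify(document, rules):
    """Apply the learned rules: pick the category with the highest token score."""
    tokens = document.lower().split()
    scores = {cat: sum(counts[t] for t in tokens) for cat, counts in rules.items()}
    return max(scores, key=scores.get)
```

In this sketch a category contributes to learning only when at least one of its keywords is found in the sample document, which mirrors the claim's condition that learning runs when partial text for a category has been extracted. A real system would replace the token counts with an actual learning algorithm.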
