Semi-supervised data integration model for named entity classification

  • US 9,292,797 B2
  • Filed: 12/14/2012
  • Issued: 03/22/2016
  • Est. Priority Date: 12/14/2012
  • Status: Active Grant
First Claim

1. A method for providing a semi-supervised data integration model for named entity classification from a first repository of entity information in view of an auxiliary repository of classification assistance data, the method comprising:

    comparing training data to named entity candidates taken from the first repository, thereby forming a positive training seed set in view of identified commonality between the training data and the named entity candidates;

    in view of the positive training seed set, populating a decision tree;

    in view of populating the decision tree, creating classification rules for classifying the named entity candidates;

    sampling a number of entities from the named entity candidates;

    in view of the classification rules, labeling the sampled entities as positive examples and/or negative examples;

    in view of the positive examples and the auxiliary repository, updating the positive training seed set to include identified commonality between the positive examples and the auxiliary repository;

    in view of the negative examples and the auxiliary repository, updating a negative training seed set to include negative examples which lack commonality with the auxiliary repository; and

    in view of both the updated positive and negative training seed sets, updating the decision tree and the classification rules.
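
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the decision tree is reduced to a one-level tree (a single token test), "commonality" is assumed to mean exact name match, and every name in the code (`features`, `fit_stump`, `integrate`, the sample entities) is hypothetical and chosen only for illustration.

```python
import random


def features(name):
    """Hypothetical feature extractor: lowercase tokens of the entity name."""
    return set(name.lower().split())


def fit_stump(positives, negatives, candidates):
    """Populate a one-level stand-in for the decision tree.

    Picks the token that best separates the positive seeds from a contrast
    set: the negative seeds if any exist, otherwise the unlabeled candidates.
    The returned token is the classification rule ("has this token").
    """
    if not positives:
        raise ValueError("positive seed set is empty")
    contrast = negatives if negatives else candidates - positives
    vocab = sorted(set().union(*(features(n) for n in positives)))

    def score(token):
        pos_hits = sum(token in features(n) for n in positives)
        neg_hits = sum(token in features(n) for n in contrast)
        return pos_hits - neg_hits

    return max(vocab, key=score)


def integrate(candidates, training_data, auxiliary, sample_size, rounds=1, seed=0):
    """Run the seed, classify, and update loop over the candidate entities."""
    rng = random.Random(seed)
    candidates = set(candidates)
    auxiliary = set(auxiliary)

    # Positive seed set: commonality between training data and candidates.
    positive_seeds = candidates & set(training_data)
    negative_seeds = set()

    # Populate the tree and derive the classification rule.
    rule = fit_stump(positive_seeds, negative_seeds, candidates)

    for _ in range(rounds):
        # Sample a number of entities from the candidates.
        k = min(sample_size, len(candidates))
        sampled = rng.sample(sorted(candidates), k)

        # Label the sampled entities with the current rule.
        labeled_pos = {n for n in sampled if rule in features(n)}
        labeled_neg = set(sampled) - labeled_pos

        # Positives confirmed by the auxiliary repository join the positive seeds.
        positive_seeds |= labeled_pos & auxiliary
        # Negatives absent from the auxiliary repository join the negative seeds.
        negative_seeds |= labeled_neg - auxiliary

        # Update the tree and rule from both seed sets.
        rule = fit_stump(positive_seeds, negative_seeds, candidates)

    return rule, positive_seeds, negative_seeds


if __name__ == "__main__":
    rule, pos, neg = integrate(
        candidates=["Acme Hospital", "Mercy Hospital", "Lakeside Hospital",
                    "City Hospital", "Acme Bakery", "City Park"],
        training_data=["Acme Hospital", "Mercy Hospital", "Lakeside Hospital"],
        auxiliary=["City Hospital", "Mercy Hospital"],
        sample_size=6,
    )
    print(rule, sorted(pos), sorted(neg))
```

In this toy run the auxiliary repository confirms "City Hospital", so it joins the positive seed set even though it was absent from the training data, while the two non-hospital entities become negative seeds. A fuller version would swap in a real decision tree learner and a fuzzier commonality test.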
