
Classification-Based Redaction in Natural Language Text

  • US 20120239380A1
  • Filed: 03/15/2011
  • Published: 09/20/2012
  • Est. Priority Date: 03/15/2011
  • Status: Active Grant
First Claim

1. A method for redacting natural language text comprising a plurality of features, the method comprising:

    providing, by a processing device, a sensitive concepts model according to a classification algorithm operating upon the plurality of features, wherein sensitive concepts are classes used by the classification algorithm when providing the sensitive concepts model;

    providing, by the processing device, a utility concepts model according to the classification algorithm operating upon the plurality of features, wherein utility concepts are classes used by the classification algorithm when providing the utility concepts model;

    for at least one identified sensitive concept and at least one identified utility concept, and based on the sensitive concepts model and the utility concepts model, identifying, by the processing device, at least one feature in the natural language text that implicates the at least one identified sensitive topic more than the at least one identified utility concept to provide identified features; and

    perturbing, by the processing device, at least some of the identified features in at least a portion of the natural language text.
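
The four steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not name a classification algorithm, so the sketch assumes smoothed per-class word likelihoods (naive Bayes style), treats lowercase words as the features, and uses masking as one possible form of perturbing. All function names, the toy training sentences, and the class labels are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word features (an assumed feature choice)."""
    return re.findall(r"[a-z']+", text.lower())


def train_model(labeled_docs):
    """Build {class: {word: P(word | class)}} with add-one smoothing."""
    counts = {}
    for label, text in labeled_docs:
        counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = set().union(*counts.values())
    model = {}
    for label, c in counts.items():
        total = sum(c.values()) + len(vocab)
        model[label] = {w: (c[w] + 1) / total for w in vocab}
    return model


def implication(model, concept, word):
    """Log-ratio of P(word | concept) to the mean over the model's other classes.

    Returns 0.0 for words the model has never seen or when the model has
    no other class to compare against.
    """
    others = [p[word] for label, p in model.items()
              if label != concept and word in p]
    if word not in model[concept] or not others:
        return 0.0
    return math.log(model[concept][word]) - math.log(sum(others) / len(others))


def redact(text, sens_model, sens_concept, util_model, util_concept,
           mask="[REDACTED]"):
    """Mask words that implicate the sensitive concept more than the utility one."""
    def replace(match):
        word = match.group(0).lower()
        s = implication(sens_model, sens_concept, word)
        u = implication(util_model, util_concept, word)
        return mask if s > 0 and s > u else match.group(0)
    return re.sub(r"[A-Za-z']+", replace, text)


# Toy training data, invented for illustration only.
sensitive_model = train_model([
    ("hiv", "patient tested positive for hiv and started antiretroviral therapy"),
    ("other", "patient reported mild headache and was given rest"),
])
utility_model = train_model([
    ("cardiology", "patient has high blood pressure and chest pain"),
    ("other", "the weather was sunny and the meeting ran long"),
])

print(redact("Patient with HIV reports chest pain.",
             sensitive_model, "hiv", utility_model, "cardiology"))
# prints: Patient with [REDACTED] reports chest pain.
```

In this sketch a word is perturbed only when its sensitive score is positive and exceeds its utility score, so words that matter for the utility concept ("chest", "pain") survive while "HIV" is masked. Other perturbations, such as generalizing a term rather than removing it, would fit the same structure.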
