Language independent probabilistic content matching

  • US 9,633,001 B2
  • Filed: 06/23/2015
  • Issued: 04/25/2017
  • Est. Priority Date: 02/07/2012
  • Status: Active Grant
First Claim

1. A computing system comprising:

  • at least one processor; and

    memory storing instructions executable by the at least one processor, wherein the instructions configure the computing system to:

    access a rule that defines patterns that are used to identify content as sensitive content, the rule defining a segmented pattern to be matched to textual content written in a segmented language, and corroborating data associated with the segmented pattern, and an un-segmented pattern to be matched to textual content written in an un-segmented language, and corroborating data associated with the un-segmented pattern;

    identify an electronic source document having electronic document content;

    determine whether the electronic document content is sensitive content by matching the electronic document content against the patterns in the rule and generating a confidence score corresponding to the determination as to whether the electronic document content is sensitive content, wherein generation of the confidence score is based on whether the electronic document content matched the segmented pattern or the un-segmented pattern, and based on the corroborating data associated with the matched pattern, the generation of the confidence score being regardless of a language of the electronic document content;

    identify a data dissemination policy based on the determination as to whether the electronic document content is sensitive content and the corresponding confidence score; and

    automatically process the electronic document by identifying an action defined by the data dissemination policy and automatically performing the identified action to control electronic dissemination of the electronic document content over a computer network by at least one of:

    automatically blocking the document content from being sent to a potential recipient;

    automatically displaying a message indicating that the document content contains sensitive material and that the document content will be blocked from being sent to a potential recipient;

    or automatically displaying a message indicating that the document content contains sensitive material and instructing the user how to proceed based on the data dissemination policy.
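
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The class names, regular expressions, keyword lists, confidence values, thresholds, and action labels below are all hypothetical and chosen only for illustration. The sketch assumes that a segmented pattern relies on word boundaries, that an un-segmented pattern does not, and that corroborating data is a list of nearby keywords. Both patterns are tried on every document without detecting its language.

```python
import re
from dataclasses import dataclass


@dataclass
class Pattern:
    regex: str                # hypothetical primary pattern
    corroborating: list[str]  # hypothetical corroborating keywords
    base: float               # confidence when only the pattern matches
    boosted: float            # confidence when corroborating data is also found


@dataclass
class Rule:
    segmented: Pattern    # for text with delimited words (e.g. English)
    unsegmented: Pattern  # for text without word delimiters (e.g. Japanese)


# Illustrative rule: a 16-digit number in groups of four.
RULE = Rule(
    segmented=Pattern(r"\b\d{4}-\d{4}-\d{4}-\d{4}\b", ["credit card"], 0.65, 0.9),
    unsegmented=Pattern(r"\d{4}-\d{4}-\d{4}-\d{4}", ["クレジットカード"], 0.5, 0.85),
)


def confidence(rule: Rule, text: str) -> float:
    """Try both patterns, with no language detection, and keep the best score."""
    best = 0.0
    for pattern in (rule.segmented, rule.unsegmented):
        if re.search(pattern.regex, text):
            corroborated = any(k in text for k in pattern.corroborating)
            best = max(best, pattern.boosted if corroborated else pattern.base)
    return best


def policy_action(score: float) -> str:
    """Map a confidence score to a dissemination action (thresholds are assumptions)."""
    if score >= 0.85:
        return "block_with_notice"  # block and tell the user why
    if score >= 0.6:
        return "advise_user"        # show a message on how to proceed
    return "allow"


if __name__ == "__main__":
    samples = [
        "My credit card number is 1234-5678-9012-3456.",
        "クレジットカード番号は1234-5678-9012-3456です",
        "Order 1234-5678-9012-3456 shipped",
        "Nothing sensitive here",
    ]
    for text in samples:
        score = confidence(RULE, text)
        print(f"{score:.2f} {policy_action(score)}: {text}")
```

In the Japanese sample, the digits sit directly against word characters, so the word-boundary pattern does not match and only the un-segmented pattern contributes to the score. This is the case the two-pattern rule is meant to cover.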
