Detecting policy violations in information content containing data in a character-based language
Abstract
A method and apparatus for detecting policy violations in information content containing data in a character-based language is described. In one embodiment, the method includes identifying a policy for protecting source data having a tabular format. The source data contains one or more data fragments in the character-based language. The method further includes receiving information content having at least a portion in the character-based language, and determining whether any part of the information content, including the portion in the character-based language, violates the policy.
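The detection flow summarized above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the names `sig`, `build_index`, and `violates`, the use of SHA-256 as the signature, the "first character plus token length" boundary hint, and the `min_columns` policy threshold are all hypothetical choices made for the example. For brevity the boundary hint is recorded for every token, not only those in the character-based language.

```python
import hashlib
from collections import defaultdict


def sig(token: str) -> str:
    """Signature of a token (assumption: a SHA-256 hex digest)."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def build_index(rows):
    """Index tabular source data by token signature.

    Returns (index, lengths):
      index   maps signature -> set of (row, column) positions
      lengths maps a token's first character -> set of token lengths,
              used to guess token boundaries in text without spaces
    """
    index = defaultdict(set)
    lengths = defaultdict(set)
    for r, row in enumerate(rows):
        for c, token in enumerate(row):
            if not token:
                continue  # skip empty cells
            index[sig(token)].add((r, c))
            lengths[token[0]].add(len(token))
    return index, lengths


def violates(content, index, lengths, min_columns=2):
    """Return True if content contains at least `min_columns` distinct
    columns taken from the same row of the protected table."""
    matched = defaultdict(set)  # row -> set of matched columns
    for i, ch in enumerate(content):
        # Only try substrings that start with a known first character
        # and have a known token length.
        for n in lengths.get(ch, ()):
            candidate = content[i:i + n]
            for r, c in index.get(sig(candidate), ()):
                matched[r].add(c)
    return any(len(cols) >= min_columns for cols in matched.values())


rows = [["張偉", "北京市朝陽區"], ["李娜", "上海市浦東新區"]]
index, lengths = build_index(rows)
leak = violates("請將文件寄給張偉地址是北京市朝陽區謝謝", index, lengths)
mixed = violates("張偉今天去了上海市浦東新區", index, lengths)
```

In this sketch `leak` is true because two columns of the same row appear in the content, while `mixed` is false because the two matched tokens come from different rows.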
25 Claims
1. A computer-implemented method comprising:
identifying a policy for protecting source data having a tabular format, the source data containing one or more data fragments in a character-based language, wherein the character-based language does not provide visual delimiters between words in the one or more data fragments;
receiving information content having at least a first portion in the character-based language; and
determining whether any part of the information content, including the first portion in the character-based language, violates the policy.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A computer-implemented method comprising:
identifying source data to be protected, the source data having a tabular format, the source data comprising one or more fragments in a character-based language, wherein the character-based language does not provide visual delimiters between words in the one or more fragments;
creating an abstract data structure from the source data, wherein creating the abstract data structure comprises:
  identifying a plurality of tokens within the source data based on the tabular format,
  creating a signature for each of the plurality of tokens, and
  for each token in the character-based language, creating a token delimiter indicator; and
providing the abstract data structure to a data monitoring system to detect violations of one or more policies in information content containing data in the character-based language, the one or more policies being defined to protect the source data.
Dependent claims: 12, 13, 14
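The index-building steps of claim 11 can be illustrated with a short sketch. The claim does not say what a signature or a token delimiter indicator contains, so everything concrete here is an assumption: `signature` is a truncated SHA-256 digest, the delimiter indicator is taken to be the token's first character together with its length, and `is_character_based` is a hypothetical check that looks only for CJK ideographs and kana.

```python
import hashlib


def is_character_based(token: str) -> bool:
    """Assumption: a token counts as character-based if it contains
    a CJK unified ideograph or a Japanese kana character."""
    return any(
        "\u4e00" <= ch <= "\u9fff" or "\u3040" <= ch <= "\u30ff"
        for ch in token
    )


def signature(token: str) -> str:
    """Assumption: the signature is a truncated SHA-256 hex digest."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()[:16]


def build_abstract_structure(rows):
    """Build a signature index and delimiter indicators from tabular data.

    Tokens are identified from the tabular format: one token per cell.
    Returns (index, delimiters):
      index      maps signature -> set of (row, column) positions
      delimiters maps first character -> set of token lengths, recorded
                 only for tokens in the character-based language
    """
    index = {}
    delimiters = {}
    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            token = cell.strip()
            if not token:
                continue  # skip empty cells
            index.setdefault(signature(token), set()).add((r, c))
            if is_character_based(token):
                delimiters.setdefault(token[0], set()).add(len(token))
    return index, delimiters


rows = [["張偉", "北京市", "A123"], ["李娜", "上海市", "B456"]]
index, delimiters = build_abstract_structure(rows)
```

Storing signatures in place of the raw cell values means the structure handed to the monitoring system does not contain the protected data itself. This sketch keeps the first character in plain form for readability; a real design would likely protect that as well.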
15. A system comprising:
a natural language identifier to receive information content having at least a first portion in a character-based language; and
a data monitoring component, coupled with the natural language identifier, to identify a policy for protecting source data having a tabular format, the source data containing one or more data fragments in the character-based language, wherein the character-based language does not provide visual delimiters between words in the one or more data fragments, and to determine whether any part of the information content, including the first portion in the character-based language, violates the policy.
Dependent claims: 16, 17, 18, 19
20. A non-transitory computer readable storage medium that provides instructions, which when executed on a processing system cause the processing system to perform a method comprising:
identifying a policy for protecting source data having a tabular format, the source data containing one or more data fragments in a character-based language, wherein the character-based language does not provide visual delimiters between words in the one or more data fragments;
receiving information content having at least a first portion in the character-based language; and
determining whether any part of the information content, including the first portion in the character-based language, violates the policy.
Dependent claims: 21, 22, 23, 24, 25
Specification