Method and apparatus for detecting policy violations in a data repository having an arbitrary data schema
2 Assignments
0 Petitions
Abstract
A method and apparatus for scanning structured data from a data repository having an arbitrary data schema and for applying a policy to the data of the data repository are described. In one embodiment, the structured data is converted to unstructured text data to allow a schema-independent policy to be applied to the text data in order to detect a policy violation in the data repository regardless of the data schema used by the data repository.
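The conversion-then-policy idea in the abstract can be illustrated with a minimal sketch. It assumes records arrive as dictionaries with arbitrary field names and that the policy is a set of regular expressions; the names `POLICY_PATTERNS`, `rows_to_text`, and `find_violations` are hypothetical and do not come from the patent, which does not prescribe any particular implementation.

```python
import re

# Hypothetical policy rules: patterns that suggest confidential information.
# These regexes are illustrative assumptions, not taken from the patent.
POLICY_PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}


def rows_to_text(rows):
    """Flatten records into plain text, discarding field names and structure."""
    return "\n".join(
        " ".join(str(value) for value in row.values() if value is not None)
        for row in rows
    )


def find_violations(text):
    """Return the sorted names of the policy rules that match the text."""
    return sorted(
        name for name, pattern in POLICY_PATTERNS.items() if pattern.search(text)
    )


# Two records with unrelated, unknown field names.
rows = [
    {"c1": "Alice", "c2": "123-45-6789"},
    {"name": "Bob", "note": None, "x": 42},
]
text = rows_to_text(rows)
violations = find_violations(text)
```

Because the policy only ever sees text, the same rules apply whether a value came from a field called `c2`, `ssn`, or anything else.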
169 Citations
18 Claims
1. A computer-implemented method, comprising:
identifying, by a computer system, structured data stored in a data repository having an arbitrary data schema, the arbitrary data schema of the data repository being unknown;
scanning the structured data of the data repository;
converting the structured data into text data; and
applying a schema-independent policy to the converted text data to detect confidential information stored in the data repository regardless of the arbitrary data schema of the structured data, wherein the schema-independent policy is a data loss prevention policy, and wherein a policy violation is triggered when the converted text data contains confidential information.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. An apparatus, comprising:
a data repository having an arbitrary data schema; and
a scan agent, coupled to the data repository, to access data stored in the data repository, to derive the arbitrary data schema of the data repository, the arbitrary data schema of the data repository being unknown, to scan the data of the data repository, and to convert the data of the data repository into text data to allow application of a schema-independent policy to the converted text data of the data repository to detect confidential data stored in the data repository regardless of the arbitrary data schema of the structured data, wherein the schema-independent policy is a data loss prevention policy, and wherein a policy violation is triggered when the converted text data contains confidential information.
Dependent claims: 15, 16
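As an illustration of a scan agent that derives an unknown schema before scanning, the following sketch uses an in-memory SQLite database as a stand-in for the data repository. The choice of SQLite, the helper names `derive_schema` and `scan_to_text`, and the single regular expression used as the policy are all assumptions made for this example; the claim itself does not name a database, an API, or a detection technique.

```python
import re
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier so arbitrary table names are handled safely."""
    return '"' + name.replace('"', '""') + '"'


def derive_schema(conn):
    """Discover table and column names from the database catalog."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        info = conn.execute(f"PRAGMA table_info({quote_ident(table)})")
        schema[table] = [row[1] for row in info]
    return schema


def scan_to_text(conn, schema):
    """Read every row of every discovered table and emit it as a line of text."""
    lines = []
    for table in schema:
        for row in conn.execute(f"SELECT * FROM {quote_ident(table)}"):
            lines.append(" ".join(str(value) for value in row if value is not None))
    return "\n".join(lines)


# Stand-in repository whose table and column names the agent does not know in advance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_42 (a TEXT, b TEXT)")
conn.execute("INSERT INTO t_42 VALUES ('Carol', '987-65-4321')")

schema = derive_schema(conn)
text = scan_to_text(conn, schema)

# Hypothetical schema-independent rule: a US Social Security number pattern.
violation = bool(re.search(r"\b\d{3}-\d{2}-\d{4}\b", text))
```

The derived schema is used only to enumerate what to read; the policy check itself never refers to table or column names.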
17. A computer-readable storage medium having instructions stored thereon that when executed by a computer cause the computer to perform a method, comprising:
identifying, by a computer system, structured data stored in a data repository having an arbitrary data schema, the arbitrary data schema of the data repository being unknown;
scanning the structured data of the data repository;
converting the structured data into text data; and
applying a schema-independent policy to the converted text data to detect confidential information stored in the data repository regardless of the arbitrary data schema of the structured data, wherein the schema-independent policy is a data loss prevention policy, and wherein a policy violation is triggered when the converted text data contains confidential data.
Dependent claims: 18
Specification