Method and apparatus for autonomic discovery of sensitive content
First Claim
1. A method, operative at or in association with an endpoint in a data loss prevention (DLP) system, the DLP system executing at least in part on hardware and operative to perform scans of resources in a file system associated with the endpoint to search for sensitive content, the scans including an ordered set of scans that include a last scan followed by a next scan, comprising:
following the last scan:
obtaining information identifying an identity of a resource being accessed; and
updating a statistical model of resource access and usage based on the obtained information, the statistical model including, with respect to a resource, a count of resource accesses since the last scan; and
prior to initiating the next scan, prioritizing resources for further scanning for sensitive content based at least in part on resource access counts in the statistical model and one or more content sensitivity classifications associated with one or more resources.
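The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the `AccessModel` class, the `prioritize` function, the integer sensitivity levels, and the additive scoring formula with its `weight` parameter are all hypothetical choices made here for illustration. The claim only requires that prioritization be based at least in part on access counts since the last scan and on content sensitivity classifications.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AccessModel:
    """Hypothetical statistical model: per-resource access counts since the last scan."""
    counts: Counter = field(default_factory=Counter)

    def record_access(self, resource_id: str) -> None:
        # Called with the identity of each resource observed being accessed.
        self.counts[resource_id] += 1

    def reset(self) -> None:
        # Called once a scan completes, so counts again mean "since the last scan".
        self.counts.clear()


def prioritize(model: AccessModel, sensitivity: dict, weight: float = 1.0) -> list:
    """Order resources for the next scan, highest priority first.

    Assumed scoring (not from the source): access count plus
    weight times the resource's sensitivity level (0 if unclassified).
    """
    resources = set(model.counts) | set(sensitivity)

    def score(resource_id: str) -> float:
        return model.counts.get(resource_id, 0) + weight * sensitivity.get(resource_id, 0)

    # Ties are broken by resource id so the ordering is deterministic.
    return sorted(resources, key=lambda r: (-score(r), r))


model = AccessModel()
for rid in ["a", "a", "a", "b"]:
    model.record_access(rid)
scan_order = prioritize(model, {"b": 5, "c": 1})
```

In this example `b` is accessed once but carries a high sensitivity level, so it is placed ahead of the more frequently accessed `a`; a different weighting would shift that balance.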
5 Assignments
0 Petitions
Abstract
A data loss prevention (DLP) system provides a policy-based mechanism for managing how data is discovered and classified on an endpoint workstation, file server or other device within an enterprise. The technique described herein works in an automated manner by analyzing file system activity as one or more endpoint applications interact with a file system, building a statistical model of which areas of the file system are (or will be deemed to be) active or highly active. Using this information, scanning of those areas by the DLP software is then prioritized to focus compute resources on scanning and classifying preferably only those files and folders that need to be scanned, i.e., the file system portions in which the user concentrates the majority of his or her activity. As a result, the technique limits scanning to those areas that have meaningful activity, conserving compute resources that would otherwise be spent on files or folders that have not changed and thereby improving scanning efficiency.
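As a rough illustration of how file system activity might be aggregated into a model of active areas, the sketch below counts accesses per parent folder and keeps the folders that meet a threshold. The function names, the use of POSIX-style paths, and the `min_accesses` threshold are assumptions made for this example; the abstract does not specify how "active or highly active" is determined.

```python
from collections import Counter
from pathlib import PurePosixPath


def folder_activity(accessed_paths):
    """Count accesses per parent folder from a sequence of accessed file paths."""
    activity = Counter()
    for path in accessed_paths:
        activity[str(PurePosixPath(path).parent)] += 1
    return activity


def active_folders(activity, min_accesses=2):
    """Folders meeting an assumed activity threshold, most active first."""
    hot = [(folder, n) for folder, n in activity.items() if n >= min_accesses]
    # Sort by descending count, then by folder name for a stable order.
    return [folder for folder, _ in sorted(hot, key=lambda item: (-item[1], item[0]))]


events = [
    "/home/u/docs/a.txt",
    "/home/u/docs/b.txt",
    "/home/u/docs/a.txt",
    "/tmp/x.log",
    "/home/u/src/m.py",
    "/home/u/src/n.py",
]
activity = folder_activity(events)
scan_targets = active_folders(activity)
```

Here `/tmp` is accessed only once and falls below the assumed threshold, so it is left out of the scan targets while the two busier folders are kept.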
20 Citations
24 Claims
1. A method, operative at or in association with an endpoint in a data loss prevention (DLP) system, the DLP system executing at least in part on hardware and operative to perform scans of resources in a file system associated with the endpoint to search for sensitive content, the scans including an ordered set of scans that include a last scan followed by a next scan, comprising:
following the last scan:
obtaining information identifying an identity of a resource being accessed; and
updating a statistical model of resource access and usage based on the obtained information, the statistical model including, with respect to a resource, a count of resource accesses since the last scan; and
prior to initiating the next scan, prioritizing resources for further scanning for sensitive content based at least in part on resource access counts in the statistical model and one or more content sensitivity classifications associated with one or more resources.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. An apparatus, operative at or in association with an endpoint in a data loss prevention (DLP) system, the DLP system operative to perform scans of resources in a file system associated with the endpoint to search for sensitive content, the scans including an ordered set of scans that include a last scan followed by a next scan, comprising:
a processor;
computer memory holding computer program instructions executed by the processor and operative:
following the last scan:
to obtain information identifying an identity of a resource being accessed; and
to update a statistical model of resource access and usage based on the obtained information, the statistical model including, with respect to a resource, a count of resource accesses since the last scan; and
prior to initiating the next scan, to prioritize resources for further scanning for sensitive content based at least in part on resource access counts in the statistical model and one or more content sensitivity classifications associated with one or more resources.
Dependent claims: 12, 13, 14, 15, 16, 17
18. A non-transitory computer readable storage medium comprising a computer program product for use in a data processing system operative at or in association with an endpoint in a data loss prevention (DLP) system, the DLP system operative to perform scans of resources in a file system associated with the endpoint to search for sensitive content, the scans including an ordered set of scans that include a last scan followed by a next scan, the computer program product holding computer program instructions which, when executed by the data processing system, perform a method comprising:
following the last scan:
obtaining information identifying an identity of a resource being accessed; and
updating a statistical model of resource access and usage based on the obtained information, the statistical model including, with respect to a resource, a count of resource accesses since the last scan; and
prior to initiating the next scan, prioritizing resources for further scanning for sensitive content based at least in part on resource access counts in the statistical model and one or more content sensitivity classifications associated with one or more resources.
Dependent claims: 19, 20, 21, 22, 23, 24
Specification