Method for managing mainframe overhead during detection of sensitive information, computer readable storage media and system utilizing same
First Claim
1. A method for controlling resource usage in the analysis of data including sensitive information and arranged in a plurality of records included in a plurality of data sets on a mainframe system, comprising:
receiving an indication of names of data sets for analysis and a limitations relating to the analysis that includes a maximum number of threads concurrently running on the mainframe system;
querying the mainframe system for a plurality of the named data sets to create a plurality of respective threads;
determining the number of threads concurrently running on the mainframe system;
querying the mainframe system for an additional named data set if the number of threads concurrently running on the mainframe system is not greater than the maximum number;
analyzing the data arranged in the plurality of records that correspond to the named data sets to infer structure in the plurality of records to identify fields including the sensitive information;
storing a redrive position;
halting the analysis of the data arranged in the plurality of records at a data set position in accordance with the limitations before analyzing all the data arranged in the plurality of records;
resuming the analysis at the redrive position in accordance with the limitations, wherein the redrive position including a data set position in the plurality of records prior to the data set position where the analysis was halted; and
analyzing the identified fields for sensitive information.
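The thread-limit steps of the claim (query a named data set only while the number of running threads stays within the stated maximum) can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `scan_data_sets`, `query`, and `max_threads` are hypothetical, and the `query` callable stands in for whatever request is actually sent to the mainframe system.

```python
import threading


def scan_data_sets(names, max_threads, query):
    """Run query(name) once per data set name, with at most max_threads running.

    Hypothetical helper: `query` stands in for the mainframe request.
    Returns (results, peak), where peak is the highest concurrency observed.
    """
    slots = threading.BoundedSemaphore(max_threads)  # one slot per allowed thread
    lock = threading.Lock()
    running = 0
    peak = 0
    results = {}

    def worker(name):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        try:
            value = query(name)
            with lock:
                results[name] = value
        finally:
            with lock:
                running -= 1
            slots.release()  # free the slot so another data set can be queried

    threads = []
    for name in names:
        slots.acquire()  # blocks while max_threads queries are already running
        thread = threading.Thread(target=worker, args=(name,))
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()
    return results, peak
```

In this sketch the semaphore plays the role of the "determining the number of threads" step: a new data set is queried only after a slot is free, so the running count never exceeds `max_threads`.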
Abstract
Examples of methods, systems, and computer-readable media for managing mainframe overhead during detection of sensitive information are described using multiple techniques. The techniques may include manipulating a scan definition, defining scan parameters and limitations, utilizing user-supplied scan filters, and using a redrive operation. The redrive operation may include halting one or more analysis requests associated with scan definitions, storing a redrive position for each analysis request, and resuming the servicing of analysis requests at the redrive position for each request.
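As a rough illustration of the redrive operation, the following Python sketch checkpoints a position while scanning, halts when a per-run limit is reached, and resumes from the stored position, which lies before the point where the run halted. Everything here is an assumption made for illustration: `ScanState`, `run_scan`, `budget`, `checkpoint_every`, and the Social Security number pattern are hypothetical and do not come from the patent text.

```python
import re
from dataclasses import dataclass, field

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # illustrative pattern only


def looks_like_ssn(record):
    """Return True if the record contains something shaped like an SSN."""
    return SSN_PATTERN.search(record) is not None


@dataclass
class ScanState:
    redrive_position: int = 0                      # record index to resume from
    findings: dict = field(default_factory=dict)   # record index -> record text
    done: bool = False


def run_scan(records, state, budget, checkpoint_every, looks_sensitive=looks_like_ssn):
    """Analyze up to `budget` records from the redrive position, then halt.

    Returns the position at which this run halted.
    """
    if budget <= checkpoint_every:
        # Otherwise a run could halt before reaching a new checkpoint
        # and the scan would never make progress.
        raise ValueError("budget must exceed checkpoint_every")
    position = state.redrive_position
    analyzed = 0
    while position < len(records) and analyzed < budget:
        if position % checkpoint_every == 0:
            state.redrive_position = position  # store the redrive position
        if looks_sensitive(records[position]):
            # Keyed by index, so re-analyzing a record after a redrive is harmless.
            state.findings[position] = records[position]
        position += 1
        analyzed += 1
    state.done = position >= len(records)
    return position
```

Because the stored position trails the halt position, a resumed run re-analyzes a few records. Keying the findings by record index keeps that repetition from producing duplicate results.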
55 Citations
23 Claims
1. A method for controlling resource usage in the analysis of data including sensitive information and arranged in a plurality of records included in a plurality of data sets on a mainframe system, comprising:
receiving an indication of names of data sets for analysis and a limitations relating to the analysis that includes a maximum number of threads concurrently running on the mainframe system;
querying the mainframe system for a plurality of the named data sets to create a plurality of respective threads;
determining the number of threads concurrently running on the mainframe system;
querying the mainframe system for an additional named data set if the number of threads concurrently running on the mainframe system is not greater than the maximum number;
analyzing the data arranged in the plurality of records that correspond to the named data sets to infer structure in the plurality of records to identify fields including the sensitive information;
storing a redrive position;
halting the analysis of the data arranged in the plurality of records at a data set position in accordance with the limitations before analyzing all the data arranged in the plurality of records;
resuming the analysis at the redrive position in accordance with the limitations, wherein the redrive position including a data set position in the plurality of records prior to the data set position where the analysis was halted; and
analyzing the identified fields for sensitive information.
Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
10. One or more non-transitory computer readable storage media encoded with executable instructions when executed by one or more processing units causes the one or more processing unit to control resource usage in the analysis of data including sensitive information and arranged in a plurality of records included in a plurality of data sets on a mainframe system, comprising:
receiving an indication of names of data sets for analysis and a limitations relating to the analysis that includes a maximum number of threads concurrently running on the mainframe system;
querying the mainframe system for a plurality of the named data sets to create a plurality of respective threads;
determining the number of threads concurrently running on the mainframe system;
querying the mainframe system for an additional named data set if the number of threads concurrently running on the mainframe system is not greater than the maximum number;
analyzing the data arranged in the plurality of records that correspond to the named data sets to infer structure in the plurality of records to identify fields including the sensitive information;
storing a redrive position;
halting the analysis of the data arranged in the plurality of records at a data set position in accordance with the limitations before analyzing all the data arranged in the plurality of records;
resuming the analysis at the redrive position in accordance with the limitations, the redrive position including a data set position in the plurality of records prior to the data set position where the analysis was halted; and
analyzing the identified fields for sensitive information.
Dependent Claims (11, 12, 13, 14)
15. A system for controlling resource usage in the analysis of data sensitive information and arranged in a plurality of records included in a plurality of data sets on a mainframe system comprising:
at least one processing unit coupled to a memory, wherein the memory is encoded with computer executable instructions that, when executed by the at least one processor unit causes the at least one processing unit to;
receive an indication of names of data sets for analysis and limitations relating to the analysis that includes a maximum number of threads concurrently running on the mainframe system;
query the mainframe system for a plurality of the named data sets to create a plurality of respective threads;
determine the number of threads concurrently running on the mainframe system;
query the mainframe system for an additional named data set if the number of threads concurrently running on the mainframe system is not greater than the maximum number;
analyze the data arranged in the plurality of records that correspond to the named data sets to infer structure in the plurality of records to identify fields including the sensitive information;
store a redrive position;
halt the analysis of the data arranged in the plurality of records at a data set position in accordance with the limitations before analyzing all the data arranged in the plurality of records;
resume the analysis at the redrive position in accordance with the limitations, the redrive position including a data set position in the plurality of records prior to the data set position where the analysis was halted; and
analyze the identified fields for sensitive information.
Dependent Claims (16, 17, 18, 19)
20. A method for controlling resource usage in the analysis of data including sensitive information and arranged in a plurality of records included in a plurality of data sets on a mainframe system, comprising:
receiving an indication of names of data sets for analysis and a limitation relating to the analysis that includes a maximum number of threads concurrently running on the mainframe system;
querying the mainframe system for a plurality of the named data sets to create a plurality of respective threads;
determining the number of threads concurrently running on the mainframe system;
querying the mainframe system for an additional named data set if the number of threads concurrently running on the mainframe system is not greater than the maximum number;
analyzing the data arranged in the plurality of records that correspond to the named data sets to infer structure in the plurality of records to identify fields including the sensitive information; and
analyzing the identified fields for sensitive information so as to permit an assessment of the risk of unauthorized access to the sensitive information.
Dependent Claims (21)
22. One or more non-transitory computer readable storage media encoded with instructions executable by one or more processing units of a computing system controlling resource usage in the analysis of data including sensitive information and arranged in a plurality of records included in a plurality of data sets on a mainframe system, the instructions comprising instructions for:
receiving an indication of names of data sets for analysis and a limitation relating to the analysis that includes a maximum number of threads concurrently running on the mainframe system;
querying the mainframe system for a plurality of the named data sets to create a plurality of respective threads;
determining the number of threads concurrently running on the mainframe system;
querying the mainframe system for an additional named data set if the number of threads concurrently running on the mainframe system is not greater than the maximum number;
analyzing the data arranged in the plurality of records that correspond to the named data sets to infer structure in the plurality of records to identify fields including the sensitive information; and
analyzing the identified fields for sensitive information so as to permit an assessment of the risk of unauthorized access to the sensitive information.
Dependent Claims (23)
Specification