Data processing and scanning systems for generating and populating a data inventory
First Claim
1. A data processing intelligent data repository scanning system comprising:
one or more processors;
computer memory; and
a computer-readable medium storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising:
connecting the data processing intelligent data repository scanning system to an application executing on one or more remote computing devices using an application programming interface;
scanning one or more data repositories on the one or more remote computing devices to identify one or more data attributes, wherein the one or more data attributes are associated with a processing activity, and wherein the processing activity is associated with one or more individuals;
generating a catalog of one or more pieces of information associated with the one or more individuals, wherein one of the one or more pieces of information is associated with the processing activity;
analyzing the one or more data attributes and correlating metadata for the scanned one or more data repositories with particular attributes of the one or more data attributes discovered in the one or more data repositories;
using one or more machine learning techniques to categorize one or more data elements from the generated catalog based at least in part on a confidence score;
analyzing a data flow of the particular attributes of the one or more data attributes between the one or more data repositories; and
storing the categorized one or more data elements and the data flow in the computer memory.
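The scanning and categorizing operations recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the regular-expression patterns are a simple stand-in for the "one or more machine learning techniques", and the names PATTERNS, CategorizedElement, categorize, scan, and the 0.8 threshold are all assumptions made for illustration. The confidence score here is just the share of sampled values that match a pattern.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns standing in for a trained classifier (assumption).
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\s().-]{7,}$"),
}


@dataclass
class CategorizedElement:
    repository: str
    attribute: str
    category: str
    confidence: float


def categorize(values):
    """Return (category, confidence) for the pattern matching the largest share of values."""
    best = ("unknown", 0.0)
    if not values:
        return best
    for category, pattern in PATTERNS.items():
        score = sum(bool(pattern.match(v)) for v in values) / len(values)
        if score > best[1]:
            best = (category, score)
    return best


def scan(repositories, threshold=0.8):
    """Scan each repository's attributes and keep those categorized above the threshold."""
    results = []  # held in memory, loosely mirroring the final storing step
    for repo_name, attributes in repositories.items():
        for attr_name, values in attributes.items():
            category, confidence = categorize(values)
            if confidence >= threshold:
                results.append(
                    CategorizedElement(repo_name, attr_name, category, confidence)
                )
    return results


# Hypothetical repository contents: attribute name -> sampled values.
repos = {
    "crm_db": {
        "contact": ["a@example.com", "b@example.org"],
        "notes": ["called twice", "n/a"],
    },
}
found = scan(repos)
```

In this sketch only the contact attribute is kept, because none of the sampled notes values match a pattern and that attribute's confidence therefore stays below the threshold.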
Abstract
In particular embodiments, a data processing data inventory generation system is configured to: (1) generate a data model (e.g., a data inventory) for one or more data assets utilized by a particular organization; (2) generate a respective data inventory for each of the one or more data assets; and (3) map one or more relationships between one or more aspects of the data inventory, the one or more data assets, etc. within the data model. In particular embodiments, a data asset (e.g., a data system, software application, etc.) may include, for example, any entity that collects, processes, contains, and/or transfers personal data (e.g., a software application, “internet of things” computerized device, database, website, data center, server, etc.). The system may be configured to identify particular data assets and/or personal data in data repositories using any suitable intelligent identity scanning technique.
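One way to picture the data model described in the abstract is as a set of data assets, each carrying its own inventory, plus a list of relationships between them. The Python sketch below is only an illustration under that assumption; DataAsset, DataModel, the inventory keys, and the "transfers_to" relation are hypothetical names, not terms defined by the patent.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    name: str
    kind: str  # e.g. "database", "website", "software application"
    inventory: dict = field(default_factory=dict)  # inventory attribute -> value


@dataclass
class DataModel:
    assets: dict = field(default_factory=dict)  # asset name -> DataAsset
    relationships: list = field(default_factory=list)  # (source, relation, target)

    def add_asset(self, asset):
        self.assets[asset.name] = asset

    def relate(self, source, relation, target):
        """Map a relationship between two assets already present in the model."""
        for name in (source, target):
            if name not in self.assets:
                raise KeyError(f"unknown data asset: {name}")
        self.relationships.append((source, relation, target))

    def related_to(self, name):
        """Return the names of assets that the given asset points to."""
        return [t for s, _, t in self.relationships if s == name]


# Hypothetical example: a website that transfers personal data to a database.
model = DataModel()
model.add_asset(DataAsset("signup_site", "website", {"personal_data": ["email"]}))
model.add_asset(DataAsset("customer_db", "database", {"personal_data": ["email", "name"]}))
model.relate("signup_site", "transfers_to", "customer_db")
```

Storing relationships as plain tuples keeps the sketch short; a production system would more likely use a graph store or a relational schema.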
1029 Citations
20 Claims
1. A data processing intelligent data repository scanning system comprising:
one or more processors;
computer memory; and
a computer-readable medium storing computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising:
connecting the data processing intelligent data repository scanning system to an application executing on one or more remote computing devices using an application programming interface;
scanning one or more data repositories on the one or more remote computing devices to identify one or more data attributes, wherein the one or more data attributes are associated with a processing activity, and wherein the processing activity is associated with one or more individuals;
generating a catalog of one or more pieces of information associated with the one or more individuals, wherein one of the one or more pieces of information is associated with the processing activity;
analyzing the one or more data attributes and correlating metadata for the scanned one or more data repositories with particular attributes of the one or more data attributes discovered in the one or more data repositories;
using one or more machine learning techniques to categorize one or more data elements from the generated catalog based at least in part on a confidence score;
analyzing a data flow of the particular attributes of the one or more data attributes between the one or more data repositories; and
storing the categorized one or more data elements and the data flow in the computer memory.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer-implemented data processing method for scanning one or more data repositories to identify one or more attributes of data associated with a processing activity, the method comprising:
connecting, by one or more processors, via one or more computer networks, to one or more databases;
scanning, by one or more processors, the one or more databases to generate a catalog of one or more pieces of information associated with one or more individuals, wherein the one or more pieces of information identify one or more processing activities undertaken by a particular organization that include processing of personal data associated with the one or more individuals;
storing the generated catalog in computer memory;
scanning one or more data repositories based at least in part on the generated catalog to identify one or more attributes of data associated with the one or more individuals by searching one or more data fields in the one or more databases for the one or more processing activities;
analyzing and correlating the one or more attributes of data and metadata for the scanned one or more data repositories;
using one or more machine learning techniques to categorize one or more data elements from the generated catalog;
analyzing a flow of the one or more data elements between the one or more data repositories and at least one known data asset;
modifying an existing data model of data assets to include an attribute defined by the one or more data elements; and
electronically linking the at least one known data asset and the attribute in the existing data model of data assets.
Dependent claims: 9, 10, 11, 12, 13, 14
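The last three steps of claim 8 (analyzing a flow, modifying an existing data model, and linking the attribute to a known data asset) can be sketched as follows. This is a hedged illustration, not the claimed method: the data model is assumed to be a plain dictionary mapping asset names to sets of attributes, observed transfers are assumed to be (source, destination, element) tuples, and the names analyze_flows and link_attributes are hypothetical.

```python
def analyze_flows(transfers, known_assets):
    """Group observed transfers by data element, keeping those that touch a known asset."""
    flows = {}
    for source, destination, element in transfers:
        if source in known_assets or destination in known_assets:
            flows.setdefault(element, []).append((source, destination))
    return flows


def link_attributes(data_model, flows, known_assets):
    """Add each flowing element as an attribute of every known asset it touches."""
    for element, hops in flows.items():
        for source, destination in hops:
            for endpoint in (source, destination):
                if endpoint in known_assets:
                    data_model.setdefault(endpoint, set()).add(element)
    return data_model


# Hypothetical inputs: an existing model with one known asset and two observed transfers.
known = {"crm_db"}
existing_model = {"crm_db": {"name"}}
transfers = [
    ("staging_repo", "crm_db", "email"),
    ("staging_repo", "archive_repo", "phone"),
]

flows = analyze_flows(transfers, known)
updated_model = link_attributes(existing_model, flows, known)
```

Here the email element flows into the known asset crm_db and is linked to it in the model, while the phone transfer is ignored because neither endpoint is a known data asset.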
15. A non-transitory computer-readable medium storing computer-executable instructions for scanning one or more data repositories to identify one or more attributes of data associated with one or more individuals, the computer-executable instructions comprising computer-executable instructions for:
connecting, by one or more processors via one or more computer networks, to one or more remote computing devices;
analyzing, by one or more processors, one or more applications on the one or more remote computing devices to identify one or more data repositories on the one or more remote computing devices;
scanning, by one or more processors, the one or more data repositories on the one or more remote computing devices to identify one or more data attributes, wherein the one or more data attributes are associated with a processing activity, and wherein the processing activity is associated with one or more individuals;
generating, by one or more processors, a catalog of one or more pieces of information associated with the one or more individuals, wherein one of the one or more pieces of information is associated with the processing activity;
storing, by one or more processors, the generated catalog in computer memory;
analyzing, by one or more processors, the one or more data attributes and correlating metadata for the scanned one or more data repositories with particular attributes of the one or more data attributes discovered in the one or more data repositories;
using one or more machine learning techniques to categorize one or more data elements from the generated catalog;
analyzing a flow of the one or more data elements between the one or more data repositories and at least one known data asset;
modifying an existing data model of data assets to include an attribute defined by the one or more data elements; and
electronically linking the at least one known data asset and the attribute in the existing data model of data assets.
Dependent claims: 16, 17, 18, 19, 20
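Claim 15 adds a step of analyzing applications to identify the data repositories they use. The claim does not say how this is done; one plausible approach, shown below purely as an assumption, is to inspect application settings for connection URLs. The function find_repositories, the REPOSITORY_SCHEMES set, and the sample configuration are hypothetical; only urllib.parse.urlparse is a real standard-library call.

```python
from urllib.parse import urlparse

# Hypothetical set of URL schemes treated as data repositories (assumption).
REPOSITORY_SCHEMES = {"postgresql", "mysql", "mongodb"}


def find_repositories(app_configs):
    """Return one record per setting whose value looks like a repository connection URL."""
    found = []
    for app_name, settings in app_configs.items():
        for key, value in settings.items():
            if not isinstance(value, str):
                continue
            parsed = urlparse(value)
            if parsed.scheme in REPOSITORY_SCHEMES and parsed.hostname:
                found.append(
                    {
                        "application": app_name,
                        "setting": key,
                        "type": parsed.scheme,
                        "host": parsed.hostname,
                        "name": parsed.path.lstrip("/"),
                    }
                )
    return found


# Hypothetical application settings gathered from a remote computing device.
configs = {
    "billing": {
        "DATABASE_URL": "postgresql://db.internal:5432/customers",
        "DEBUG": "false",
    },
}
repositories = find_repositories(configs)
```

The resulting records could then feed a scanning step such as the one recited in the claim, with each record naming the application, the repository type, and its host.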
Specification