System and method for disguising data
Abstract
A system and method for disguising and de-identifying data is provided. Records are extracted from an input data set containing confidential information (e.g., a production database). One or more data transformation algorithms for disguising specific types of data, including first names, last names, company names, telephone numbers, addresses, social security numbers, and e-mail addresses, in addition to generic data, are applied to the records to disguise or “scrub” confidential information therefrom. The transformation algorithms retrieve substitute information from one or more lookup tables, or generate substitute information using in-memory manipulation rules, and produce output records containing the substitute information. The output records are structurally similar to the input records, contain no confidential information, and are stored in an output data set that can be utilized in less-secure (e.g., non-production) environments. Optionally, transformation keys can be provided for increasing confidentiality and improving transformation effectiveness. A configuration tool allows users of various roles to define, approve, and implement data transformation rules, parameters, and processes.
103 Citations
50 Claims
1. A method for disguising and de-identifying data comprising:
retrieving an input value from an input data set containing confidential information;
generating an index value based upon the input value;
retrieving a substitute value from a lookup table using the index value, the substitute value containing non-confidential information;
constructing an output value based upon the substitute value; and
storing the output value in an output data set.
Dependent claims: 2-16.
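
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the lookup table contents, the use of SHA-256 to derive the index, and the case-copying rule used to construct the output value are all assumptions made for illustration, and the names `SUBSTITUTE_FIRST_NAMES`, `generate_index`, and `disguise_value` are hypothetical.

```python
import hashlib

# Hypothetical lookup table of non-confidential substitute first names.
SUBSTITUTE_FIRST_NAMES = [
    "Alex", "Blake", "Casey", "Drew", "Emery", "Finley", "Gray", "Harper",
]


def generate_index(input_value: str, table_size: int) -> int:
    """Derive a repeatable index from the input value.

    Assumption: SHA-256 of the normalized value, reduced modulo the table size.
    """
    normalized = input_value.strip().lower().encode("utf-8")
    digest = hashlib.sha256(normalized).digest()
    return int.from_bytes(digest[:8], "big") % table_size


def disguise_value(input_value: str, table: list) -> str:
    """Retrieve a substitute and construct an output shaped like the input."""
    substitute = table[generate_index(input_value, len(table))]
    # Assumed construction rule: copy the letter case of the input value.
    if input_value.isupper():
        return substitute.upper()
    if input_value.islower():
        return substitute.lower()
    return substitute


# Stand-ins for the input and output data sets.
input_data_set = ["Margaret", "MARGARET", "william"]
output_data_set = [disguise_value(v, SUBSTITUTE_FIRST_NAMES) for v in input_data_set]
```

Because the index depends only on the normalized input, the same name always maps to the same substitute, which keeps joins across disguised tables consistent in this sketch.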
17. A method for disguising and de-identifying data comprising:
retrieving an input value from an input data set containing confidential information;
generating a transformation key based upon the input value;
manipulating the input value based upon the transformation key to produce an output value containing non-confidential information; and
storing the output value in an output data set.
Dependent claims: 18-23.
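
Claim 17 replaces the lookup table with a key-driven manipulation of the input itself. The sketch below shows one possible reading for a telephone number. The key rule (digit sum reduced to 1..9) and the manipulation rule (shift each digit by the key plus its position) are assumptions chosen for brevity, not rules taken from the patent, and `generate_key` and `manipulate` are hypothetical names.

```python
DIGITS = "0123456789"


def generate_key(input_value: str) -> int:
    """Assumed key rule: sum of the digits in the input, reduced to 1..9."""
    digit_sum = sum(int(ch) for ch in input_value if ch in DIGITS)
    return digit_sum % 9 + 1


def manipulate(input_value: str, key: int) -> str:
    """Shift each digit by (key + digit position) modulo 10.

    Non-digit characters are copied unchanged, so the output keeps the
    layout of the input (separators, length).
    """
    out = []
    position = 0
    for ch in input_value:
        if ch in DIGITS:
            out.append(str((int(ch) + key + position) % 10))
            position += 1
        else:
            out.append(ch)
    return "".join(out)


# Stand-ins for the input and output data sets.
input_data_set = ["555-867-5309"]
output_data_set = [manipulate(v, generate_key(v)) for v in input_data_set]
```

A simple digit shift like this is easy to reverse once the rule is known, so it only demonstrates the claimed flow of key generation followed by manipulation; a real rule would need to be chosen with the confidentiality requirements in mind.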
24. A system for disguising and de-identifying data comprising:
an input data set containing confidential information;
a plurality of data transformation algorithms for removing confidential information from the input data set; and
a driver program for invoking the plurality of data transformation algorithms on the input data set and producing an output data set having no confidential information, wherein data in the output data set is structurally similar to data in the input set and contains no confidential information.
Dependent claims: 25-38.
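
The driver arrangement in claim 24 can be sketched as a loop that applies a configured algorithm to each field of each record. Everything here is illustrative: the record layout, the field names, the two toy algorithms (`mask_letters`, `zero_digits`), and the assumption that fields without a configured algorithm are non-confidential and pass through unchanged.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Algorithm = Callable[[str], str]


def mask_letters(value: str) -> str:
    """Toy algorithm: replace every letter with 'X', keep other characters."""
    return "".join("X" if ch.isalpha() else ch for ch in value)


def zero_digits(value: str) -> str:
    """Toy algorithm: replace every digit with '0', keep other characters."""
    return "".join("0" if ch.isdigit() else ch for ch in value)


# Hypothetical configuration: which algorithm handles which field.
FIELD_ALGORITHMS: Dict[str, Algorithm] = {
    "last_name": mask_letters,
    "ssn": zero_digits,
}


def run_driver(input_data_set: List[Record],
               field_algorithms: Dict[str, Algorithm]) -> List[Record]:
    """Invoke the configured algorithm on each field of each record.

    Output records keep the same fields, in the same order, as the input.
    Fields with no configured algorithm are copied as-is (assumed safe).
    """
    output_data_set = []
    for record in input_data_set:
        scrubbed = {}
        for field, value in record.items():
            algorithm = field_algorithms.get(field)
            scrubbed[field] = algorithm(value) if algorithm else value
        output_data_set.append(scrubbed)
    return output_data_set


input_data_set = [{"order_id": "A-17", "last_name": "Smith", "ssn": "123-45-6789"}]
output_data_set = run_driver(input_data_set, FIELD_ALGORITHMS)
```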
39. A method for disguising and de-identifying confidential information comprising:
retrieving an input value from an input data set;
determining a data type of the input value;
applying a transformation algorithm to the input value based upon the type of the input value to produce an output value having no confidential information; and
storing the output value in an output dataset.
Dependent claims: 40-50.
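
Claim 39 adds a type-determination step before the transformation is chosen. One way to picture it is pattern-based detection followed by a dispatch table, as in the sketch below. The regular expressions, the type labels, and the fixed placeholder outputs are assumptions for illustration only; the patent text quoted here does not specify how the data type is determined.

```python
import re

# Assumed, deliberately loose patterns for a few data types.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
SSN_RE = re.compile(r"^\d{3}-\d{2}-\d{4}$")
PHONE_RE = re.compile(r"^\(?\d{3}\)?[ -]?\d{3}-\d{4}$")


def determine_type(value: str) -> str:
    """Classify a value as 'email', 'ssn', 'phone', or 'generic'."""
    if EMAIL_RE.match(value):
        return "email"
    if SSN_RE.match(value):
        return "ssn"
    if PHONE_RE.match(value):
        return "phone"
    return "generic"


# Hypothetical per-type transformations producing placeholder values.
TRANSFORMS = {
    "email": lambda v: "user@example.com",
    "ssn": lambda v: "000-00-0000",
    "phone": lambda v: "555-555-0100",
    "generic": lambda v: "X" * len(v),
}


def disguise(value: str) -> str:
    """Pick the transformation that matches the detected type and apply it."""
    return TRANSFORMS[determine_type(value)](value)


input_data_set = ["jane.doe@corp.example", "123-45-6789", "(212) 555-0147", "Acme Widgets"]
output_data_set = [disguise(v) for v in input_data_set]
```

The SSN pattern is tested before the phone pattern so that a value such as `123-45-6789` is not misread as a telephone number by a looser rule.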
Specification