Data obfuscation of text data using entity detection and replacement
Abstract
A data obfuscation method, apparatus and computer program product are disclosed in which at least selected text entities, such as words or abbreviations in a document, are obfuscated to prevent the disclosure of private information if the document is disclosed. A user establishes various configuration parameters for the selected text entities to be obfuscated. The document is processed, and text entities matching the configuration parameters are tagged for obfuscation. The tagged entities are then replaced in the document with obfuscating text. The obfuscating text can be derived from a hash table. The hash table may also be used to provide a reverse obfuscation method by which original data can be restored to an obfuscated document.
11 Claims
1. A computer-implemented method of obfuscating text data in a document, said method comprising:
establishing, using a computing device, configuration parameters to be used in obfuscating said text data in said document, said configuration parameters comprising first configuration parameters identifying selected text data in said document to be obfuscated and second configuration parameters defining one of many different selectable levels of obfuscation for said document;
identifying, by said computing device, text data in said document based on said first configuration parameters to produce identified text data;
annotating, by said computing device, each said identified text data in said document with a corresponding tag to produce annotated text data;
transforming, by said computing device, said annotated text data using obfuscating data associated with at least one of said second configuration parameters to produce transformed data according to said levels of obfuscation defined for said document;
substituting, by said computing device, said transformed data for respective annotated text data into said document; and
storing, by said computing device, a record for each of said annotated text data and said corresponding tag, said transformed data displaying obfuscated text data such that readability of said document is maintained.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
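The steps recited in claim 1 (identify, annotate with a tag, transform according to a selected level, substitute, store a record) can be illustrated with a minimal Python sketch. This is not the patented implementation. The function names `obfuscate` and `transform`, the `Record` class, the use of regular expressions as the "first configuration parameters", and the two level names `"placeholder"` and `"mask"` are all assumptions made for illustration. Patterns are assumed never to match an empty string.

```python
import re
from dataclasses import dataclass


@dataclass
class Record:
    """One stored record: the tag, the original text, and its replacement."""
    tag: str
    original: str
    replacement: str


def transform(tag, original, index, level):
    """Produce obfuscating text for one entity at the selected level (hypothetical levels)."""
    if level == "mask":
        # Keep the first character and the length so the text stays readable.
        return original[0] + "*" * (len(original) - 1)
    # Default "placeholder" level: a readable surrogate such as PERSON_1.
    return f"{tag}_{index}"


def obfuscate(document, patterns, level="placeholder"):
    """patterns maps a tag to a regex selecting the text to obfuscate."""
    records = {}   # (tag, original) -> Record, so repeated entities reuse one replacement
    counters = {}  # tag -> number of distinct entities seen for that tag

    def make_replacer(tag):
        def replace(match):
            original = match.group(0)
            key = (tag, original)
            if key not in records:
                counters[tag] = counters.get(tag, 0) + 1
                replacement = transform(tag, original, counters[tag], level)
                records[key] = Record(tag, original, replacement)
            return records[key].replacement
        return replace

    # Identify, annotate, transform and substitute, one tag at a time.
    for tag, pattern in patterns.items():
        document = re.sub(pattern, make_replacer(tag), document)
    return document, list(records.values())


patterns = {"PERSON": r"Alice Smith", "EMAIL": r"[\w.]+@[\w.]+\.\w+"}
text, records = obfuscate(
    "Alice Smith emailed bob@example.com. Alice Smith replied.", patterns
)
print(text)
```

Reusing one replacement per distinct entity keeps the obfuscated document internally consistent, which is one plausible reading of the claim's requirement that readability be maintained.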
9. A method of restoring an obfuscated document to an original form, said method comprising:
sequentially parsing said obfuscated document to examine text entities in said document;
determining whether a current text entity in said document is found in a hash table, said hash table being populated during creation of said obfuscated document and comprising records of original text data in said document for each of said text entities used in formation of said obfuscated document and annotation tags that identify a particular position of said text entities in said obfuscated document corresponding to said original text data;
selecting a next text entity and repeating said determining when said current text entity is not found in said hash table;
obtaining a replacement text value from said hash table, said replacement text value corresponding to said original text data for said current text entity;
substituting said replacement text value into said document in place of said current text entity according to said annotation tags; and
repeating said sequentially parsing, said determining, said selecting, said obtaining and said substituting until all of said document has been parsed.
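The restoration loop of claim 9 can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the patented method: the function name `restore`, the tokenizing regex, and the table layout (obfuscated token mapped to a dictionary with `tag` and `original` keys) are hypothetical. In particular, the sketch looks entities up by token only and does not model annotation tags that identify positions in the document.

```python
import re


def restore(obfuscated, table):
    """Walk the document token by token and undo substitutions found in the table."""
    restored = []
    # Alternate runs of word and non-word characters so spacing and
    # punctuation are carried through unchanged.
    for token in re.findall(r"\w+|\W+", obfuscated):
        entry = table.get(token)
        if entry is None:
            # Not in the hash table: keep the token and move to the next one.
            restored.append(token)
        else:
            # Found: substitute the original text recorded at obfuscation time.
            restored.append(entry["original"])
    return "".join(restored)


# Hypothetical table, as it might be populated while creating the obfuscated document.
table = {
    "PERSON_1": {"tag": "PERSON", "original": "Alice Smith"},
    "EMAIL_1": {"tag": "EMAIL", "original": "bob@example.com"},
}
result = restore("PERSON_1 emailed EMAIL_1. PERSON_1 replied.", table)
print(result)
```

A dictionary gives constant-time lookups on average, so the whole document is restored in a single pass.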
10. A non-transitory storage medium having a computer program recorded thereon, the program being executable by a computer to perform a method of obfuscating text data in a document, said method comprising:
identifying configuration parameters to be used in obfuscating said text data in said document, said configuration parameters comprising first configuration parameters identifying selected text data in said document to be obfuscated and second configuration parameters defining one of many different selectable levels of obfuscation for said document;
identifying text data in said document based on said first configuration parameters to produce identified text data;
annotating each said identified text data in said document with a corresponding tag to produce annotated text data;
transforming said annotated text data using obfuscating data associated with at least one of said second configuration parameters to produce transformed data according to said levels of obfuscation defined for said document;
substituting said transformed data for respective annotated text data into said document; and
storing a record for each of said annotated text data and said corresponding tag, said transformed data displaying obfuscated text data such that readability of said document is maintained.
11. Computer apparatus for obfuscating text data in a document, said apparatus comprising:
means for establishing configuration parameters to be used in obfuscating said text data in said document, said configuration parameters comprising first configuration parameters identifying selected text data in said document to be obfuscated and second configuration parameters defining one of many different selectable levels of obfuscation for said document;
means for identifying text data in said document based on said first configuration parameters to produce identified text data;
means for annotating each said identified text data in said document with a corresponding tag to produce annotated text data;
means for transforming said annotated text data using obfuscating data associated with at least one of said second configuration parameters to produce transformed data according to said levels of obfuscation defined for said document;
means for substituting said transformed data for respective annotated text data in said document; and
means for storing a record for each of said annotated text data and said corresponding tag, said transformed data displaying obfuscated text data such that readability of said document is maintained.
Specification