System and method for protecting specified data combinations
First Claim
1. A system for protecting data in a network, the system comprising:
one or more hardware processors;
a registration subsystem that when running on at least one of the one or more hardware processors is to:
create a plurality of tuples, wherein each tuple includes a respective set of data file tokens, each set of data file tokens corresponding to a respective set of data elements in a data file;
select a data file token of a first tuple as a token key to index the first tuple, wherein the data file token is selected as the token key based on determining the data file token occurs with less frequency across the plurality of tuples than frequencies at which other data file tokens of the first tuple occur across the plurality of tuples; and
create an index table with a particular index including the token key; and
a detection subsystem that when running on at least one of the one or more hardware processors is to:
tokenize data elements in an object into corresponding object tokens, the object captured via network traffic traversing the network;
identify the first tuple based on determining the token key of the particular index in the index table corresponds to a particular object token of the object tokens; and
validate a correspondence between the first tuple and the object based on determining that the data file tokens of the first tuple correspond to the object tokens according to a predetermined threshold.
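The registration and detection steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`tokenize`, `build_registration`, `detect`), the use of lowercased strings as tokens, the dictionary-based index table, the first-wins tie-breaking, and the reading of the "predetermined threshold" as a fraction of matched tokens are all assumptions made for illustration.

```python
from collections import Counter

def tokenize(element):
    # Assumption: a token is the normalized (trimmed, lowercased) string.
    # The claim does not say how tokens are represented.
    return element.strip().lower()

def build_registration(records):
    """Create one tuple of tokens per record and index each tuple by its rarest token."""
    tuples = [tuple(tokenize(e) for e in record) for record in records]
    # Frequency of every token across the whole plurality of tuples.
    freq = Counter(tok for tup in tuples for tok in tup)
    index = {}
    for i, tup in enumerate(tuples):
        if not tup:
            continue  # nothing to index for an empty record
        # Token key: the least frequent token of this tuple.
        # Assumption: ties are broken by position (first wins).
        key = min(tup, key=lambda t: freq[t])
        index.setdefault(key, []).append(i)
    return tuples, index

def detect(object_elements, tuples, index, threshold=1.0):
    """Return indices of registered tuples found in a captured object."""
    object_tokens = {tokenize(e) for e in object_elements}
    matches = set()
    for tok in object_tokens:
        # Only tuples whose token key appears in the object are examined.
        for i in index.get(tok, []):
            tup = tuples[i]
            found = sum(1 for t in tup if t in object_tokens)
            # Assumption: the threshold is the fraction of the tuple's
            # tokens that must also appear among the object tokens.
            if found / len(tup) >= threshold:
                matches.add(i)
    return sorted(matches)

# Hypothetical sample data, invented for this sketch.
records = [
    ["Carol", "Deninger", "123-45-6789"],
    ["Carol", "Smith", "987-65-4321"],
    ["Bob", "Smith", "555-12-3456"],
]
tuples, index = build_registration(records)
hits = detect(["lunch", "Deninger", "carol", "123-45-6789"], tuples, index)
```

Indexing each tuple by its rarest token means that common tokens (here, "carol" and "smith") never trigger a lookup on their own, so full validation runs only for the few tuples whose key actually appears in the object.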
Abstract
A method in one example implementation includes extracting a plurality of data elements from a record of a data file, tokenizing the data elements into tokens, and storing the tokens in a first tuple of a registration list. The method further includes selecting one of the tokens as a token key for the first tuple, where the token is selected because it occurs less frequently in the registration list than each of the other tokens in the first tuple. In specific embodiments, at least one data element is an expression element having a character pattern matching a predefined expression pattern that represents at least two words and a separator between the words. In other embodiments, at least one data element is a word defined by a character pattern of one or more consecutive essential characters. Other specific embodiments include determining an end of the record by recognizing a predefined delimiter.
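The element extraction described in the abstract can be sketched as follows. This is a hedged illustration only: the phone-number expression pattern, the definition of "essential characters" as letters and digits, the newline record delimiter, and the names `extract_elements` and `extract_records` are assumptions, not details taken from the source.

```python
import re

# Hypothetical predefined expression pattern: a phone-like number, i.e.
# several digit groups ("words") joined by a separator ("-").
EXPRESSION_PATTERNS = [re.compile(r"\d{3}-\d{3}-\d{4}")]

# Assumption: essential characters are ASCII letters and digits, so a
# word is a run of one or more of them.
WORD_PATTERN = re.compile(r"[A-Za-z0-9]+")

# Assumption: a newline is the predefined delimiter that ends a record.
RECORD_DELIMITER = "\n"

def extract_elements(record):
    """Scan a record left to right, preferring expression elements over words."""
    elements = []
    pos = 0
    while pos < len(record):
        for pattern in EXPRESSION_PATTERNS:
            match = pattern.match(record, pos)
            if match:
                break
        else:
            # No expression pattern matched here; try a plain word.
            match = WORD_PATTERN.match(record, pos)
        if match:
            elements.append(match.group())
            pos = match.end()
        else:
            pos += 1  # skip a non-essential character (space, comma, ...)
    return elements

def extract_records(data_file_text):
    """Split the data file at the record delimiter and extract each record's elements."""
    return [
        extract_elements(record)
        for record in data_file_text.split(RECORD_DELIMITER)
        if record.strip()
    ]

# Hypothetical sample data, invented for this sketch.
sample = "Carol Deninger, 555-123-4567\nBob Smith, 555-987-6543\n"
rows = extract_records(sample)
```

Trying expression patterns before the word pattern keeps a value such as a phone number together as a single data element instead of splitting it into three separate words.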
20 Claims
1. A system for protecting data in a network, the system comprising:
one or more hardware processors;
a registration subsystem that when running on at least one of the one or more hardware processors is to:
create a plurality of tuples, wherein each tuple includes a respective set of data file tokens, each set of data file tokens corresponding to a respective set of data elements in a data file;
select a data file token of a first tuple as a token key to index the first tuple, wherein the data file token is selected as the token key based on determining the data file token occurs with less frequency across the plurality of tuples than frequencies at which other data file tokens of the first tuple occur across the plurality of tuples; and
create an index table with a particular index including the token key; and
a detection subsystem that when running on at least one of the one or more hardware processors is to:
tokenize data elements in an object into corresponding object tokens, the object captured via network traffic traversing the network;
identify the first tuple based on determining the token key of the particular index in the index table corresponds to a particular object token of the object tokens; and
validate a correspondence between the first tuple and the object based on determining that the data file tokens of the first tuple correspond to the object tokens according to a predetermined threshold.
Dependent claims: 2, 3, 4, 5, 6, 7
8. At least one non-transitory machine readable medium for protecting data in a network, the at least one non-transitory machine readable medium comprising instructions that, when executed, cause one or more processors to:
create a plurality of tuples, wherein each tuple includes a respective set of data file tokens, each set of data file tokens corresponding to a respective set of data elements in a data file;
select a data file token of a first tuple as a token key to index the first tuple, wherein the data file token is selected as the token key based on determining the data file token occurs with less frequency across the plurality of tuples than frequencies at which other data file tokens of the first tuple occur across the plurality of tuples;
create an index table with a particular index including the token key;
tokenize data elements in an object into corresponding object tokens, the object captured via network traffic traversing the network;
identify the first tuple based on determining the token key of the particular index in the index table corresponds to a particular object token of the object tokens; and
validate a correspondence between the first tuple and the object based on determining that the data file tokens of the first tuple correspond to the object tokens according to a predetermined threshold.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
19. A method for protecting data in a network, the method comprising:
creating, by at least one hardware processor of a registration system, a plurality of tuples, wherein each tuple includes a respective set of data file tokens, each set of data file tokens corresponding to a respective set of data elements in a data file;
selecting a data file token of a first tuple as a token key to index the first tuple, wherein the data file token is selected as the token key based on determining the data file token occurs with less frequency across the plurality of tuples than frequencies at which other data file tokens of the first tuple occur across the plurality of tuples;
creating an index table with a particular index including the token key;
tokenizing, by at least one hardware processor of a detection system, data elements in an object into corresponding object tokens, the object captured via network traffic traversing the network;
identifying the first tuple based on determining the token key of the particular index in the index table corresponds to a particular object token of the object tokens; and
validating a correspondence between the first tuple and the object based on determining that the data file tokens of the first tuple correspond to the object tokens according to a predetermined threshold.
Dependent claims: 20
Specification