System and method for protecting specified data combinations
First Claim
1. At least one non-transitory, computer readable medium comprising instructions that, when executed, cause one or more processors to:
tokenize a plurality of data elements in an object into a plurality of object tokens;
identify, in an index table of token keys, a token key that corresponds to an object token of the plurality of object tokens;
identify a first tuple of a plurality of tuples in a registration list based, at least in part, on the token key, wherein the first tuple includes a set of registered tokens that represents a set of data elements, and the token key is a registered token in the set of registered tokens;
determine a number of the registered tokens that are found in at least one object token of the plurality of object tokens; and
take an action based on determining that the number of the registered tokens satisfies a predetermined threshold associated with the set of data elements, wherein the action is preventing transmission of the object or locking down a storage repository.
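The detection steps recited in the claim can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `REGISTRATION_LIST`, `INDEX_TABLE`, `THRESHOLD`, `tokenize`, and `evaluate` are hypothetical, tokens are kept as plain lowercase strings, the sample data is invented, and "satisfies a predetermined threshold" is assumed to mean "at least that many registered tokens are found".

```python
from typing import Dict, List, Set

# Hypothetical registration list: each tuple is a set of registered tokens
# that stands for one protected combination of data elements.
REGISTRATION_LIST: List[Set[str]] = [
    {"carol", "example", "000-00-0001"},
    {"alice", "sample", "000-00-0002"},
]

# Hypothetical index table: token key -> positions of the tuples it keys.
INDEX_TABLE: Dict[str, List[int]] = {
    "000-00-0001": [0],
    "000-00-0002": [1],
}

# Assumption: the threshold is met when at least this many registered
# tokens of one tuple appear among the object tokens.
THRESHOLD = 2


def tokenize(text: str) -> Set[str]:
    """Stand-in tokenizer: lowercase, whitespace-separated data elements."""
    return {element.lower() for element in text.split()}


def evaluate(object_text: str) -> str:
    """Return "block" if any registered combination meets the threshold."""
    object_tokens = tokenize(object_text)
    for token in object_tokens:
        # Only object tokens that are token keys lead to a tuple lookup.
        for position in INDEX_TABLE.get(token, []):
            registered_tokens = REGISTRATION_LIST[position]
            found = len(registered_tokens & object_tokens)
            if found >= THRESHOLD:
                return "block"  # e.g. prevent transmission of the object
    return "allow"
```

In this sketch, an object containing only a token key is allowed because one matching token falls short of the assumed threshold of two; the action fires only when enough tokens of the same tuple appear together.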
Abstract
A method in one example implementation includes extracting a plurality of data elements from a record of a data file, tokenizing the data elements into tokens, and storing the tokens in a first tuple of a registration list. The method further includes selecting one of the tokens as a token key for the first tuple, where the token is selected because it occurs less frequently in the registration list than each of the other tokens in the first tuple. In specific embodiments, at least one data element is an expression element having a character pattern matching a predefined expression pattern that represents at least two words and a separator between the words. In other embodiments, at least one data element is a word defined by a character pattern of one or more consecutive essential characters. Other specific embodiments include determining an end of the record by recognizing a predefined delimiter.
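The registration side described in the abstract might look like the sketch below. Everything concrete here is an assumption made for illustration: the expression pattern (three digits, a hyphen or space, four digits), the definition of essential characters as alphanumerics, the newline record delimiter, the use of lowercase strings as tokens, and the names `TOKEN_RE` and `register`. Frequencies are counted in a second pass over the finished list, which is a simplification rather than a statement of how the patent orders the steps.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

# Assumed patterns: an expression element (two digit groups joined by a
# separator) is tried first, then a word of consecutive alphanumerics.
TOKEN_RE = re.compile(r"\d{3}[- ]\d{4}|[A-Za-z0-9]+")


def register(data_file: str, delimiter: str = "\n") -> Tuple[List[List[str]], Dict[str, List[int]]]:
    """Build a registration list and an index table of token keys."""
    registration_list: List[List[str]] = []
    # The predefined delimiter marks the end of each record.
    for record in data_file.split(delimiter):
        tokens = [match.group(0).lower() for match in TOKEN_RE.finditer(record)]
        if tokens:
            registration_list.append(tokens)

    # Count how often each token occurs across the whole registration list.
    frequency = Counter(token for entry in registration_list for token in entry)

    index_table: Dict[str, List[int]] = {}
    for position, entry in enumerate(registration_list):
        # The token key is the least frequent token of the tuple; on a tie,
        # min() keeps the earliest token, which is an arbitrary choice here.
        token_key = min(entry, key=lambda token: frequency[token])
        index_table.setdefault(token_key, []).append(position)
    return registration_list, index_table
```

Choosing the rarest token as the key keeps the number of tuples reached from any one index entry small, so a lookup during detection touches few candidates.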
20 Claims
1. At least one non-transitory, computer readable medium comprising instructions that, when executed, cause one or more processors to:
tokenize a plurality of data elements in an object into a plurality of object tokens;
identify, in an index table of token keys, a token key that corresponds to an object token of the plurality of object tokens;
identify a first tuple of a plurality of tuples in a registration list based, at least in part, on the token key, wherein the first tuple includes a set of registered tokens that represents a set of data elements, and the token key is a registered token in the set of registered tokens;
determine a number of the registered tokens that are found in at least one object token of the plurality of object tokens; and
take an action based on determining that the number of the registered tokens satisfies a predetermined threshold associated with the set of data elements, wherein the action is preventing transmission of the object or locking down a storage repository.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. An apparatus, comprising:
a memory device including a set of instructions; and
a processor, coupled to the memory device, that, when executing the set of instructions, is to:
tokenize a plurality of data elements in an object into a plurality of object tokens;
identify, in an index table of token keys, a token key that corresponds to an object token of the plurality of object tokens;
identify a tuple of a plurality of tuples in a registration list based, at least in part, on the token key, wherein the tuple includes a set of registered tokens that represents a set of data elements, and the token key is a registered token in the set of registered tokens;
determine a number of the registered tokens that are found in at least one object token of the plurality of object tokens; and
take an action based on determining that the number of the registered tokens satisfies a predetermined threshold associated with the set of data elements, wherein the action is preventing transmission of the object or locking down a storage repository.
Dependent claims: 14, 15, 16, 17
18. A method, the method comprising:
tokenizing a plurality of data elements in an object into a plurality of object tokens;
identifying, in an index table of token keys, a token key that corresponds to an object token of the plurality of object tokens;
identifying a tuple of a plurality of tuples in a registration list based, at least in part, on the token key, wherein the tuple includes a set of registered tokens that represents a set of data elements, and the token key is a registered token in the set of registered tokens;
determining a number of the registered tokens that are found in at least one object token of the plurality of object tokens; and
taking an action based on determining that the number of the registered tokens satisfies a predetermined threshold associated with the set of data elements, wherein the action is preventing transmission of the object or locking down a storage repository.
Dependent claims: 19, 20
Specification