Mechanism to search information content for preselected data
First Claim
1. A computer implemented method for a computer system having a memory and at least one processor, the method comprising:
receiving, by the processor, an index derived from preselected data having a tabular structure, wherein the index defines the tabular structure of the preselected data, includes data corresponding to an encrypted version of the preselected data, and contains positioning information reflecting relative placement of data items within the tabular structure of the preselected data, the positioning information identifying rows and columns storing the data items of the preselected data;
receiving, by the processor, information content;
detecting, by the processor, in the information content, a sequence of content fragments that is indicative of containing a portion of the preselected data;
determining, by the processor, whether a subset of content fragments within the sequence matches any sub-set of the preselected data using the index derived from the preselected data; and
performing, by the processor, a policy enforcement action based on results of determining whether a subset of content fragments within the sequence matches any sub-set of the preselected data.
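As a rough illustration of the kind of index the claim describes, the following Python sketch stores only a one-way digest of each cell together with its row and column position. Everything here is an assumption made for illustration: the claim does not name a hash function or a normalization rule, SHA-256 is substituted for the "encrypted version" of the data, and the names `fingerprint`, `build_index`, and `lookup` are hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Return a one-way digest of a cell value (assumed stand-in for the
    'encrypted version' of the data; the claim does not specify SHA-256)."""
    # Assumed normalization: lowercase and collapse runs of whitespace.
    normalized = " ".join(value.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_index(table):
    """Build an index from a list of rows, each row a list of strings.

    The index records the table shape plus, for every cell digest, the set
    of (row, column) positions where that value occurs. No plaintext cell
    value is stored.
    """
    cells = defaultdict(set)
    for row_no, row in enumerate(table):
        for col_no, cell in enumerate(row):
            cells[fingerprint(cell)].add((row_no, col_no))
    return {
        "n_rows": len(table),
        "n_cols": len(table[0]) if table else 0,
        "cells": dict(cells),
    }


def lookup(index, fragment: str):
    """Return the (row, column) positions whose stored digest matches the fragment."""
    return index["cells"].get(fingerprint(fragment), set())
```

Because lookups compare digests rather than raw values, a component holding only this index can test content fragments without holding the preselected data in readable form, which is one plausible motivation for the claim's "encrypted version" language.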
Abstract
A method and apparatus for detecting preselected data embedded in information content is described. In one embodiment, the method comprises receiving information content and detecting in the information content a sequence of content fragments that may contain a portion of preselected data. The method further comprises determining whether a sub-set of these content fragments matches any sub-set of the preselected data using an abstract data structure that defines a tabular structure of the preselected data.
20 Claims
1. Independent method claim, recited in full under First Claim above. Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 18.
17. A computer implemented method for a computer system having a memory and at least one processor, the method comprising:
receiving, by the processor, information content;
detecting, by the processor, in the information content, a sequence of content fragments that may contain a portion of preselected data;
determining, by the processor, whether a subset of content fragments within the sequence matches any sub-set of the preselected data using an abstract data structure that defines a tabular structure of the preselected data, wherein the abstract data structure comprises a tuple-storage structure derived from the preselected data, wherein the tuple-storage structure comprises a plurality of tuples, each of the plurality of tuples including a row number of a data item in a corresponding cell of the tabular structure of the preselected data and including a column number and optionally a column type of the data item in the corresponding cell, wherein determining, by the processor, whether a sub-set of the content fragments within the sequence matches any sub-set of the pre-selected data comprises:
finding, by the processor, a set of matching tuples in the abstract data structure for each content fragment in the sequence,
combining, by the processor, sets of matching tuples found for all content fragments in the sequence,
grouping, by the processor, the combined sets of matching tuples by row numbers into groups of matching tuple sets,
sorting, by the processor, the groups of matching tuple sets by the number of matching tuple sets contained in each group,
selecting, by the processor, groups that have matching tuple sets with distinct column numbers, and
determining, by the processor, whether any of the selected groups have a number of matching tuple sets that exceeds a predefined threshold; and
performing, by the processor, a policy enforcement action based on results of determining whether a subset of content fragments within the sequence matches any sub-set of the multi-column tabular preselected data.
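The matching steps recited in claim 17 (find, combine, group by row, sort, select distinct columns, compare to a threshold) can be sketched in Python as follows. This is one possible reading of the claim language, not the patented implementation: the tuple store here is keyed by lowercased plaintext for brevity, column types are omitted, "distinct column numbers" is interpreted as counting each column at most once per row, and the names `build_tuple_store` and `match_rows` are hypothetical.

```python
from collections import defaultdict


def build_tuple_store(table):
    """Map each (lowercased) cell value to a list of (row, column) tuples.

    Assumption: plaintext keys are used only to keep the sketch short.
    """
    store = defaultdict(list)
    for row_no, row in enumerate(table):
        for col_no, cell in enumerate(row):
            store[cell.lower()].append((row_no, col_no))
    return store


def match_rows(fragments, store, threshold):
    """Return [(row, distinct_columns)] for rows matched in more than
    `threshold` distinct columns, largest groups first."""
    # Find the matching tuples for each fragment and combine them.
    combined = []
    for fragment in fragments:
        combined.extend(store.get(fragment.lower(), []))

    # Group the combined tuples by row number.
    groups = defaultdict(list)
    for row_no, col_no in combined:
        groups[row_no].append(col_no)

    # Sort the groups by the number of matches they contain, descending.
    ordered = sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True)

    # Keep only distinct column numbers within each group, then apply the
    # threshold ("exceeds" is read here as strictly greater than).
    hits = []
    for row_no, cols in ordered:
        distinct_cols = sorted(set(cols))
        if len(distinct_cols) > threshold:
            hits.append((row_no, distinct_cols))
    return hits
```

Counting distinct columns rather than raw matches means a single value repeated many times in the content does not by itself push a row over the threshold; a row is reported only when several different cells of that row appear together.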
19. A system comprising:
a memory containing an index derived from preselected data having a tabular structure, wherein the index defines the tabular structure of the preselected data, includes data corresponding to an encrypted version of the preselected data, and contains positioning information reflecting relative placement of data items within the tabular structure of the preselected data, the positioning information identifying rows and columns storing the data items of the preselected data; and
at least one processor coupled to the memory, the at least one processor executing a set of instructions which cause the processor to
receive information content,
detect, in the information content, a sequence of content fragments that is indicative of containing a portion of the preselected data,
determine whether a subset of content fragments within the sequence matches any sub-set of the preselected data using the index, and
perform a policy enforcement action based on results of determining whether a subset of content fragments within the sequence matches any sub-set of the preselected data.
20. A computer readable storage medium that stores instructions, which when executed on a processor cause the processor to perform a method comprising:
receiving an index derived from the preselected data having a tabular structure, wherein the index defines the tabular structure of the preselected data, includes data corresponding to an encrypted version of the preselected data, and contains positioning information reflecting relative placement of data items within the tabular structure of the preselected data, the positioning information identifying rows and columns storing the data items of the preselected data;
receiving information content;
detecting, in the information content, a sequence of content fragments that is indicative of containing a portion of preselected data;
determining whether a subset of content fragments within the sequence matches any sub-set of the preselected data using the index derived from the preselected data; and
performing a policy enforcement action based on results of determining whether a subset of content fragments within the sequence matches any sub-set of the preselected data.
Specification