Method and system for managing confidential information
First Claim
1. A method of controlling usage of information subject to evolution, usage of the information being in information objects, the information objects respectively comprising a plurality of basic information units wherein some of said basic information units change with said evolution and some of said basic information units comprising elementary information units that remain constant with said evolution, the method being carried out on an electronic processor, the method comprising:
prior to said evolution, identifying ones of said information objects to which it is desired to apply a usage policy;
monitoring said information objects during said usage and said evolution;
comparing basic information units of said monitored information objects with basic information units of ones of said identified information objects, to determine a quantity of said elementary information units that are identical in both; and
if said quantity exceeds a predetermined threshold for a given one of said identified information objects then applying a respective usage policy associated with said given identified information object to said monitored information object, thereby managing said information objects in the face of evolution of information therein.
Abstract
A method and a system for information management and control are presented, based on a modular and abstract description of the information. Identifiers are used to identify features of interest in the information, and information use policies are assigned directly or indirectly on the basis of the identifiers. This allows for flexible and efficient policy management and enforcement, in that a policy can be defined with a direct relationship to the actual information content of digital data items. The information content can be of various kinds: e.g., textual documents, numerical spreadsheets, audio and video files, pictures and images, drawings etc. The system can provide protection against information policy breaches such as information misuse, unauthorized distribution and leakage, and can be used for information tracking.
-
172 Claims
-
1. A method of controlling usage of information subject to evolution, usage of the information being in information objects, the information objects respectively comprising a plurality of basic information units wherein some of said basic information units change with said evolution and some of said basic information units comprising elementary information units that remain constant with said evolution, the method being carried out on an electronic processor, the method comprising:
-
prior to said evolution, identifying ones of said information objects to which it is desired to apply a usage policy;
monitoring said information objects during said usage and said evolution;
comparing basic information units of said monitored information objects with basic information units of ones of said identified information objects, to determine a quantity of said elementary information units that are identical in both; and
if said quantity exceeds a predetermined threshold for a given one of said identified information objects then applying a respective usage policy associated with said given identified information object to said monitored information object, thereby managing said information objects in the face of evolution of information therein.
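The comparison and threshold steps of claim 1 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that elementary information units are lower-cased sentences; the names `IdentifiedObject`, `split_units`, and `policies_for`, as well as the policy string, are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentifiedObject:
    """A previously identified information object and its usage policy (hypothetical)."""
    name: str
    units: frozenset  # elementary information units, here: sentences
    policy: str


def split_units(text: str) -> frozenset:
    """Illustrative assumption: each non-empty sentence is one elementary unit."""
    return frozenset(s.strip().lower() for s in text.split(".") if s.strip())


def policies_for(monitored_text: str, identified: list, threshold: int) -> list:
    """Return the policies of identified objects sharing more than `threshold` units."""
    monitored_units = split_units(monitored_text)
    applied = []
    for obj in identified:
        shared = len(monitored_units & obj.units)  # quantity of identical units
        if shared > threshold:
            applied.append(obj.policy)
    return applied


original = "Revenue grew 12 percent. The merger closes in May. Staff count is flat."
registry = [IdentifiedObject("q1-report", split_units(original), "block-external-email")]

# An evolved copy: one sentence rewritten, two left unchanged.
evolved = "Revenue rose sharply. The merger closes in May. Staff count is flat."
result = policies_for(evolved, registry, threshold=1)
```

Here the evolved text still shares two sentences with the identified object, so the policy follows it even though part of the content changed.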
-
2. A method according to claim 1, wherein said information objects comprise at least one simple information object, said simple information object comprising one of the following:
-
an elementary information unit;
a set of elementary information units; and
an ordered set of elementary information units.
-
-
3. A method according to claim 2, wherein said information objects comprise at least one compound information object, said compound information object comprising at least one of the following:
a simple information object;
a compound information object;
an ordered set of compound information objects;
an ordered set of simple information objects; and
an ordered set of compound and simple information objects.
-
4. A method according to claim 1, wherein said elementary information units comprise at least one of the following:
a sentence;
a sequence of words;
a word;
a sequence of characters;
a character;
a sequence of numbers;
a number;
a sequence of digits;
a digit;
a vector;
a curve;
a pixel;
a block of pixels;
an audio frame;
a musical note;
a musical bar;
a visual object;
a sequence of video frames;
a sequence of musical notes;
a sequence of musical bars; and
a video frame.
-
5. A method according to claim 2, further comprising assigning elementary information units identifiers to elementary information units after identification.
-
6. A method according to claim 2, wherein said elementary information unit identifiers are utilized in said identifying.
-
7. A method according to claim 5, wherein said elementary information unit identifiers are determined by the content of said elementary information units which they are assigned to.
-
8. A method according to claim 7, wherein said elementary information unit identifiers are solely determined by said content.
-
9. A method according to claim 5, wherein said elementary information units identifiers are at least partly determined by locations within an information object of respective elementary information units to which they are assigned.
-
10. A method according to claim 5, wherein said elementary information units identifiers are at least partly determined by the content of an elementary information unit in proximity to said elementary information units to which they are assigned.
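Claims 7 to 10 distinguish identifiers determined by content alone, by location, and by a neighboring unit. The following sketch shows one way this could look, assuming a truncated SHA-256 digest as the identifier; the function names and the choice of hash are illustrative assumptions, not the patent's method.

```python
import hashlib


def _digest(data: str) -> str:
    """Truncated SHA-256 hex digest used as an identifier (illustrative choice)."""
    return hashlib.sha256(data.encode("utf-8")).hexdigest()[:16]


def content_id(unit: str) -> str:
    """Identifier determined solely by the unit's content (cf. claim 8)."""
    return _digest(unit)


def positional_id(unit: str, index: int) -> str:
    """Identifier partly determined by the unit's location (cf. claim 9)."""
    return _digest(f"{index}:{unit}")


def contextual_id(unit: str, neighbor: str) -> str:
    """Identifier partly determined by a neighboring unit's content (cf. claim 10)."""
    return _digest(f"{neighbor}|{unit}")


units = ["alpha", "beta", "alpha"]
content_ids = [content_id(u) for u in units]
positional_ids = [positional_id(u, i) for i, u in enumerate(units)]
contextual_ids = [
    contextual_id(u, units[i - 1] if i > 0 else "") for i, u in enumerate(units)
]
```

The two occurrences of "alpha" receive the same content identifier but different positional and contextual identifiers, which is the trade-off between robustness to reordering and sensitivity to context.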
-
11. A method according to claim 5, comprising storing said elementary information units identifiers in a database.
-
12. A method according to claim 11, further comprising using said elementary information units identifiers stored in said database for identifying at least one further, unidentified, information object.
-
13. A method according to claim 11, further comprising using said elementary information units identifiers stored in said database for comparing information objects.
-
14. A method according to claim 5, comprising storing only some of said elementary information units identifiers in a database.
-
15. A method according to claim 14, wherein said storing of only some of said elementary information units identifiers in a database is to achieve at least one of the following:
-
reduce storage cost;
increase efficiency of assigning of said elementary information units identifiers to said elementary information units by only performing said assignment for elementary information units identifiers that are stored in said database; and
increase the efficiency of searching for said elementary information units identifiers in said database.
-
-
16. A method according to claim 14, wherein said storage of only some of said elementary information units identifiers in a database is done in a manner that ensures that any area of a given size in said information object contains a predetermined minimum number of said stored elementary information units.
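One known way to store only some identifiers while guaranteeing a minimum density is winnowing-style selection, in which the smallest hash in every window of consecutive units is kept. This is a swapped-in technique offered as a sketch, not something the claim names; `unit_hash`, `select_positions`, and the window size are assumptions. It guarantees at least one stored unit per window.

```python
import hashlib


def unit_hash(unit: str) -> int:
    """64-bit integer hash of a unit (illustrative choice)."""
    return int.from_bytes(hashlib.sha256(unit.encode("utf-8")).digest()[:8], "big")


def select_positions(units: list, window: int) -> set:
    """Keep, for every window of `window` consecutive units, the position
    holding the smallest hash, so each window contains a stored unit."""
    if not units:
        return set()
    hashes = [unit_hash(u) for u in units]
    kept = set()
    last_start = max(0, len(hashes) - window)
    for start in range(last_start + 1):
        span = range(start, min(start + window, len(hashes)))
        kept.add(min(span, key=hashes.__getitem__))
    return kept


units = [f"word{i}" for i in range(50)]
kept = select_positions(units, window=5)
```

Making the window size depend on object properties such as confidentiality level, as in claims 17 and 18, would amount to choosing `window` per object.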
-
17. A method according to claim 16, wherein said given size is dependent on properties of a respective information object.
-
18. A method according to claim 17, wherein said properties of said information object comprise at least one of the following:
importance;
size;
confidentiality level; and
format.
-
19. A method according to claim 16, wherein said minimum number is dependent on properties of said information object.
-
20. A method according to claim 19, wherein said properties of said information object comprise at least one of the following:
importance;
size;
confidentiality level; and
format.
-
21. A method according to claim 5, comprising applying preprocessing to said elementary information units before assigning identifiers thereto.
-
22. A method according to claim 21, wherein said preprocessing is done in order to enhance at least one of efficiency and robustness.
-
23. A method according to claim 21, wherein said preprocessing comprises at least one of the following:
canonization;
removal of common words;
removal of words not having a substantial effect on the meaning of a text;
removal of punctuation;
correction of spelling;
canonization of spelling;
scene detection;
canonizing size;
canonizing orientation;
canonizing color;
removing color;
reducing noise;
enhancing area separation;
enhancing borders;
enhancing lines;
sharpening;
blurring;
removal of elementary information units substantially similar to neighboring elementary information units;
canonization of grammar; and
transformation to a phonetic representation.
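A few of the text preprocessing steps listed in claim 23 (canonization of case, removal of punctuation, removal of common words) can be sketched as follows. The stop-word list `COMMON_WORDS` and the function name `preprocess` are hypothetical; a real system would likely use a fuller word list.

```python
import re

# Hypothetical list of common words; chosen only for this example.
COMMON_WORDS = {"a", "an", "the", "of", "and", "to", "in", "is"}


def preprocess(text: str) -> list:
    """Canonize case, strip punctuation, and drop common words."""
    lowered = text.lower()
    no_punct = re.sub(r"[^\w\s]", "", lowered)
    return [w for w in no_punct.split() if w not in COMMON_WORDS]


a = preprocess("The merger closes in May, 2024!")
b = preprocess("merger  closes  MAY 2024")
```

Both inputs reduce to the same word sequence, so identifiers assigned after preprocessing would survive these superficial edits.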
-
24. A method according to claim 21, comprising carrying out said preprocessing so as to ensure that any area of a given size in said information object contains at least a predetermined number of said elementary information units having an assigned elementary information unit identifier.
-
25. A method according to claim 24, wherein said given size is dependent on properties of said information object.
-
26. A method according to claim 25, wherein said properties of said information object comprise at least one of a group comprising:
importance;
size;
confidentiality level; and
format.
-
27. A method according to claim 24, wherein said predetermined number is dependent on properties of said information object.
-
28. A method according to claim 27, wherein said properties of said information object comprise at least one of a group comprising:
importance;
size;
confidentiality level; and
format.
-
29. A method according to claim 5, comprising formulating respective assigned elementary information unit identifier to be resilient to small errors.
-
30. A method according to claim 5, wherein said assigning of elementary information unit identifier utilizes image matching.
-
31. A method according to claim 5, wherein said assigning of elementary information unit identifier comprises a mapping to a Euclidean space.
-
32. A method according to claim 31, wherein said mapping to a Euclidean space comprises approximating a pairwise difference between elementary information units.
-
33. A method according to claim 32, wherein said approximating is such that a difference between two elementary information units approximates said pairwise difference between said two elementary information units.
-
34. A method according to claim 32, wherein said approximation of said pairwise difference between elementary information units comprises an approximation of at least one of the following:
semantic difference;
distance measured by image matching;
phonetic difference; and
spelling difference.
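As an illustration of claims 31 to 34, a word can be mapped to a vector of character-bigram counts, so that the Euclidean distance between two vectors roughly tracks their spelling difference. The bigram embedding is an assumption chosen for simplicity; the claims do not prescribe a particular mapping.

```python
import math
from collections import Counter


def bigram_vector(word: str) -> Counter:
    """Sparse vector of character-bigram counts, with start/end markers."""
    padded = f"^{word.lower()}$"
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def euclidean(u: Counter, v: Counter) -> float:
    """Euclidean distance between two sparse count vectors."""
    keys = set(u) | set(v)
    return math.sqrt(sum((u[k] - v[k]) ** 2 for k in keys))


# A transposition typo stays close; an unrelated word lands far away.
near = euclidean(bigram_vector("confidential"), bigram_vector("confidentail"))
far = euclidean(bigram_vector("confidential"), bigram_vector("spreadsheet"))
```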
-
35. A method according to claim 5, wherein said assigning of said elementary information unit identifier is carried out a plurality of times, each time utilizing a different method for assigning of an elementary information unit identifier.
-
36. A method according to claim 35, wherein said assigning of elementary information unit identifier several times comprises storing separately the elementary information unit identifiers assigned utilizing different methods.
-
37. A method according to claim 35, wherein said assigning of said elementary information unit identifier several times comprises storing the elementary information unit identifiers such that identifiers assigned utilizing different methods can be distinguished according to said method utilized to assign them.
-
38. A method according to claim 35, wherein said different methods are selected such as to optimize between at least any two of the following:
storage space;
search speed;
capability to detect transformation;
capability to detect a specific transformation;
resilience to transformation;
resolution of identification from among similar information objects;
resolution of identification of boundaries within compound information objects;
resilience to a specific transformation; and
resilience to transformation.
-
39. A method according to claim 35, wherein said assigning utilizing different methods comprises utilizing said different methods sequentially until a predetermined stop condition is reached.
-
40. A method according to claim 5, wherein said assigning of a respective elementary information unit identifier comprises utilizing a method having at least one of the following characteristics:
order sensitive to data in the elementary information unit;
order insensitive in the elementary information unit;
utilizing changing definitions of the elementary information unit such that said assigning of said elementary information unit identifier is carried out a plurality of times using a plurality of definitions;
utilizing an exchangeable method of preprocessing, such that said assigning of said elementary information unit identifier is carried out several times;
being omission resilient;
being insertion resilient;
being replacement resilient;
being dictionary based;
being distribution based;
being locality based;
being histogram based; and
being n-gram based.
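Two of the characteristics in claim 40, order sensitivity and n-gram basis, can be contrasted in a short sketch. It assumes word n-grams as elementary units and a truncated SHA-256 digest as the identifier; all function names are illustrative.

```python
import hashlib


def _digest(data: str) -> str:
    return hashlib.sha256(data.encode("utf-8")).hexdigest()[:12]


def ngrams(words: list, n: int) -> list:
    """All runs of n consecutive words."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def order_sensitive_id(gram: tuple) -> str:
    """Identifier that changes when words inside the unit are reordered."""
    return _digest(" ".join(gram))


def order_insensitive_id(gram: tuple) -> str:
    """Identifier that ignores word order inside the unit."""
    return _digest(" ".join(sorted(gram)))


bigrams = ngrams("the quick brown fox".split(), 2)
swapped = ("quick", "the")
```

An order-insensitive identifier tolerates local shuffling at the cost of more accidental matches, which is the kind of trade-off claim 38 refers to.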
-
41. A method according to claim 5, wherein said information object comprises spreadsheet data, and wherein said assigning of said elementary information unit identifier assigned to said information object comprises utilizing a method comprising at least one of the following characteristics:
invariance to linear transformation;
invariance to reordering;
invariance to permutation;
resilience to linear transformation;
resilience to reordering;
resilience to permutation;
resilience to minor changes;
resilience to cuts;
utilizing of statistic moment;
utilizing of statistic moment for a table;
utilizing statistic moment for a row;
utilizing statistic moment for a column; and
utilizing a mathematical descriptor of the information object data.
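For spreadsheet data, standardized statistical moments of a column are one example of a descriptor that is unchanged by reordering the rows and by a linear transformation with a positive scale factor. The sketch below assumes population skewness and kurtosis as the descriptor; `column_descriptor` is a hypothetical name, and a negative scale factor would flip the sign of the skewness.

```python
import statistics


def column_descriptor(values: list) -> tuple:
    """Standardized third and fourth moments (skewness, kurtosis) of a column."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return (0.0, 0.0)  # constant column: no shape information
    z = [(v - mean) / std for v in values]
    skew = statistics.fmean([x ** 3 for x in z])
    kurt = statistics.fmean([x ** 4 for x in z])
    return (skew, kurt)


column = [3.0, 8.0, 1.0, 12.0, 5.0]
reordered = list(reversed(column))            # rows permuted
rescaled = [2.5 * v + 100.0 for v in column]  # e.g. a unit or currency change

d_original = column_descriptor(column)
d_reordered = column_descriptor(reordered)
d_rescaled = column_descriptor(rescaled)
```

The three descriptors agree up to floating-point error, so a comparison with a small tolerance would treat the three columns as the same data.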
-
42. A method according to claim 5, comprising utilizing said elementary information unit identifiers for said information object identification using a technique having at least one of the following characteristics:
omission resilience;
insertion resilience;
replacement resilience;
being dictionary based;
being distribution based;
being locality based;
being based on the size of elementary information units;
being based on the size of information objects;
resilience to linear transformation;
resilience to reordering;
resilience to permutation;
resilience to minor changes;
resilience to cuts;
being histogram based; and
being n-gram based.
-
43. A method according to claim 5, comprising using said determining to locate at least one information object with similar content to a given information object.
-
44. A method according to claim 43, wherein said locating is done in an information storage medium.
-
45. A method according to claim 44, further comprising utilizing a crawler for automatic location of information objects within said information storage medium.
-
46. A method according to claim 44, wherein said information storage medium comprises at least one file system.
-
47. A method according to claim 1, wherein said information object identification is carried out on an instance of said information object, said information object instance being said information object in a specific format.
-
48. A method according to claim 47, wherein said format comprises at least one of the following:
jpeg image;
gif image;
Word document format;
Lotus notes format;
mpeg format;
text format;
rich text format;
Unicode text format;
multi byte text encoding format;
formatted text format;
ASCII text format;
HTML;
XML;
PDF;
postscript;
MS-Excel spreadsheet;
MS-Excel drawing;
MS-Visio drawing;
Photoshop drawing;
AutoCAD drawing format; and
CAD drawing format.
-
49. A method according to claim 1, wherein said information comprises at least one of the following:
numeric data;
spreadsheet data;
numeric spreadsheet data;
textual spreadsheet data;
word processor data;
textual data;
hyper text data;
audio data;
visual data;
multimedia data;
binary data;
raw data;
database data;
video data;
drawing data;
chart data;
picture data; and
image data.
-
50. A method according to claim 1, comprising using said determined quantity to attach to said information object an information object policy, said policy comprising at least one of the following:
-
an allowed distribution of said information object;
a restriction on distribution of said information object;
an allowed storage of said information object;
a restriction on storage of said information object;
an action to be taken as a reaction to an event;
an allowed usage of said information object; and
a restriction on usage of said information object.
-
-
51. A method according to claim 50, wherein said information object policy comprises at least one action to be taken as a reaction to an event, and wherein said action comprises at least one of the following:
preventing distribution of said information object;
preventing storage of said information object;
preventing usage of said information object;
reporting distribution of said information object;
reporting storage of said information object;
reporting usage of said information object;
reporting;
alerting about distribution of said information object;
alerting about storage of said information object;
alerting about usage of said information object;
alerting;
logging distribution of said information object;
logging storage of said information object;
logging usage of said information object;
logging;
notifying about distribution of said information object;
notifying about storage of said information object;
notifying about usage of said information object;
notifying;
notifying to an administrator;
notifying to a manager;
notifying to a recipient;
notifying to a sender;
notifying to an owner of said information object;
quarantine;
alerting an administrator;
alerting a manager;
alerting a recipient;
alerting a sender;
alerting an owner of said information object;
reporting to an administrator;
reporting to a manager;
reporting to a recipient;
reporting to a sender;
reporting to an owner of said information object;
encrypting said information object;
changing said information object;
replacing said information object; and
utilizing digital rights management technology on said information object.
-
52. A method according to claim 50, wherein said information object policy comprises at least one action to be taken as a reaction to an event, and wherein said event comprises at least one of the following:
-
attempted distribution of said information object;
attempted storage of said information object;
attempted usage of said information object;
distribution of said information object;
storage of said information object; and
usage of said information object.
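The pairing of events (claim 52) with actions (claim 51) can be pictured as a lookup table inside the policy. The sketch below uses a small, hypothetical subset of the listed events and actions; the enum names and the example policy are assumptions for illustration.

```python
from enum import Enum


class Event(Enum):
    ATTEMPTED_DISTRIBUTION = "attempted distribution"
    ATTEMPTED_STORAGE = "attempted storage"
    USAGE = "usage"


class Action(Enum):
    PREVENT = "prevent"
    LOG = "log"
    ALERT_ADMINISTRATOR = "alert administrator"


# Hypothetical policy for one information object: event -> ordered actions.
POLICY = {
    Event.ATTEMPTED_DISTRIBUTION: [Action.PREVENT, Action.ALERT_ADMINISTRATOR],
    Event.USAGE: [Action.LOG],
}


def react(event: Event) -> list:
    """Return the actions the policy prescribes for an event (none if unlisted)."""
    return POLICY.get(event, [])


on_send = react(Event.ATTEMPTED_DISTRIBUTION)
on_save = react(Event.ATTEMPTED_STORAGE)
```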
-
-
53. A method according to claim 52, wherein said policy comprises at least one mandatory lifecycle.
-
54. A method according to claim 53, wherein said action is dependent on the matching of said mandatory lifecycle with a lifecycle of a respective event.
-
55. A method according to claim 53, wherein said mandatory lifecycle comprises at least one mandatory recipient of said information object; and an order of events concerning said information object.
-
56. A method according to claim 50, wherein said information object usage comprises at least one of the following:
copying an excerpt;
editing;
copying to clipboard;
copying an excerpt to clipboard;
changing format;
changing encoding;
encryption;
decryption;
changing digital management;
opening by an application; and
printing.
-
57. A method according to claim 50, wherein said information object policy comprises placing a substantially imperceptible marking in said information object, said marking comprising information content, and said method comprising placing said marking, when indicated by said policy, before allowing at least one of the following:
storage of said information object;
usage of said information object; and
distribution of said information object.
-
58. A method according to claim 57, wherein said information content for storage in said marking comprises at least one of the following:
-
the identity of said information object;
the identity of a user performing the action in respect to said information object;
the identity of a user authorizing the action in respect to said information object;
the identity of a user overriding policy and approving the action in respect to said information object; and
the identity of a user requesting the action in respect to said information object.
-
-
59. A method according to claim 50, wherein said information object policy further comprises changing said information object by at least one of the following:
-
deleting part of said information object;
replacing part of said information object; and
inserting an additional part to said information object
before allowing at least one of the following actions:
storage of said information object;
usage of said information object; and
distribution of said information object.
-
-
60. A method according to claim 59, wherein said changing of said information object is done in order to eliminate parts having policies that do not allow for said action to be executed while they are in the document.
-
61. A method according to claim 59, wherein said changing of said information object is carried out in order to personalize said information object.
-
62. A method according to claim 59, wherein said changing of said information object is carried out in order to customize said information object for a specific use.
-
63. A method according to claim 59, wherein said changing of said information object is done in a manner selected to achieve at least one of the following:
preserving the coherency of said information object;
seamlessness;
preserving the structure of said information object;
preserving the linguistic coherency of said information object;
preserving the formatting style of said information object; and
preserving the pagination style of said information object.
-
64. A method according to claim 59, wherein said information objects comprise compound information objects and wherein said changing of said information object is made to constituent parts of a compound information object.
-
65. A method according to claim 59, wherein said inserting an additional part to said information object comprises inserting at least one of the following:
a header;
a footer; and
a disclaimer.
-
66. A method according to claim 50, wherein said storing comprises storage in at least one of the following:
a portable media device;
a floppy disk;
a hard drive;
a portable hard drive;
a flash card;
a flash device;
disk on key;
magnetic tape;
magnetic media;
optic media;
punched cards;
a machine readable media;
a CD;
a DVD;
a firewire device;
a USB device; and
a hand held computer.
-
67. A method according to claim 50, wherein said policy comprises distribution regulation, said distribution regulation being for regulating at least one of the following:
-
sending said information object via mail;
sending said information object via web mail;
uploading said information object to a web server;
uploading said information object to a FTP server;
sending said information object via a file transfer application;
sending said information object via an instant messaging application;
sending said information object via a file transfer protocol; and
sending said information object via an instant messaging protocol.
-
-
68. A method according to claim 50, wherein said policy is dependent on at least one of the following:
the domain of a respective information object;
the identity of a system;
the identity of a user;
the identity level of a user authorizing an action;
the identity of a user requesting an action;
the identity of a user involved in an action;
the identity of a user receiving an information object;
the authentication level of a system;
the authentication level of a user;
the authentication level of a user requesting an action;
the authentication level of a user authorizing an action;
the authentication level of a user involved in an action;
the authentication level of a user receiving said information object;
the authentication level of a user sending said information object;
the format of an information object instance;
an interface being used;
an application being used;
encryption being used;
digital rights management technology being used;
detection of transformation, wherein said transformation is operable to reduce the ability to identify said transformed information object;
information object integrity;
regular usage pattern;
regular distribution pattern;
regular storage pattern;
information path;
consistency of an action with usage pattern;
the identity of a user overriding policy and authorizing the action in respect to said information object;
the authentication level of a user overriding policy and authorizing the action in respect to said information object;
the identity of a user sending information object;
information property of said information object;
language of said information object;
representation of said information object;
operations done on said information object;
identity of users involved along the life cycle of said information object;
application used on said information object;
transition channel of said information object;
participant agents;
virtual location of a computer;
logical location of a computer;
physical location of a computer;
type of a computer;
type of a laptop computer;
type of a desktop computer;
type of a server computer; and
owner identity.
-
69. A method according to claim 50 or claim 68, further comprising defining areas and wherein said policy is dependent on whether an action is taken inside a given defined area.
-
70. A method according to claim 50 or claim 68, further comprising defining areas and wherein said policy is dependent on whether an event occurs inside a given defined area.
-
71. A method according to claim 50, further comprising enabling at least one user to override at least one of the decisions contained within said policy.
-
72. A method according to claim 50, wherein at least part of said policy is stored in a database.
-
73. A method according to claim 50, wherein at least part of said policy is defined in terms of a logic expression.
-
74. A method according to claim 73, wherein said expression is evaluated by lazy evaluation.
-
75. A method according to claim 73, wherein at least some of the variables in said logic expression comprise at least one of the following:
an external function;
an external function based on group membership; and
an external variable.
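Claims 73 to 75 describe a policy expressed as a logic expression, evaluated lazily, whose variables may be external functions. A minimal sketch of that idea follows, assuming the expression is built from Python callables; the predicates `is_confidential` and `in_group`, the group name, and the context keys are hypothetical.

```python
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
Predicate = Callable[[Context], bool]

calls: List[str] = []  # records which variables were actually evaluated


def is_confidential(ctx: Context) -> bool:
    calls.append("is_confidential")
    return bool(ctx["confidential"])


def in_group(group: str) -> Predicate:
    """Hypothetical external function based on group membership."""
    def check(ctx: Context) -> bool:
        calls.append(f"in_group:{group}")
        return group in ctx["groups"]
    return check


def negate(pred: Predicate) -> Predicate:
    return lambda ctx: not pred(ctx)


def any_of(*preds: Predicate) -> Predicate:
    # any() consumes the generator lazily and stops at the first True.
    return lambda ctx: any(p(ctx) for p in preds)


# Policy: distribution is allowed if the object is not confidential
# OR the sender belongs to the (hypothetical) "legal" group.
allow_distribution = any_of(negate(is_confidential), in_group("legal"))

public_ok = allow_distribution({"confidential": False, "groups": set()})
calls_for_public = list(calls)
```

For the non-confidential object the group lookup is never called, which matters when an external function is expensive, for example a directory query.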
-
76. A method according to claim 50, comprising defining an information class as a group consisting of at least two information objects, said defining further comprising associating with said information class a corresponding class policy being a policy shared by said information objects.
-
77. A method according to claim 76, wherein said information class policy comprises at least a part of respective policies of said information objects within said class.
-
78. A method according to claim 76, wherein said information class is a knowledge class.
-
79. A method according to claim 50, wherein at least part of said policy is defined in terms of any one of a group comprising rules, imposed restrictions, granted privileges, reaction to one or more given events, group operations, a property of said information object, a property of a user, a property of a computer, a property of an entity, and a hierarchy of calculations.
-
80. A method according to claim 50, wherein at least part of said policy is defined in terms of a role, wherein said role consists of a property of at least one of a user and a system and wherein said role further comprises at least one authorization.
-
81. A method according to claim 50, wherein at least part of said policy is defined in terms of at least one of the following languages:
a scripting language;
an ordered calculation language;
a programming language;
an interpreted language; and
a functional language.
-
82. A method according to claim 81, wherein said at least one of said following languages comprises instructions for the operation of an ordered calculation resulting in at least one of the following:
policy;
instruction to perform an action;
restriction; and
allowance.
-
83. A method according to claim 50, wherein said information object is a compound information object comprising constituent simple information objects, and a respective policy assigned to said information object comprises different policies for at least some of said constituent information objects.
-
84. A method according to claim 50, comprising using organizational structure information in order to assign a respective policy object.
-
85. A method according to claim 50, wherein at least a part of an information object policy of a respective information object is derived from a default information object policy when said part of said information object policy of said information object is not explicitly defined.
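The fallback to a default policy in claim 85 can be sketched with a layered lookup, in which explicitly defined parts take precedence and undefined parts come from the default. The policy keys and values below are hypothetical.

```python
from collections import ChainMap

# Hypothetical default information object policy.
DEFAULT_POLICY = {
    "distribution": "internal-only",
    "storage": "allowed",
    "printing": "allowed",
}


def effective_policy(explicit: dict) -> ChainMap:
    """Explicit entries win; anything not defined falls back to the default."""
    return ChainMap(explicit, DEFAULT_POLICY)


policy = effective_policy({"distribution": "blocked"})
```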
-
86. A method according to claim 50, wherein an information object policy comprises at least some information about one or more methods utilized for assigning of an elementary information unit identifier to a respective information object.
-
87. A method according to claim 50, further comprising changing access control information in accordance with said policy.
-
88. A method according to claim 50, comprising attaching a respective policy to information objects according to their logical location within an information storage medium.
-
89. A method according to claim 88, further comprising utilizing a crawler for automatic location of information objects within said information storage medium.
-
90. A method according to claim 89, wherein said information storage medium is a file system.
-
91. A method according to claim 1, wherein said determining said determined quantity comprises utilizing conditional probabilities for at least one of the following:
-
identification of information objects;
classification of information objects; and
identification of a knowledge domain of information objects.
-
-
92. A method according to claim 1, wherein said determining said determined quantity further comprises utilizing keywords for at least one of the following:
identification of information objects;
identification of elementary information units;
classification of information objects; and
identification of the domain of information objects.
-
93. A method according to claim 92, wherein said keywords are stored in a database.
-
94. A method according to claim 92, wherein said keywords are stored in at least one of the following forms:
hash value;
raw string; and
numeric representation.
-
95. A method according to claim 1, wherein at least one user is defined in an owner definition as an owner of said information object.
-
96. A method according to claim 95, wherein said owner definition is stored in a database.
-
97. A method according to claim 1, wherein said determining said quantity further comprises utilizing organizational structure information.
-
98. A method according to claim 97, wherein said organizational structure information comprises at least one of the following:
user superiority;
working groups;
organizational hierarchy;
departmental separation; and
membership in working groups.
-
99. A method according to claim 97, wherein at least part of said organizational structure information is stored in a database.
-
100. A method according to claim 97, wherein at least part of said organizational structure information is used for information object classification.
-
101. A method according to claim 97, wherein at least part of said organizational structure information is imported from at least one of the following:
organizational data system;
data management system;
organizational data management system;
knowledge management system;
user directory;
LDAP server;
document; and
an organizational chart.
-
102. A method according to claim 1, further comprising making use of at least one user interface operable to assist in at least one of the following:
classification;
policy definition;
template definition;
approving and revising automatic template definition;
importing organizational structure information;
revising organizational structure information;
producing reports;
overriding policy decisions; and
providing authorizations.
-
103. A method according to claim 1, further comprising using template information objects to represent commonly repeated information, such that a template information object together with a difference information object representing instance specific information are together formable to produce a compound information object in which common and specific information are respectively identifiable.
-
104. A method according to claim 103, comprising using said template information object in identifying any of unknown information object comprising information corresponding to said template information object.
-
105. A method according to claim 103, wherein said template information object is a compound information object, wherein said template information object comprises at least one placeholder, and wherein said method comprises replacing said placeholder by at least part of said difference information object when said difference information object and a respective template information object are combined.
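By way of non-limiting illustration only, and forming no part of the claims, the combination of a template information object with a difference information object might be sketched as follows. The brace-delimited placeholder syntax, the function name `combine`, and the "common"/"specific" labels are assumptions made for this sketch and do not appear in the source.

```python
from typing import Dict, List, Tuple


def combine(template: List[str], difference: Dict[str, str]) -> List[Tuple[str, str]]:
    """Merge a template with instance-specific values.

    Returns (unit, origin) pairs, where origin is "common" for units taken
    from the template and "specific" for units taken from the difference
    object, so that both kinds remain identifiable in the compound object.
    """
    compound: List[Tuple[str, str]] = []
    for unit in template:
        # Hypothetical placeholder convention: a unit written as "{name}".
        is_placeholder = unit.startswith("{") and unit.endswith("}")
        if is_placeholder and unit[1:-1] in difference:
            compound.append((difference[unit[1:-1]], "specific"))
        else:
            # Ordinary template units (and unfilled placeholders) pass through.
            compound.append((unit, "common"))
    return compound


template = ["INVOICE", "{customer}", "Payment is due within 30 days.", "{amount}"]
difference = {"customer": "Acme Ltd", "amount": "1,200 USD"}
compound = combine(template, difference)
```

Keeping the origin label beside each unit is one simple way to let a later comparison ignore or weight the commonly repeated part separately from the instance-specific part.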
-
106. A method according to claim 105, wherein at least one of said placeholders is a specialized placeholder, said specialized placeholder comprising specialization information to identify a respective specialization of said specialized placeholder.
-
107. A method according to claim 106, wherein said specialized placeholder comprises a restriction about information objects permitted for replacing said specialized placeholder, and wherein said restriction comprises a rule for excluding at least one of the following:
an object comprising numeric information;
an object comprising a word;
an object comprising a character;
an object comprising a digit;
an object comprising a sentence; and
an object comprising a simple information object.
-
108. A method according to claim 105, wherein at least one of said placeholders is a specialized placeholder, said specialized placeholder comprising a restriction about information objects permitted for replacing said specialized placeholder.
-
109. A method according to claim 103, wherein said template information object comprises at least one of a group comprising:
a disclaimer;
a form;
a header;
a footer;
a contract; and
an invoice.
-
110. A method according to claim 103, comprising defining a template information object and wherein said defining comprises automatically identifying a template information object candidate.
-
111. A method according to claim 110, wherein said automatically identifying a template information object candidate comprises identification of shared elementary information units of at least two information objects.
-
112. A method according to claim 110, wherein said step of automatically identifying a template information object candidate comprises identification of substantially similar information objects.
-
113. A method according to claim 110, wherein said step of automatically identifying a template information object candidate comprises the use of at least one of text parsing and text matching.
-
114. A method according to claim 103, comprising deriving at least a part of a respective information object policy associated with a template instance information object from an information object policy of a respective originating template information object.
-
115. A method according to claim 1, further comprising a stage of detection of information objects having undergone transformations.
-
116. A method according to claim 115, wherein said stage of detection is aimed for detection of transformations intended to reduce the ability to identify said information object.
-
117. A method according to claim 115, wherein said detection of information objects that have undergone transformation comprises detection of at least one of a group comprising:
transformation artifacts;
spelling mistakes;
wrong grammar;
wrong punctuation;
wrong capitalization;
missing punctuation;
missing capitalization;
irregular word distribution;
lack of common words;
predominance of unknown words;
inconsistent headers;
headers inconsistent with file type;
headers inconsistent with file content;
file type inconsistent with file content;
irregular distribution of characters;
irregular distribution of words;
irregular distribution of character sequences;
irregular distribution of word sequences;
irregular length of words;
irregular length of sentences;
irregular distribution of length of words;
irregular distribution of length of sentences;
irregular file format;
irregular file encoding;
unknown file format;
unknown file encoding;
mix of non-alphabetic characters;
unopenable file;
action time;
information object creation time;
information object update time;
encryption; and
an unexpectedly high level of entropy.
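By way of non-limiting illustration only, and forming no part of the claims, one of the listed indicators, an unexpectedly high level of entropy, might be estimated with the standard Shannon entropy of the byte distribution. The function names and the 7.5 bits-per-byte cutoff are assumptions chosen for this sketch; the source does not specify how entropy is measured or what level counts as unexpected.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_transformed(data: bytes, cutoff: float = 7.5) -> bool:
    """Flag data whose entropy exceeds a hypothetical cutoff.

    Encrypted or compressed content tends toward 8 bits per byte, whereas
    natural-language text is typically well below that.
    """
    return shannon_entropy(data) > cutoff
```

Entropy alone cannot distinguish encryption from ordinary compression, so in practice such a test would presumably be combined with others from the list above.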
-
118. A method according to claim 1, wherein said information object is a knowledge object.
-
119. A method according to claim 2, wherein said elementary information unit is an elementary fact.
-
120. A method according to claim 119, wherein said elementary fact comprises at least one of the following:
sentence;
database entry;
representation independent description of knowledge;
modular description of knowledge; and
abstract description of knowledge.
-
121. A method according to claim 1, further comprising a stage of discerning lifecycle information about a respective information object.
-
122. A method according to claim 121, wherein said discerning of information about the lifecycle of said information object comprises utilizing information about sharing of at least one elementary information unit in said information object, wherein said elementary information unit is shared with at least one additional information object.
-
123. A method according to claim 121, wherein said discerning of information about the lifecycle of said information object is based on at least one of a group comprising:
file system date information;
information about editing of said information object; and
information about registration of said information object.
-
124. A method according to claim 121, comprising utilizing said information about the lifecycle of said information object for the creation of a lifecycle graph.
-
125. A method according to claim 121, comprising utilizing said information about the lifecycle of said information object to define at least part of the policy of said information object, said utilizing comprising identifying at least one other information object along said information object's lifecycle and examining a policy associated therewith.
-
126. A method according to claim 1, further comprising utilizing a client.
-
127. A method according to claim 126, wherein said client comprises at least one of the following:
end point software;
end point hardware;
tamper resistant software;
tamper resistant hardware;
client side software; and
client side hardware.
-
128. A method according to claim 126, comprising utilizing said client for at least one of the following:
monitoring of client side storage;
monitoring of client side access;
monitoring of client side usage;
monitoring of client side distribution;
monitoring of copying of information object excerpts;
monitoring of clipboard;
monitoring of at least one application;
monitoring of at least one interface;
control of at least one application;
control of at least one interface;
control of clipboard;
control of copying of information object excerpts;
control of client side storage;
control of client side access;
control of client side usage; and
control of client side distribution.
-
129. A method according to claim 1, comprising utilizing comparing of at least two information objects to calculate pairwise similarity between objects.
-
130. A method according to claim 129, comprising utilizing said pairwise similarity to map said information objects to a space.
-
131. A method according to claim 130, wherein said space is a Euclidean space, and wherein the closeness between any two objects within said Euclidean space is approximately proportional to said pairwise similarity between said information objects.
-
132. A method according to claim 130, wherein said space is a weighted graph, and wherein the weight of an edge between any two objects within said graph space is approximately proportional to said pairwise similarity between said information objects.
-
133. A method according to claim 130, wherein said space is a graph, and wherein the existence of an edge between any two objects within said graph space is dependent on said pairwise similarity between said information objects.
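By way of non-limiting illustration only, and forming no part of the claims, the mapping of information objects to a graph space on the basis of pairwise similarity might be sketched as follows. Jaccard similarity over sets of elementary information units is an assumption of this sketch, as are the function names and the `cutoff` parameter; the source does not name a particular similarity measure.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Pairwise similarity: shared units divided by all distinct units."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_graph(
    objects: Dict[str, Set[str]], cutoff: float = 0.0
) -> Dict[Tuple[str, str], float]:
    """Build a weighted graph keyed by object-name pairs.

    An edge exists only when similarity exceeds the cutoff, and its weight
    equals the similarity.
    """
    edges: Dict[Tuple[str, str], float] = {}
    for name_a, name_b in combinations(sorted(objects), 2):
        weight = jaccard(objects[name_a], objects[name_b])
        if weight > cutoff:
            edges[(name_a, name_b)] = weight
    return edges


objects = {
    "a": {"u1", "u2", "u3"},
    "b": {"u2", "u3", "u4"},
    "c": {"u9"},
}
graph = similarity_graph(objects)
```

In this sketch the edge weights correspond to the weighted-graph variant, while the presence or absence of an edge under a chosen cutoff corresponds to the unweighted variant.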
-
134. A method according to claim 130, wherein said space is utilized to identify at least one similarity information class, wherein said information class consists of at least two information objects, wherein there is provided an information class policy comprising a policy shared by the information class, and wherein said similarity information class is bounded within said space.
-
135. A method according to claim 130, comprising utilizing said space to identify at least one information object substantially similar to an unidentified information object.
-
136. A method according to claim 130, comprising using said space to identify at least one other information object substantially similar to an information object for which policy is not known, thereby to obtain a policy associated with said other information object to use as basis for a policy for said information object.
-
137. A method according to claim 1, comprising storing information about said information object in a database.
-
138. A method according to claim 1, further comprising extracting a descriptor of said information object, based on statistical analysis of said information object.
-
139. A method according to claim 2, comprising storing the order of said elementary information units within said information object in a database.
-
140. A method according to claim 139, comprising using said order for identification of said information object.
-
141. A method according to claim 1, further comprising interfacing at least one of an information management system and a document management system.
-
142. A method according to claim 1, further comprising tracking at least one of the following:
usage patterns;
storage patterns; and
distribution patterns.
-
143. A method according to claim 142, wherein said tracking is carried out to infer information about at least one of the following:
normal usage patterns;
normal storage patterns;
normal distribution patterns;
irregular usage patterns;
irregular storage patterns; and
irregular distribution patterns.
-
144. A method according to claim 143, wherein said inferred information is used to define at least part of a policy.
-
145. A method according to claim 143, comprising using said inferred information for information object classification.
-
146. A method according to claim 1, further comprising logging.
-
147. A method according to claim 146, wherein said logging comprises logging of at least one of the following:
actions;
events; and
information objects identification.
-
148. A method according to claim 146, wherein at least part of said logging is controlled by a policy.
-
149. A method according to claim 146, wherein at least part of said logging is stored in a database.
-
150. A method according to claim 146, comprising utilizing said logging to augment lifecycle information for said information object.
-
151. A method according to claim 1, further comprising assessing the integrity of at least one information object, wherein said integrity assessment consists of comparing said information object with a version of said information object for which integrity is assured.
-
152. A method according to claim 151, further comprising issuing a certificate of said integrity for at least one information object.
-
153. A method according to claim 152, wherein said certificate is a cryptographic certificate.
-
154. A method according to claim 151, further comprising replacing said information object with said version of said information object for which said integrity is assured.
-
155. A method according to claim 151, comprising identifying when said integrity of said information object is not satisfactory, and in such a case not allowing distribution of said information object.
-
156. A method according to claim 151, comprising identifying when said integrity of said information object is not satisfactory, and in such a case not allowing storage of said information object.
-
157. A method according to claim 151, comprising identifying when said integrity of said information object is not satisfactory, and in such a case not allowing usage of said information object.
-
158. A method according to claim 1, further comprising defining at least one constituent information object to be an ignored information object, and wherein, whenever said to be ignored information object is an element of a compound information object, ignoring said object in identification of said compound information object.
-
159. A method according to claim 1, further comprising not allowing usage of respective ones of said information objects outside an organization.
-
160. A method according to claim 1, further comprising not allowing storage of respective ones of said information objects outside an organization.
-
161. A method according to claim 1, further comprising not allowing distribution of respective ones of said information objects outside an organization.
-
162. Apparatus for automatic information identification to enforce an information management policy over a period of time on information objects that are subject to evolution of content contained therein over said period of time, said information objects containing elementary information units, the apparatus being a computerized apparatus comprising:
-
a scanning module for identifying an information object in use subsequent to at least some of said evolution, finding respective elementary information units within said information object, some of said elementary information units not being subject to said evolution, and comparing said found elementary information units with elementary information units of a pre-evolution information object to which an instance of said information management policy was applied, said comparison finding a quantity of elementary information units that match; and
a deduction module for deducing information about the identity of said information object in use by comparing said quantity to a predetermined threshold, and when said threshold is exceeded, applying said policy instance to said information object in use, said managing thereby being applied to said information object in use despite evolution of said content contained in said information object over said time period.
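By way of non-limiting illustration only, and forming no part of the claims, the scanning and deduction steps recited above might be sketched as follows. Treating normalized sentences as the elementary information units is an assumption of this sketch, and all function names are hypothetical.

```python
import re
from typing import Set


def elementary_units(text: str) -> Set[str]:
    """Split text into normalized sentences (an assumed choice of unit)."""
    parts = re.split(r"[.!?]+", text)
    # Lower-case and collapse whitespace so trivial edits do not break a match.
    return {" ".join(p.lower().split()) for p in parts if p.strip()}


def matching_quantity(in_use: str, pre_evolution: str) -> int:
    """Scanning step: count the units found in both objects."""
    return len(elementary_units(in_use) & elementary_units(pre_evolution))


def policy_applies(in_use: str, pre_evolution: str, threshold: int) -> bool:
    """Deduction step: the policy applies when the quantity exceeds the threshold."""
    return matching_quantity(in_use, pre_evolution) > threshold


registered = "Revenue rose 12%. The merger closes in May. Staff will be notified."
edited = "DRAFT v2. The merger closes in May. Staff will be notified! Budget is unchanged."
```

Here two of the three registered sentences survive the edit, so a threshold of 1 is exceeded while a threshold of 2 is not.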
-
163. Apparatus according to claim 162, wherein said information objects comprise at least one simple information object, said simple information object comprising one of the following:
an elementary information unit;
a set of elementary information units; and
an ordered set of elementary information units.
-
164. Apparatus according to claim 162, wherein said elementary information units comprise at least one of the following:
a sentence;
a sequence of words;
a word;
a sequence of characters;
a character;
a sequence of numbers;
a number;
a sequence of digits;
a digit;
a vector;
a curve;
a pixel;
a block of pixels;
an audio frame;
a musical note;
a musical bar;
a visual object;
a sequence of video frames;
a sequence of musical notes;
a sequence of musical bars; and
a video frame.
-
165. Apparatus according to claim 162, wherein said deduction module is further configured to assign elementary information unit identifiers to elementary information units after identification.
-
166. Apparatus according to claim 162, wherein said deduction module is further configured to utilize said elementary information unit identifiers in said deducing.
-
167. Apparatus according to claim 165, wherein said deduction module is configured to provide said elementary information unit identifiers in a manner determined at least partly by the content of said elementary information units which they are assigned to.
-
168. Apparatus according to claim 167, wherein said elementary information unit identifiers are solely determined by said content.
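By way of non-limiting illustration only, and forming no part of the claims, an identifier determined solely by the content of an elementary information unit might be obtained from a cryptographic hash of that content. The use of SHA-256, the case and whitespace normalization, and the function name `unit_identifier` are assumptions of this sketch.

```python
import hashlib


def unit_identifier(unit: str) -> str:
    """Derive an identifier from the unit's content alone.

    Assumed normalization: lower-case and collapse whitespace, so that
    formatting differences do not change the identifier.
    """
    normalized = " ".join(unit.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

Because the identifier depends only on content, the same unit receives the same identifier wherever it appears, which allows units to be compared without storing the raw text.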
-
169. Apparatus according to claim 165, wherein said deduction module is configured to provide said elementary information unit identifiers in a manner at least partly determined by locations within an information object of respective elementary information units to which they are assigned.
-
170. Apparatus according to claim 162, wherein said information object identification is carried out on an instance of said information object, said information object instance being said information object in a specific format.
-
171. Apparatus according to claim 162 further comprising a policy attachment unit associated with said deduction module, said policy attachment unit being configured to use said deducing to attach to said information object an information object policy, said policy comprising at least one of the following:
an allowed distribution of said information object;
a restriction on distribution of said information object;
an allowed storage of said information object;
a restriction on storage of said information object;
an action to be taken as a reaction to an event;
an allowed usage of said information object; and
a restriction on usage of said information object.
-
172. Apparatus according to claim 162, wherein said deducing comprises utilizing conditional probabilities for at least one of the following:
identification of information objects;
classification of information objects; and
identification of a knowledge domain of information objects.
-
Current Assignee: Forcepoint LLC (Francisco Partners Management LLC)
-
Original Assignee: PortAuthority Technologies, Inc. (Francisco Partners Management LLC)
-
Inventors: Peled, Ariel; Grindlinger, Yair; Troyansky, Lidror; Carny, Ofir; Baratz, Arik
-
Primary Examiner(s): LIN, KENNY S
-
Application Number: US10/533,452
Time in Patent Office: 3,360 Days
Field of Search: 709/200, 709/224, 707/600, 707/705
US Class Current: 709/200
CPC Class Codes:
G06F 21/1078 Logging; Metering
G06F 21/16 Program or content traceabi...
G06F 21/554 involving event detection a...
G06F 21/60 Protecting data
G06Q 10/0637 Strategic management or ana...
G06Q 10/10 Office automation; Time man...
G06Q 50/265 Personal security, identity...
H04L 63/0245 Filtering by information in...
H04L 63/0281 Proxies
H04L 63/10 for controlling access to d...
H04L 63/20 for managing network securi...