Techniques for controlling distribution of information from a secure domain
Abstract
Techniques for controlling distribution of information from a secure domain by automatically detecting outgoing messages which violate security policies corresponding to the secure domain. Semantic models are constructed for one or more message categories and for the outgoing messages. The semantic model of an outgoing message is compared with the semantic models of the message categories to determine a degree of similarity between the semantic models. The outgoing message is classified based on the degree of similarity obtained from the comparison. A determination is made, based on the classification of the outgoing message, if distribution of the outgoing message would violate a security policy for the secure domain. Distribution of the outgoing message is allowed if no security policy is violated.
55 Claims
1. A computer-implemented method of controlling distribution of a message from a sender to a recipient, the method comprising:
constructing semantic models for a plurality of message categories;
constructing a semantic model for the message;
comparing the semantic model of the message with the semantic models for the plurality of message categories;
classifying the message based on the comparison; and
determining if the message can be distributed to the recipient based on the classification of the message.
2. The method of claim 1 wherein constructing the semantic models for the plurality of message categories comprises:
receiving descriptions for the plurality of message categories;
receiving text segments from a natural language processing information retrieval system based on the descriptions;
extracting knowledge representations from the text segments; and
constructing the semantic models for the plurality of message categories using the knowledge representations.
3. The method of claim 2 wherein the semantic models for the plurality of message categories include concept-relation-concept (CRC) triples and relation-concept (RC) tuples.
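As an illustration only, the CRC triples and RC tuples named in claim 3 could be held in small immutable records. The claims do not define the internal layout of either structure, so the field names, the SemanticModel container, and the rule that derives one RC tuple from each CRC triple are all assumptions made for this sketch.

```python
from typing import NamedTuple


class CRC(NamedTuple):
    """Concept-relation-concept triple (field names are hypothetical)."""
    head: str
    relation: str
    tail: str


class RC(NamedTuple):
    """Relation-concept tuple (field names are hypothetical)."""
    relation: str
    concept: str


class SemanticModel(NamedTuple):
    """A semantic model held as two sets of knowledge representations."""
    crc: frozenset
    rc: frozenset


def build_model(triples):
    """Collect CRC triples and derive one RC tuple from each of them.

    Deriving RC tuples by dropping the first concept is an assumption of
    this sketch, not something the claims state.
    """
    crc = frozenset(CRC(*t) for t in triples)
    rc = frozenset(RC(t.relation, t.tail) for t in crc)
    return SemanticModel(crc, rc)


model = build_model([
    ("merger", "AGENT", "acme corp"),
    ("merger", "TIME", "third quarter"),
])
```

Keeping both structures as sets makes the later comparison step a matter of set overlap, which is one possible design among several.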
4. The method of claim 2 wherein receiving the text segments from the natural language processing information retrieval system based on the descriptions comprises:
extracting concepts for each message category from the descriptions;
submitting the concepts as queries to a natural language processing information retrieval system; and
receiving the text segments from the natural language processing information retrieval system relevant to the queries.
5. The method of claim 2 wherein extracting the concepts for each message category comprises using a lexical database to expand the concepts extracted from the descriptions.
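The concept extraction and expansion steps of claims 4 and 5 might look roughly like the following. This is a minimal sketch: TOY_LEXICON is a hypothetical stand-in for a real lexical database, the stopword list and the word-level tokenizer are assumptions, and the resulting query terms would still have to be submitted to a retrieval system that is not modeled here.

```python
import re

# Hypothetical stand-in for a lexical database; a real system would query
# a much larger resource.
TOY_LEXICON = {
    "merger": {"acquisition", "takeover"},
    "salary": {"pay", "compensation"},
}
STOPWORDS = {"a", "an", "and", "about", "the", "of", "or", "for", "to", "in"}


def extract_concepts(description):
    """Naive concept extraction: lowercase words minus stopwords."""
    words = re.findall(r"[a-z]+", description.lower())
    return {w for w in words if w not in STOPWORDS}


def expand_concepts(concepts, lexicon=TOY_LEXICON):
    """Add related terms from the lexicon to the extracted concepts."""
    expanded = set(concepts)
    for concept in concepts:
        expanded |= lexicon.get(concept, set())
    return expanded


def build_queries(description):
    """Return sorted query terms for a category description."""
    return sorted(expand_concepts(extract_concepts(description)))


queries = build_queries("Messages about the merger")
# queries == ['acquisition', 'merger', 'messages', 'takeover']
```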
6. The method of claim 2 wherein the natural language processing information retrieval system has access to a corpus of documents relevant to the plurality of message categories.
7. The method of claim 1 wherein constructing the semantic model for the message comprises:
parsing the message to extract meta-information and text information from the message;
extracting knowledge representations from the text information and the meta-information; and
constructing the semantic model for the message using the knowledge representations.
8. The method of claim 7 wherein the meta-information includes information about the sender, information about the recipient, security information for the sender, and security information for the recipient.
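For an e-mail message, the parsing step of claims 7 and 8 could be sketched with Python's standard email package. The claims do not limit the message to e-mail, so that choice is an assumption, as are the CLEARANCE_DIRECTORY lookup table and the clearance labels in it. The sketch handles only single-part plain-text messages.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical directory mapping addresses to clearance labels.
CLEARANCE_DIRECTORY = {
    "alice@example.com": "secret",
    "bob@partner.example": "public",
}


def parse_message(raw):
    """Split a raw message into meta-information and text information."""
    msg = message_from_string(raw)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    recipient = parseaddr(msg.get("To", ""))[1].lower()
    meta = {
        "sender": sender,
        "recipient": recipient,
        # Security information is looked up, not read from the message.
        "sender_clearance": CLEARANCE_DIRECTORY.get(sender),
        "recipient_clearance": CLEARANCE_DIRECTORY.get(recipient),
    }
    text = msg.get_payload()  # a str for single-part messages
    return meta, text


raw = (
    "From: Alice <alice@example.com>\n"
    "To: bob@partner.example\n"
    "Subject: Q3\n"
    "\n"
    "The merger closes in the third quarter.\n"
)
meta, text = parse_message(raw)
```

Knowledge representations would then be extracted from both the returned text and the meta dictionary, as claim 7 recites.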
9. The method of claim 7 wherein the semantic model for the message includes concept-relation-concept (CRC) triples and relation-concept (RC) tuples.
10. The method of claim 1 wherein comparing the semantic model of the message with the semantic models for the plurality of message categories comprises determining a degree of similarity between the semantic model of the message and the semantic model for each message category in the plurality of message categories.
11. The method of claim 10 wherein classifying the message based on the comparison comprises:
providing a threshold degree of similarity; and
classifying the message as belonging to a message category if the degree of similarity between the semantic model of the message and the semantic model of the message category exceeds the threshold degree of similarity.
12. The method of claim 11 wherein the degree of similarity is user-defined.
13. The method of claim 11 wherein providing the threshold degree of similarity comprises providing degree of similarity thresholds for each message category of the plurality of message categories.
14. The method of claim 11 wherein providing the threshold degree of similarity comprises providing a single degree of similarity threshold for the plurality of message categories.
15. The method of claim 10 wherein classifying the message based on the comparison comprises:
providing a threshold degree of similarity; and
classifying the message as not belonging to any message category if the degree of similarity between the semantic model of the message and the semantic models of the plurality of message categories is lower than the threshold degree of similarity.
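The comparison and threshold logic of claims 10 through 15 can be sketched as follows. The claims do not say how the degree of similarity is computed; Jaccard overlap between sets of features is swapped in here purely as an assumption, and the function names, the example categories, and the threshold values are hypothetical. Per-category thresholds (claim 13) are passed as a dictionary, while a single threshold (claim 14) is expressed by leaving the dictionary empty and setting default_threshold.

```python
def jaccard(a, b):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def classify(message_model, category_models, thresholds, default_threshold=0.5):
    """Return (matched_categories, scores).

    A category matches when its similarity exceeds its threshold. An empty
    list means the message was not placed in any category.
    """
    scores = {
        name: jaccard(message_model, model)
        for name, model in category_models.items()
    }
    matched = [
        name for name, score in scores.items()
        if score > thresholds.get(name, default_threshold)
    ]
    return matched, scores


category_models = {
    "finance": {"merger", "acquisition", "earnings"},
    "personnel": {"salary", "hiring"},
}

# Per-category thresholds.
matched, scores = classify(
    {"merger", "earnings", "lunch"},
    category_models,
    {"finance": 0.4, "personnel": 0.3},
)

# Single threshold shared by every category.
unmatched, _ = classify({"lunch"}, category_models, {}, default_threshold=0.5)
```

In this sketch a score exactly equal to the threshold does not match, since claim 11 speaks of exceeding the threshold. A message that matches no category would be passed on for manual classification.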
16. The method of claim 15 further comprising:
providing a graphical user interface;
displaying information about the message on the graphical user interface; and
manually classifying the message as belonging to a message category using the graphical user interface.
17. The method of claim 16 wherein displaying the information about the message on the graphical user interface comprises:
displaying sections of the message which showed some similarity with the message categories;
displaying the message categories with which the message showed some similarities; and
displaying reasons for the similarities.
18. The method of claim 16 further comprising:
forwarding the manually classified message and the manual classification information to a machine learning module; and
updating the semantic model of the message category to which the message was manually classified based on the manual classification information.
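One simple way to picture the feedback loop of claim 18 is a category model that keeps a count for each feature and folds every manually classified message into those counts. The claims do not describe the learning method, so this counting scheme, the CategoryModel class, and the min_count filter are assumptions of the sketch and not the claimed machine learning module.

```python
from collections import Counter


class CategoryModel:
    """Category model kept as counts of features seen in its examples."""

    def __init__(self, features=()):
        self.counts = Counter(features)

    def update(self, message_features):
        """Fold one manually classified message into the model."""
        self.counts.update(set(message_features))

    def features(self, min_count=1):
        """Features seen in at least min_count examples."""
        return {f for f, n in self.counts.items() if n >= min_count}


def apply_manual_classification(models, category, message_features):
    """Update the model of the category a reviewer chose for the message."""
    models.setdefault(category, CategoryModel()).update(message_features)
    return models[category]


models = {"finance": CategoryModel({"merger", "earnings"})}
updated = apply_manual_classification(models, "finance", {"merger", "buyout"})
```

After the update, "buyout" becomes part of the finance model, so a later message that mentions it scores higher against that category.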
19. The method of claim 1 wherein determining if the message can be routed to the recipient comprises:
providing a security policy;
determining if the message violates the security policy based on the classification of the message; and
permitting distribution of the message to the recipient if the security policy is not violated.
20. The method of claim 19 wherein determining if the message violates the security policy comprises:
determining a security clearance level for the sender;
determining a security clearance level for the message category to which the message was classified; and
indicating that the message violates the security policy if the security clearance level of the sender is lower than the security clearance level of the message category.
21. The method of claim 19 wherein determining if the message violates the security policy comprises:
determining a security clearance level for the recipient;
determining a security clearance level for the message category to which the message was classified; and
indicating that the message violates the security policy if the security clearance level of the recipient is lower than the security clearance level of the message category.
22. The method of claim 1 wherein determining if the message can be routed to the recipient comprises:
providing a security policy;
determining if the message violates the security policy based on the classification of the message; and
prohibiting distribution of the message to the recipient if the security policy is violated.
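The policy check of claims 19 through 22 can be sketched as a comparison of ordered clearance levels. The three level names, the CATEGORY_CLEARANCE table, and the function names are hypothetical. Claims 20 and 21 recite the sender check and the recipient check separately; the sketch applies both together, which is closer to the combined wording of claim 54.

```python
from enum import IntEnum


class Clearance(IntEnum):
    """Ordered clearance levels (names are hypothetical)."""
    PUBLIC = 0
    CONFIDENTIAL = 1
    SECRET = 2


# Hypothetical policy table: clearance required for each message category.
CATEGORY_CLEARANCE = {
    "finance": Clearance.SECRET,
    "newsletter": Clearance.PUBLIC,
}


def violates_policy(sender, recipient, categories):
    """True if sender or recipient is cleared below any matched category."""
    for category in categories:
        required = CATEGORY_CLEARANCE[category]
        if sender < required or recipient < required:
            return True
    return False


def may_distribute(sender, recipient, categories):
    """Permit distribution only when no security policy is violated."""
    return not violates_policy(sender, recipient, categories)


allowed = may_distribute(Clearance.SECRET, Clearance.SECRET, ["finance"])
blocked = may_distribute(Clearance.SECRET, Clearance.PUBLIC, ["finance"])
```

Here allowed is True and blocked is False. An empty category list passes the check in this sketch; a deployment following claims 15 and 16 would route such a message to manual classification before it reaches this step.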
23. A computer program product for controlling distribution of a message from a sender to a recipient, the computer program product comprising:
code for constructing semantic models for a plurality of message categories;
code for constructing a semantic model for the message;
code for comparing the semantic model of the message with the semantic models for the plurality of message categories;
code for classifying the message based on the comparison;
code for determining if the message can be distributed to the recipient based on the classification of the message; and
a computer-readable medium for storing the codes.
24. The computer program product of claim 23 wherein the code for constructing the semantic models for the plurality of message categories comprises:
code for receiving descriptions for the plurality of message categories;
code for receiving text segments from a natural language processing information retrieval system based on the descriptions;
code for extracting knowledge representations from the text segments; and
code for constructing the semantic models for the plurality of message categories using the knowledge representations.
25. The computer program product of claim 24 wherein the semantic models for the plurality of message categories include concept-relation-concept (CRC) triples and relation-concept (RC) tuples.
26. The computer program product of claim 24 wherein the code for receiving the text segments from the natural language processing information retrieval system based on the descriptions comprises:
code for extracting concepts for each message category from the descriptions;
code for submitting the concepts as queries to a natural language processing information retrieval system; and
code for receiving the text segments from the natural language processing information retrieval system relevant to the queries.
27. The computer program product of claim 24 wherein the code for extracting the concepts for each message category comprises code for using a lexical database to expand the concepts extracted from the descriptions.
28. The computer program product of claim 24 wherein the natural language processing information retrieval system has access to a corpus of documents relevant to the plurality of message categories.
29. The computer program product of claim 23 wherein the code for constructing the semantic model for the message comprises:
code for parsing the message to extract meta-information and text information from the message;
code for extracting knowledge representations from the text information and the meta-information; and
code for constructing the semantic model for the message using the knowledge representations.
30. The computer program product of claim 29 wherein the meta-information includes information about the sender, information about the recipient, security information for the sender, and security information for the recipient.
31. The computer program product of claim 29 wherein the semantic model for the message includes concept-relation-concept (CRC) triples and relation-concept (RC) tuples.
32. The computer program product of claim 23 wherein the code for comparing the semantic model of the message with the semantic models for the plurality of message categories comprises code for determining a degree of similarity between the semantic model of the message and the semantic model for each message category in the plurality of message categories.
33. The computer program product of claim 32 wherein the code for classifying the message based on the comparison comprises:
code for providing a threshold degree of similarity; and
code for classifying the message as belonging to a message category if the degree of similarity between the semantic model of the message and the semantic model of the message category exceeds the threshold degree of similarity.
34. The computer program product of claim 33 wherein the degree of similarity is user-defined.
35. The computer program product of claim 33 wherein the code for providing the threshold degree of similarity comprises code for providing degree of similarity thresholds for each message category of the plurality of message categories.
36. The computer program product of claim 33 wherein the code for providing the threshold degree of similarity comprises code for providing a single degree of similarity threshold for the plurality of message categories.
37. The computer program product of claim 32 wherein the code for classifying the message based on the comparison comprises:
code for providing a threshold degree of similarity; and
code for classifying the message as not belonging to any message category if the degree of similarity between the semantic model of the message and the semantic models of the plurality of message categories is lower than the threshold degree of similarity.
38. The computer program product of claim 37 further comprising:
code for providing a graphical user interface;
code for displaying information about the message on the graphical user interface; and
code for manually classifying the message as belonging to a message category using the graphical user interface.
39. The computer program product of claim 38 wherein the code for displaying the information about the message on the graphical user interface comprises:
code for displaying sections of the message which showed some similarity with the message categories;
code for displaying the message categories with which the message showed some similarities; and
code for displaying reasons for the similarities.
40. The computer program product of claim 38 further comprising:
code for forwarding the manually classified message and the manual classification information to a machine learning module; and
code for updating the semantic model of the message category to which the message was manually classified based on the manual classification information.
41. The computer program product of claim 23 wherein the code for determining if the message can be routed to the recipient comprises:
code for providing a security policy;
code for determining if the message violates the security policy based on the classification of the message; and
code for permitting distribution of the message to the recipient if the security policy is not violated.
42. The computer program product of claim 41 wherein the code for determining if the message violates the security policy comprises:
code for determining a security clearance level for the sender;
code for determining a security clearance level for the message category to which the message was classified; and
code for indicating that the message violates the security policy if the security clearance level of the sender is lower than the security clearance level of the message category.
43. The computer program product of claim 41 wherein the code for determining if the message violates the security policy comprises:
code for determining a security clearance level for the recipient;
code for determining a security clearance level for the message category to which the message was classified; and
code for indicating that the message violates the security policy if the security clearance level of the recipient is lower than the security clearance level of the message category.
44. The computer program product of claim 23 wherein the code for determining if the message can be routed to the recipient comprises:
code for providing a security policy;
code for determining if the message violates the security policy based on the classification of the message; and
code for prohibiting distribution of the message to the recipient if the security policy is violated.
45. A system for controlling distribution of a message from a sender to a recipient, the system comprising:
a processor;
a memory coupled to the processor, the memory configured to store a plurality of modules for execution by the processor, the modules including:
a first module for constructing semantic models for a plurality of message categories;
a second module for constructing a semantic model for the message;
a comparator module for comparing the semantic model of the message with the semantic models for the plurality of message categories;
a classifier module for classifying the message based on the comparison; and
a security module for determining if the message can be routed to the recipient based on the classification of the message.
46. A system for controlling distribution of a message from a sender to a recipient, the system comprising:
a batch processing subsystem configured to construct semantic models for a plurality of message categories; and
a real-time processing subsystem configured to construct a semantic model for the message, the real-time processing subsystem including:
a comparator subsystem configured to compare the semantic model of the message with the semantic models for the plurality of message categories;
a classifier subsystem configured to classify the message based on the comparison; and
a security subsystem configured to determine if the message can be routed to the recipient based on the classification of the message.
47. The system of claim 46 wherein the batch processing subsystem comprises:
a parser and semantic tagger configured to receive descriptions for the plurality of message categories and to extract concepts for each message category from the descriptions;
a natural language processing information retrieval system configured to receive the concepts as queries and to generate text segments relevant to the queries, the text segments being extracted from a document collection;
a knowledge extraction subsystem configured to extract knowledge representations from the text segments and to construct the semantic models for the plurality of message categories using the knowledge representations.
48. The system of claim 46 wherein the real-time processing subsystem further comprises:
an information interpreter configured to parse the message and extract meta-information and text information from the message;
a knowledge extraction system configured to extract knowledge representations from the text information and the meta-information, and to construct the semantic model for the message using the knowledge representations.
49. The system of claim 46 wherein:
the comparator subsystem is configured to determine a degree of similarity between the semantic model of the message and the semantic models for each message category in the plurality of message categories; and
the classifier subsystem is configured to classify the message as belonging to a message category if the degree of similarity between the semantic model of the message and the semantic model of the message category exceeds a threshold degree of similarity.
50. The system of claim 46 wherein:
the comparator subsystem is configured to determine a degree of similarity between the semantic model of the message and the semantic model for each message category in the plurality of message categories; and
the classifier subsystem is configured to classify the message as not belonging to any of the message categories in the plurality of message categories if the degree of similarity between the semantic model of the message and the semantic models of the plurality of message categories is lower than a threshold degree of similarity.
51. The system of claim 50 further comprising a user interface module configured to display information about the message and to allow manual classification of the message as belonging to a message category from the plurality of message categories.
52. The system of claim 51 further comprising a machine learning subsystem configured to receive the manually classified message and to update the semantic model of the message category to which the message was manually classified.
53. The system of claim 46 wherein the security subsystem is further configured to determine if the message violates a security policy, and to permit distribution of the message to the recipient if the security policy is not violated.
54. The system of claim 53 wherein the security subsystem determines if the message violates a security policy by determining a security clearance level for the sender, a security clearance level for the recipient, and a security clearance level for the message category to which the message was classified, and indicating that the message violates the security policy if the security clearance level of the sender or recipient is lower than the security clearance level of the message category.
55. The system of claim 46 wherein the security subsystem is further configured to determine if the message violates a security policy, and to prohibit distribution of the message to the recipient if the security policy is violated.
Specification