DATA LEAK PREVENTION ENFORCEMENT BASED ON LEARNED DOCUMENT CLASSIFICATION

US 20150254469A1
Filed: 03/07/2014
Published: 09/10/2015
Est. Priority Date: 03/07/2014
Status: Active Grant

Abstract

The present disclosure relates generally to the field of automatically learning and automatically adapting to perform classification of protected data. In various examples, learning and adapting to perform classification of protected data may be implemented in the form of systems, methods and/or algorithms.

20 Claims

1. An automated method for data leak prevention, the method comprising:
- obtaining, automatically by a processor, a plurality of training documents, each of the training documents including at least respective content and respective metadata;
  
  generating a classification model, automatically by the processor, wherein the classification model is generated, based at least in part upon the content and metadata of each of the training documents;
  
  obtaining, automatically by the processor, at least one non-training document, wherein the non-training document includes at least respective content;
  
  applying to the non-training document, automatically by the processor, the classification model in order to classify the non-training document into one of at least two categories;
  
  monitoring, automatically by the processor, for attempted access to the non-training document; and
  
  taking action, automatically by the processor, when the monitoring determines the existence of the attempted access to the non-training document;
  
  wherein the action that is taken is based upon the category into which the non-training document to which access is attempted has been classified; and
  
  wherein the action that is taken comprises one of:
  
  (a) denying access to the non-training document to which access is attempted;
  
  (b) logging the attempted access to the non-training document to which access is attempted; and
  
  (c) a combination thereof.
- Dependent claims (2, 3, 4, 5, 6, 7, 8):
  - 2. The method of claim 1, wherein the training documents are obtained from a document management system.
  - 3. The method of claim 1, wherein the non-training document is obtained from a document management system.
  - 4. The method of claim 1, further comprising generating, automatically by the processor, an enforcement policy, wherein the enforcement policy specifies the at least one action to be taken when the attempt is made to access a document having a predetermined category.
  - 5. The method of claim 1, wherein the action that is taken comprises permitting access to the non-training document to which access is attempted.
  - 6. The method of claim 5, wherein the action that is taken is permitting and logging access to the non-training document to which access is attempted.
  - 7. The method of claim 4, wherein the enforcement policy is enforced on at least one of:
    - (a) an email component;
      
      (b) an end user device component;
      
      (c) a web component;
      
      (d) a network component; and
      
      (e) a combination thereof.
  - 8. The method of claim 7, wherein:
    - the end user device component comprises at least one of:
      
      (a) a desktop computer;
      
      (b) a laptop computer;
      
      (c) a tablet;
      
      (d) a smartphone; and
      
      (e) a combination thereof.
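
The steps recited in claim 1 can be illustrated with a minimal sketch. The claims do not name a learning algorithm, so the multinomial naive Bayes classifier below is only an illustrative stand-in. The `Document` class, the category names `protected` and `public`, the per-document training labels, and the `on_access_attempt` hook are all hypothetical names and assumptions introduced for this example.

```python
import math
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class Document:
    content: str
    metadata: dict = field(default_factory=dict)


def features(doc):
    """Tokens drawn from both the content and the metadata of a document."""
    tokens = doc.content.lower().split()
    tokens += [f"{key}={value}".lower() for key, value in sorted(doc.metadata.items())]
    return tokens


class NaiveBayesModel:
    """Multinomial naive Bayes with add-one smoothing (an illustrative choice)."""

    def __init__(self, labeled_docs):
        # Assumption: each training document arrives with a known category label.
        self.token_counts = defaultdict(Counter)
        self.doc_counts = Counter()
        for doc, category in labeled_docs:
            self.doc_counts[category] += 1
            self.token_counts[category].update(features(doc))
        self.vocab = {tok for counts in self.token_counts.values() for tok in counts}

    def classify(self, doc):
        total_docs = sum(self.doc_counts.values())

        def score(category):
            counts = self.token_counts[category]
            denom = sum(counts.values()) + len(self.vocab)
            log_prob = math.log(self.doc_counts[category] / total_docs)
            for tok in features(doc):
                log_prob += math.log((counts[tok] + 1) / denom)
            return log_prob

        return max(self.doc_counts, key=score)


def on_access_attempt(doc_id, category, access_log):
    """Deny and log access to protected documents; permit everything else."""
    if category == "protected":
        access_log.append((doc_id, "denied"))
        return False
    return True


training = [
    (Document("quarterly revenue forecast confidential", {"department": "finance"}), "protected"),
    (Document("merger plan confidential draft", {"department": "legal"}), "protected"),
    (Document("cafeteria lunch menu", {"department": "facilities"}), "public"),
    (Document("office holiday party schedule", {"department": "facilities"}), "public"),
]
model = NaiveBayesModel(training)

# A non-training document needs only content; metadata is optional here.
incoming = Document("confidential revenue draft")
category = model.classify(incoming)
access_log = []
allowed = on_access_attempt("doc-42", category, access_log)
```

In this sketch the monitoring step is reduced to a single function call per access attempt. A real deployment would hook that call into whatever component intercepts the access.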

9. A computer readable storage medium, tangibly embodying a program of instructions executable by the computer for automated data leak prevention, the program of instructions, when executing, performing the following steps:
- obtaining automatically a plurality of training documents, each of the training documents including at least respective content and respective metadata;
  
  generating automatically a classification model, wherein the classification model is generated, based at least in part upon the content and metadata of each of the training documents;
  
  obtaining automatically at least one non-training document, wherein the non-training document includes at least respective content;
  
  applying automatically to the non-training document the classification model in order to classify the non-training document into one of at least two categories;
  
  monitoring automatically for attempted access to the non-training document; and
  
  taking action automatically when the monitoring determines the existence of the attempted access to the non-training document;
  
  wherein the action that is taken is based upon the category into which the non-training document to which access is attempted has been classified; and
  
  wherein the action that is taken comprises one of:
  
  (a) denying access to the non-training document to which access is attempted;
  
  (b) logging the attempted access to the non-training document to which access is attempted; and
  
  (c) a combination thereof.
- Dependent claims (10, 11, 12, 13, 14):
  - 10. The computer readable storage medium of claim 9, wherein the program of instructions, when executing, further performs:
    - generating automatically an enforcement policy, wherein the enforcement policy specifies the at least one action to be taken when the attempt is made to access a document having a predetermined category.
  - 11. The computer readable storage medium of claim 9, wherein the action that is taken comprises permitting access to the non-training document to which access is attempted.
  - 12. The computer readable storage medium of claim 9, wherein the action that is taken is permitting and logging access to the non-training document to which access is attempted.
  - 13. The computer readable storage medium of claim 10, wherein the enforcement policy is enforced on at least one of:
    - (a) an email component;
      
      (b) an end user device component;
      
      (c) a web component;
      
      (d) a network component; and
      
      (e) a combination thereof.
  - 14. The computer readable storage medium of claim 13, wherein:
    - the end user device component comprises at least one of:
      
      (a) a desktop computer;
      
      (b) a laptop computer;
      
      (c) a tablet;
      
      (d) a smartphone; and
      
      (e) a combination thereof.
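
The enforcement policy recited in claims 4 and 10, and the enforcement points listed in claims 7 and 13, could be represented as a mapping from category to a set of actions. The sketch below is one possible shape under stated assumptions, not the patented implementation: the `Action` flags, the component identifiers in `COMPONENTS`, and the functions `generate_policy` and `enforce` are hypothetical names chosen for illustration.

```python
from enum import Flag, auto


class Action(Flag):
    PERMIT = auto()
    DENY = auto()
    LOG = auto()


# Hypothetical identifiers for the enforcement points named in the claims.
COMPONENTS = {"email", "end_user_device", "web", "network"}


def generate_policy(categories, protected_categories):
    """Map each category to the actions taken when access is attempted."""
    policy = {}
    for category in categories:
        if category in protected_categories:
            policy[category] = Action.DENY | Action.LOG
        else:
            # Permitting and logging together corresponds to claims 6 and 12.
            policy[category] = Action.PERMIT | Action.LOG
    return policy


def enforce(policy, category, component, audit_log):
    """Apply the policy at one component; return True if access is allowed."""
    if component not in COMPONENTS:
        raise ValueError(f"unknown component: {component}")
    actions = policy[category]
    if Action.LOG in actions:
        audit_log.append((component, category))
    return Action.DENY not in actions
```

Using a flag enumeration lets a single policy entry express a combination of actions, such as denying and logging, without a separate entry for each combination.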

15. A computer-implemented system for automatic data leak prevention, the system comprising:
- a first obtaining element configured to obtain automatically a plurality of training documents, each of the training documents including at least respective content and respective metadata;
  
  a first generating element configured to generate automatically a classification model, wherein the classification model is generated, based at least in part upon the content and metadata of each of the training documents;
  
  a second obtaining element configured to obtain automatically at least one non-training document, wherein the non-training document includes at least respective content;
  
  an applying element configured to apply automatically to the non-training document the classification model in order to classify the non-training document into one of at least two categories;
  
  a monitoring element configured to monitor automatically for attempted access to the non-training document; and
  
  a taking action element configured to take action automatically when the monitoring determines the existence of the attempted access to the non-training document;
  
  wherein the action that is taken is based upon the category into which the non-training document to which access is attempted has been classified; and
  
  wherein the action that is taken comprises one of:
  
  (a) denying access to the non-training document to which access is attempted;
  
  (b) logging the attempted access to the non-training document to which access is attempted; and
  
  (c) a combination thereof.
- Dependent claims (16, 17, 18, 19, 20):
  - 16. The system of claim 15, further comprising:
    - a second generating element configured to generate automatically an enforcement policy, wherein the enforcement policy specifies the at least one action to be taken when the attempt is made to access a document having a predetermined category.
  - 17. The system of claim 15, wherein the action that is taken comprises permitting access to the non-training document to which access is attempted.
  - 18. The system of claim 15, wherein the action that is taken is permitting and logging access to the non-training document to which access is attempted.
  - 19. The system of claim 16, wherein the enforcement policy is enforced on at least one of:
    - (a) an email component;
      
      (b) an end user device component;
      
      (c) a web component;
      
      (d) a network component; and
      
      (e) a combination thereof.
  - 20. The system of claim 15, further comprising an output element configured to output the category into which the non-training document is classified.

Current Assignee: Kyndryl Incorporated (Kyndryl Holdings, Inc.)
Original Assignee: International Business Machines Corporation
Inventors: Butler, Anthony M.
Granted Patent: US 9,626,528 B2
US Class Current: 1/1

CPC Class Codes

G06F 21/6218   to a system of files or obj...

G06N 20/00   Machine learning

G06N 5/025   Extracting rules from data