Intelligent security management

US 10,320,819 B2
Filed: 02/27/2017
Issued: 06/11/2019
Est. Priority Date: 02/27/2017
Status: Active Grant

Abstract

A corpus of documents (and other data objects) stored for an entity can be analyzed to determine one or more topics for each document. Elements of the documents can be analyzed to also assign a risk score. The types of topics and security elements, and the associated risk scores, can be learned and adapted over time using, for example, a topic model and random forest regressor. Activity with respect to the documents is monitored, and expected behavior for a user determined using a trained recurrent neural network. Ongoing user activity is processed to determine whether the activity excessively deviates from the expected user activity. The activity can also be compared against the activity of user peers to determine whether the activity is also anomalous among the user peer group. For anomalous activity, risk scores of the accessed documents can be analyzed to determine whether to generate an alert.
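
The final decision described in the abstract combines two conditions: the activity must be anomalous, and the documents involved must be risky enough. The following Python sketch illustrates that gate under strong simplifying assumptions. A plain z-score over past access counts stands in for the trained recurrent neural network, and the names `deviation_score`, `should_alert`, `max_deviation`, and `alert_threshold`, together with their default values, are hypothetical and do not come from the patent.

```python
from statistics import mean, pstdev


def deviation_score(history, observed):
    """Z-score of an observed access count against historical counts.

    Assumption: this simple statistic is a stand-in for the expected-activity
    model, which the patent describes as a trained recurrent neural network.
    """
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0 if observed == mu else float("inf")
    return abs(observed - mu) / sigma


def should_alert(history, observed, doc_risk_scores,
                 max_deviation=3.0, alert_threshold=7.0):
    """Alert only when activity is anomalous AND a touched document is risky.

    Both thresholds are illustrative values, not values from the patent.
    """
    anomalous = deviation_score(history, observed) > max_deviation
    risky = max(doc_risk_scores, default=0.0) >= alert_threshold
    return anomalous and risky
```

With this gate, a burst of access to low-risk documents does not raise an alert, and neither does routine access to high-risk documents.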

19 Claims

1. A computer-implemented method, comprising:
- training a topic model using a set of training documents, each training document of the set having at least one identified topic and an assigned risk score;
  
  training a random forest regressor using the set of training documents;
  
  crawling a plurality of documents, stored for an entity across an electronic resource environment, to index the plurality of documents;
  
  determining, using at least the topic model, one or more topics for each document of the plurality of documents;
  
  determining, using at least the random forest regressor, a risk score for each document of the plurality of documents;
  
  training a recurrent neural network using historical activity with respect to the plurality of documents in the electronic resource environment;
  
  determining, using the recurrent neural network, an expected activity of a specified user with respect to the plurality of documents over at least one determined period of time;
  
  detecting user activity with respect to at least a specified document of the plurality of documents, the user activity associated with the specified user;
  
  processing the activity using the recurrent neural network to determine whether the user activity deviates from the expected type of activity, the determination further based at least in part upon at least one topic determined for the specified document; and
  
  generating a security alert if the user activity is determined to deviate unacceptably from the expected activity and a risk score for at least one of the user activity or the specified document at least meets an alert threshold.
  - 2. The computer-implemented method of claim 1, further comprising:
    - processing, using a Kalman filter, a result of the processing by the recurrent neural network to analyze the user activity over a plurality of periods of time to further determine whether the user activity deviates more than an allowable amount from the expected activity.
  - 3. The computer-implemented method of claim 1, further comprising:
    - comparing the user activity further against peer activity for peers in a peer group including the specified user; and
      
      determining whether the user activity deviates unacceptably from the expected activity based further upon a second deviation of the user activity with respect to the peer activity.
  - 4. The computer-implemented method of claim 3, further comprising:
    - determining the peer group, including the specified user, using an unsupervised classifier trained using monitored activity data with respect to the plurality of documents and a plurality of users of the electronic resource environment.
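
Claims 2 and 3 refine the deviation test in two ways: a Kalman filter analyzes the result over several periods, and the activity is compared against a peer group. The Python sketch below illustrates both ideas under stated assumptions. It uses a scalar, constant-state Kalman filter over per-period deviation scores and a z-score against peer activity levels. The function names, the noise variances, and the thresholds are all hypothetical; the patent does not specify the filter model or how the peer deviation is computed.

```python
from statistics import mean, pstdev


def kalman_smooth(measurements, process_var=0.01, measurement_var=1.0):
    """Smooth per-period deviation scores with a scalar Kalman filter.

    Assumed model: the true deviation level drifts slowly (process_var) and
    each period's score is a noisy reading of it (measurement_var).
    """
    estimate, variance = 0.0, 1.0  # assumed prior: no deviation, unit uncertainty
    smoothed = []
    for z in measurements:
        variance += process_var                       # predict step
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)             # update step
        variance *= 1.0 - gain
        smoothed.append(estimate)
    return smoothed


def peer_deviation(user_value, peer_values):
    """Z-score of the user's activity level against the peer group."""
    mu = mean(peer_values)
    sigma = pstdev(peer_values)
    if sigma == 0:
        return 0.0 if user_value == mu else float("inf")
    return abs(user_value - mu) / sigma


def deviates_unacceptably(period_scores, user_value, peer_values,
                          max_smoothed=3.0, max_peer=3.0):
    """Flag only sustained deviation that is also unusual among peers."""
    smoothed = kalman_smooth(period_scores)
    return (smoothed[-1] > max_smoothed
            and peer_deviation(user_value, peer_values) > max_peer)
```

The filter damps a single-period spike, so in this sketch only deviation that persists across periods and stands out from the peer group is flagged.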

5. A computer-implemented method, comprising:
- training a neural network using historical activity with respect to a plurality of documents stored, on behalf of an entity, in an electronic resource environment;
  
  determining, using the neural network, an expected activity of a specified user with respect to the plurality of documents over at least one determined period of time;
  
  detecting user activity, over at least a determined period of time, with respect to at least a specified document of the plurality of documents, the user activity associated with the specified user;
  
  determining at least one topic associated with the specified document;
  
  comparing the at least one topic against topics associated with the expected activity;
  
  processing the user activity using the neural network to determine whether the user activity deviates from the expected type of activity, the determination based at least in part upon a topic distance, in a topic vector space, between the at least one topic and the topics associated with the expected activity; and
  
  performing a determined action if the user activity is determined to deviate unacceptably from the expected type of activity.
  - 6. The computer-implemented method of claim 5, further comprising:
    - determining the action to be performed based at least in part upon a determined risk score, at least one risk threshold associated with a possible action to be performed.
  - 7. The computer-implemented method of claim 6, wherein the action is one of a plurality of possible actions each associated with a respective range of risk scores, the possible actions including at least one of generating a security alert, logging anomalous activity data, or adjusting access permissions associated with at least one of the specified user or the specified document.
  - 8. The computer-implemented method of claim 5, further comprising:
    - processing, using a Kalman filter, a result of the processing by the neural network to analyze the user activity over a plurality of periods of time to further determine whether the user activity deviates unacceptably from the expected activity.
  - 9. The computer-implemented method of claim 8, further comprising:
    - processing a result of the Kalman filter processing using a trained classifier to determine whether the user activity deviates unacceptably from the expected activity.
  - 10. The computer-implemented method of claim 5, further comprising:
    - comparing the user activity further against peer activity for peers in a peer group including the specified user; and
      
      determining whether the user activity deviates unacceptably from the expected user activity based at least in part upon a second deviation of the user activity with respect to the peer activity.
  - 11. The computer-implemented method of claim 10, further comprising:
    - determining the peer group, including the specified user, using an unsupervised classifier trained using monitored activity data with respect to the plurality of documents and a plurality of users of the electronic resource environment.
  - 12. The computer-implemented method of claim 5, wherein the user activity includes at least one of a type of access, a frequency of access, a total number of access attempts over a period of time, a source address for the access, a topic accessed, a type of document accessed, a location of the access, a day or time of the access, or an application programming interface (API) call used to obtain the access.
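
Claim 5 bases the deviation test on a topic distance in a topic vector space, and claim 7 maps ranges of risk scores to possible actions. The Python sketch below shows one way these two pieces could look. Cosine distance is an assumed choice of metric, since the claims do not name one, and the `ACTIONS` table with its score floors is purely hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two topic vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty topic vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def min_topic_distance(doc_topic, expected_topics):
    """Distance from the accessed document's topic to the nearest expected topic."""
    return min(cosine_distance(doc_topic, t) for t in expected_topics)


# Hypothetical risk-score floors, checked from most to least severe.
ACTIONS = [
    (8.0, "adjust_permissions"),
    (5.0, "security_alert"),
    (2.0, "log_activity"),
]


def select_action(risk_score):
    """Return the action whose risk range contains the score, or None."""
    for floor, action in ACTIONS:
        if risk_score >= floor:
            return action
    return None
```

A small distance means the accessed document is close to topics the user normally works with; a large distance can feed into the deviation decision before an action is selected.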

13. A system, comprising:
- at least one processor; and
  
  memory including instructions that, when executed by the at least one processor, cause the system to:
  
  train a topic model using a set of training documents, each training document of the set having at least one identified topic and an assigned risk score;
  
  crawl a plurality of documents, stored for an entity across an electronic resource environment, to locate and index the plurality of documents;
  
  determine, using at least the topic model, one or more topics for each document of the plurality of documents;
  
  determine a risk score for each document of the plurality of documents using a trained random forest regressor; and
  
  provide security information for access by an authorized user associated with the entity, the security information including information for the identified topics and risk scores for the plurality of documents stored for the entity.
  - 14. The system of claim 13, wherein the instructions when executed further cause the system to:
    - detect updated document data corresponding to at least one of new documents or document changes stored for the entity in the electronic resource environment; and
      
      further train the topic model for each instance of the updated document data.
  - 15. The system of claim 13, wherein the instructions when executed further cause the system to:
    - utilize natural language understanding (NLU) to analyze the plurality of documents to determine one or more topics associated with each document of the plurality of documents.
  - 16. The system of claim 13, wherein the instructions when executed further cause the system to:
    - determine a plurality of elements contained in the plurality of documents, each element of the plurality of elements posing a potential security risk to the entity;
      
      assign a respective risk score for each element of the plurality of elements; and
      
      determine the risk score for a specified document of the plurality of documents based at least in part upon a highest respective risk score for one of the elements associated with the specified document.
  - 17. The system of claim 13, wherein the instructions when executed further cause the system to:
    - detect a new document stored for the entity in the electronic resource environment;
      
      determine one or more topics associated with the new document;
      
      assign the new document to a document bucket associated with other documents having the one or more topics of the new document; and
      
      assign a risk score to the new document based at least in part upon a bucket risk score for the document bucket.
  - 18. The system of claim 13, wherein the instructions when executed further cause the system to:
    - cause new topics to be learned by processing the plurality of documents using the trained topic model.
  - 19. The system of claim 13, wherein the instructions when executed further cause the system to:
    - enable types of documents to be classified by the topic model that are specific to an industry of the entity and do not contain content previously associated with a topic.
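
Claims 16 and 17 describe two ways a document can receive a risk score: from the highest-risk element it contains, or from the bucket of documents that share its topics. The Python sketch below illustrates both. The element names and scores in `ELEMENT_RISK`, the use of a mean as the bucket risk score, and all function names are assumptions made for illustration; the claims do not define them.

```python
from statistics import mean

# Hypothetical per-element risk scores on an assumed 0-10 scale.
ELEMENT_RISK = {"ssn": 9.0, "credit_card": 8.0, "email": 3.0}


def document_risk(elements, element_risk=ELEMENT_RISK):
    """Score a document by the highest-risk element found in it (claim 16)."""
    return max((element_risk.get(e, 0.0) for e in elements), default=0.0)


def risk_for_new_document(topics, buckets):
    """Score a new document from its topic bucket (claim 17).

    `buckets` maps a frozenset of topics to the risk scores of documents
    already in that bucket. Assumption: the bucket risk score is the mean.
    Returns None when no bucket matches the document's topics.
    """
    scores = buckets.get(frozenset(topics))
    if not scores:
        return None
    return mean(scores)
```

For example, a document containing both an email address and a social security number would score 9.0 under this table, regardless of how many low-risk elements it also contains.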

Current Assignee
Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee
Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors
Watson, Alexander; Brim, Daniel; Simmons, Christopher; Radulovic, Paul; Bray, Tyler Stuart; Brinkley, Jennifer Anne; Johnson, Eric; Chin, Victor; Rasgaitis, Jack; Cai, Nai Qin; Gough, Michael; Anger, Max
Primary Examiner(s)
Song, Hosuk

Application Number
US15/443,801
Publication Number
US 20180248895A1
Time in Patent Office
834 Days
Field of Search
726/2-4, 726/22-27
CPC Class Codes

G06F 16/35   Clustering; Classification

G06F 16/951   Indexing; Web crawling tech...

G06F 21/554   involving event detection a...

G06N 20/20   Ensemble learning

G06N 3/044   Recurrent networks, e.g. Ho...

G06N 3/08   Learning methods

G06N 3/084   Backpropagation, e.g. using...

G06N 5/01   Dynamic search techniques; ...

G06N 5/045   Explanation of inference; E...

G06N 7/01   Probabilistic graphical mod...

G06Q 20/4016   involving fraud or risk lev...

H04L 63/083   using passwords cryptograph...

H04L 63/0861   using biometrical features,...

H04L 63/101   Access control lists [ACL]

H04L 63/1416   Event detection, e.g. attac...
