Estimating data topics of computers using external text content and usage information of the users

  • US 9,600,577 B2
  • Filed: 09/26/2013
  • Issued: 03/21/2017
  • Est. Priority Date: 08/01/2013
  • Status: Expired due to Fees
First Claim

1. A system to automatically estimate content topics of inaccessible content in a computer system without inspecting data files in the computer system comprising:

  • a processor; and

    a module operable to execute on the processor and further operable to gather accessible content, the module further operable to analyze the accessible content to estimate one or more topics of the inaccessible content without inspecting the inaccessible content, the inaccessible content comprising privileged data protected from access due to one or more of data privacy and computer security, wherein the one or more topics of the inaccessible content is estimated while preserving the one or more of data privacy and computer security, the module estimating the one or more topics at least by:

    identifying users of the computer system and access counts of the users accessing the computer system, retrieving the accessible content generated by the users of the computer system, analyzing user information and external text content associated with the users that are available in an organization's online space outside of the computer system;

    for each of the users, generating a document comprising a bag-of-words representation for the inaccessible content generated by the user, the bag-of-words representation comprising words occurring in the accessible content and counts of the words, the counts of the words scaled as a function of a number of occurrences of a word in the accessible content and a computer system access count associated with the user;

    generating an asset document associated with the computer system by aggregating the document associated with each user for all users; and

    executing a topic modeling algorithm on the asset document that estimates the one or more topics, wherein based on the one or more topics, the module automatically determines security level of information stored in the computer system.
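
The per-user document and asset document steps recited in the claim can be illustrated with a minimal sketch. This is not the patented implementation. The claim only says that word counts are scaled "as a function of" the word's occurrences and the user's access count, so the sketch assumes a simple product of the two. The names `tokenize`, `user_document`, `asset_document`, and the sample users are hypothetical. The topic modeling step is not shown; the resulting asset document would be the input to whatever topic model is chosen.

```python
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase alphabetic words only.
    return re.findall(r"[a-z]+", text.lower())


def user_document(texts, access_count):
    # Bag-of-words over a user's accessible (external) text content.
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    # Assumed scaling function: occurrences multiplied by the user's
    # access count on the computer system. The claim does not fix this.
    return Counter({word: n * access_count for word, n in counts.items()})


def asset_document(user_docs):
    # Aggregate every user's scaled bag-of-words into one document
    # associated with the computer system.
    total = Counter()
    for doc in user_docs:
        total.update(doc)
    return total


# Illustrative, made-up data: accessible text and access counts per user.
users = {
    "alice": {"access_count": 3,
              "texts": ["Quarterly payroll report", "Payroll audit"]},
    "bob": {"access_count": 1,
            "texts": ["Server patch schedule"]},
}

docs = [user_document(u["texts"], u["access_count"]) for u in users.values()]
asset = asset_document(docs)
print(asset.most_common(3))
```

Under this assumed scaling, words written by users who access the system often dominate the asset document, so a topic model run on it would lean toward the interests of the system's heaviest users without ever reading the files stored there.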
