Aggregation and classification of secure data

  • US 9,779,260 B1
  • Filed: 05/30/2013
  • Issued: 10/03/2017
  • Est. Priority Date: 06/11/2012
  • Status: Active Grant
First Claim

1. A method comprising:

  • on a computer system comprising at least one server computer and a plurality of distinct data classification engines, managing and controlling a plurality of data-access credentials;

    wherein the plurality of distinct data classification engines comprise an a priori classification engine, a posteriori classification engine, and a heuristics engine;

    accessing, by the computer system, data from a plurality of sources in a plurality of data formats, the plurality of sources comprising sources that are internal to the computer system and sources that are external to the computer system;

    wherein the accessing comprises using one or more data-access credentials of the plurality of data-access credentials, the one or more data-access credentials being associated with at least a portion of the plurality of data sources;

    abstracting, by the computer system, the data into a standardized format for further analysis, the abstracting comprising selecting the standardized format based on a type of the data;

    applying, by the computer system, a security policy to the data;

    wherein the applying comprises identifying at least a portion of the data for exclusion from storage based on the security policy;

    the computer system filtering from storage any data identified for exclusion;

    storing, by the computer system, the filtered data in the standardized format;

    prior to storing, classifying, using at least one of the plurality of distinct data classification engines, the data based on one or more characteristics of metadata associated with the data;

    wherein the a posteriori classification engine is operable to perform an a posteriori classification, the a posteriori classification comprises utilization of one or more probabilistic algorithms, wherein the one or more probabilistic algorithms determine a probability that a set of data comprises a particular classification based on a combination of probabilistic determinations associated with subsets of the set of data, parameters, and metadata associated with the set of data; and

    wherein the a posteriori classification engine is configured to reclassify, at predefined time intervals, the previously classified data in response to user feedback, wherein the user feedback comprises indications by users of an accuracy of the previously classified data and update the one or more probabilistic algorithms based on the user feedback.
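
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (Record, STANDARD_FORMATS, abstract, excluded_by_policy, features, APosterioriClassifier, ingest, reclassify) is hypothetical, the security policy is reduced to a list of blocked metadata tags, and the probabilistic algorithm is assumed to be a simple log-odds combination of per-feature weights with a logistic-style update, which the claim does not specify. Credential handling, the a priori engine, and the heuristics engine are omitted, and the "predefined time intervals" are left to whatever scheduler calls reclassify.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Record:
    source: str                       # internal or external source name
    fmt: str                          # original data format, e.g. "email"
    text: str
    metadata: dict = field(default_factory=dict)

# Hypothetical mapping from data type to standardized format.
STANDARD_FORMATS = {"email": "message", "chat": "message", "csv": "table"}

def abstract(record):
    """Abstract a record into a standardized format chosen by its type."""
    return {
        "standard_format": STANDARD_FORMATS.get(record.fmt, "document"),
        "source": record.source,
        "text": record.text,
        "metadata": dict(record.metadata),
    }

def excluded_by_policy(item, blocked_tags):
    """Assumed policy: exclude items carrying any blocked metadata tag."""
    return bool(set(item["metadata"].get("tags", [])) & set(blocked_tags))

def features(item):
    """Subsets of the data (words) plus metadata, used as evidence."""
    words = {"word:" + w for w in item["text"].lower().split()}
    meta = {"meta:" + k + "=" + str(v)
            for k, v in item["metadata"].items() if k != "tags"}
    return words | meta

class APosterioriClassifier:
    """Combines per-feature evidence in log-odds space (an assumption)."""
    def __init__(self, learning_rate=0.5):
        self.weights = {}
        self.bias = 0.0
        self.learning_rate = learning_rate

    def probability(self, feats):
        z = self.bias + sum(self.weights.get(f, 0.0) for f in feats)
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, feats, is_member):
        # Logistic-style step toward the label implied by user feedback.
        error = (1.0 if is_member else 0.0) - self.probability(feats)
        self.bias += self.learning_rate * error
        for f in feats:
            self.weights[f] = self.weights.get(f, 0.0) + self.learning_rate * error

def classify(item, classifier, threshold=0.5):
    item["probability"] = classifier.probability(features(item))
    item["label"] = item["probability"] >= threshold

def ingest(records, blocked_tags, classifier):
    """Abstract, filter by policy, classify, then store (in that order)."""
    store = []
    for record in records:
        item = abstract(record)
        if excluded_by_policy(item, blocked_tags):
            continue                  # filtered from storage
        classify(item, classifier)    # classification happens before storing
        store.append(item)
    return store

def reclassify(store, feedback, classifier):
    """feedback: iterable of (index into store, was_accurate) pairs."""
    for index, was_accurate in feedback:
        item = store[index]
        true_label = item["label"] if was_accurate else not item["label"]
        classifier.update(features(item), true_label)
    for item in store:                # re-score all previously classified data
        classify(item, classifier)
```

In this sketch a record tagged with a blocked tag never reaches the store, and a user marking a stored classification as inaccurate shifts the weights so that the next call to reclassify lowers (or raises) the probability for that item and for any other stored items that share its features.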
