Internal dataset-based outlier detection for confidential data in a computer system

  • US 10,025,939 B2
  • Filed: 09/28/2016
  • Issued: 07/17/2018
  • Est. Priority Date: 09/28/2016
  • Status: Active Grant
First Claim

1. A system comprising:

  • one or more hardware processors;

    a non-transitory computer-readable medium having instructions stored thereon, which, when executed by the one or more hardware processors, cause the system to:

    receive, via a first computerized user interface implemented as a screen of a graphical user interface, a submission of a confidential data value of a first confidential data type from a first user, entered into a field of the screen of the graphical user interface;

    identify, using the one or more hardware processors, one or more attributes of the first user;

    retrieve a plurality of previously submitted confidential data values of a first confidential data type for a cohort matching the one or more attributes of the first user, the previously submitted confidential data values having been encrypted on an external data source, the cohort being a grouping of data pertaining to a combination of user attributes for users who submitted the confidential data values;

    calculate, using the one or more hardware processors, a plurality of percentiles for the plurality of previously submitted confidential data values;

    calculate, using the one or more hardware processors, an interquartile range for a first and a second of the plurality of percentiles, wherein the value for the first of the plurality of percentiles is lower than the value for the second of the plurality of percentiles;

    compute, using the one or more hardware processors, a lower limit for the first confidential data type and the cohort by taking a maximum of zero or a difference between the value for the first of the plurality of percentiles and a product of a preset alpha parameter and the interquartile range;

    determine, using the one or more hardware processors, whether the confidential data value submitted by the first user is an outlier by determining if the confidential data value submitted by the user is lower than the lower limit; and

    in response to a determination that the confidential data value submitted by the first user is not an outlier, permitting, using the one or more hardware processors, the confidential data value submitted by the first user to be used for insights provided to other users.
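
The lower-limit test recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The claim does not say which two percentiles are used or what value the alpha parameter takes, so the 25th and 75th percentiles, alpha = 1.5, and the function names `lower_limit` and `is_outlier` are all assumptions chosen for illustration. Cohort retrieval and decryption are out of scope here; the cohort is passed in as a plain list of numbers.

```python
from statistics import quantiles


def lower_limit(cohort_values, alpha=1.5, low_pct=25, high_pct=75):
    """Return max(0, P_low - alpha * (P_high - P_low)) for a cohort.

    alpha, low_pct and high_pct are illustrative assumptions; the claim
    only requires a preset alpha and two percentiles with P_low < P_high.
    """
    # quantiles(n=100) returns 99 cut points; index p - 1 is the p-th percentile.
    cuts = quantiles(cohort_values, n=100, method="inclusive")
    p_low = cuts[low_pct - 1]
    p_high = cuts[high_pct - 1]
    interquartile_range = p_high - p_low
    # Clamp at zero, as the claim takes the maximum of zero and the difference.
    return max(0.0, p_low - alpha * interquartile_range)


def is_outlier(value, cohort_values, alpha=1.5):
    """A submitted value is an outlier if it is lower than the lower limit."""
    return value < lower_limit(cohort_values, alpha)


# Hypothetical cohort of previously submitted values (made-up numbers).
cohort = [50_000, 60_000, 70_000, 80_000, 90_000]
submitted = 25_000
if not is_outlier(submitted, cohort):
    print("value may be used for insights")
else:
    print("value rejected as an outlier")
```

Under these assumptions, a submission that is not flagged would be permitted to contribute to insights shown to other users. The clamp at zero keeps the limit meaningful for data types that cannot be negative.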
