Identifying cohorts with anomalous confidential data submissions using matrix factorization and completion techniques

  • US 10,037,437 B1
  • Filed: 06/09/2017
  • Issued: 07/31/2018
  • Est. Priority Date: 06/09/2017
  • Status: Active Grant
First Claim

1. A system comprising:

  • one or more hardware processors;

    a non-transitory computer-readable medium having instructions stored thereon, which, when executed by the one or more hardware processors, cause the system to:

    for each value of a plurality of values of a first attribute of members of a social networking service who have submitted confidential data:

    calculate, using the one or more hardware processors, an allowed range for confidential data values submitted by members having the value for the first attribute, across all values of a second attribute, the confidential data values having been entered on screens of graphical user interfaces and encrypted on an external data source;

    calculate, using the one or more hardware processors, a median for the confidential data values by selecting a midpoint in a distribution of the confidential data values;

    identify a set of candidate data transformation functions, each candidate data transformation function applying a different function on the confidential data values;

    shift, using the one or more hardware processors, the allowed range based on an inferred median confidential data value relative to the median of the confidential data values, the inferred median confidential data value being inferred by optimizing parameters of an objective function that minimize an error function for the objective function to select one candidate data transformation function from the set of candidate data transformation functions, applying the selected one candidate data transformation function to the confidential data values to transform the confidential data values, and calculating a median for the transformed confidential data values;

    for each value of the plurality of values of the first attribute of members of the social networking service who have submitted confidential data and each value of a plurality of values of the second attribute of members of the social networking service who have submitted confidential data:

    determine, using the one or more hardware processors, whether the submitted confidential data from members having the value of the first attribute and the value of the second attribute is outside the shifted allowed range for the value of the first attribute; and

    in response to a determination that the submitted confidential data from members having the value of the first attribute and the value of the second attribute is outside the shifted allowed range for the value of the first attribute, mark, using the one or more hardware processors, a combination of the value of the first attribute and the value of the second attribute as anomalous.
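The flow recited in the claim (a per-value allowed range, a shift by the gap between an inferred median and the observed median, then a per-cohort check) can be illustrated with a minimal sketch. It is not the patented implementation, and several pieces are assumptions the claim does not specify: the allowed range is taken to be Tukey-style quartile fences, each cohort is summarized by its median, and the inferred median is passed in as an input rather than derived by optimizing an objective function over candidate transformation functions. The names `allowed_range`, `find_anomalous_cohorts`, `submissions`, and `inferred_medians` are hypothetical.

```python
from collections import defaultdict
from statistics import median, quantiles


def allowed_range(values, k=1.5):
    """Assumed rule: quartile fences widened by k times the interquartile range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = k * (q3 - q1)
    return q1 - spread, q3 + spread


def find_anomalous_cohorts(submissions, inferred_medians):
    """Return the (first, second) attribute pairs marked as anomalous.

    submissions: iterable of (first_attr, second_attr, value) tuples.
    inferred_medians: dict mapping each first_attr value to an inferred
        median. Here it is supplied by the caller (an assumption); the
        claim derives it from a selected data transformation function.
    """
    by_first = defaultdict(list)   # values pooled across all second-attribute values
    by_cohort = defaultdict(list)  # values for each (first, second) combination
    for first, second, value in submissions:
        by_first[first].append(value)
        by_cohort[(first, second)].append(value)

    # Shift each allowed range by (inferred median - observed median).
    shifted = {}
    for first, values in by_first.items():
        low, high = allowed_range(values)
        shift = inferred_medians[first] - median(values)
        shifted[first] = (low + shift, high + shift)

    # Assumed summary statistic: compare each cohort's median to the shifted range.
    anomalous = set()
    for (first, second), values in by_cohort.items():
        low, high = shifted[first]
        if not low <= median(values) <= high:
            anomalous.add((first, second))
    return anomalous


# Hypothetical data: first attribute = title, second attribute = region.
submissions = [
    ("engineer", "A", 100), ("engineer", "A", 105), ("engineer", "A", 98),
    ("engineer", "B", 95), ("engineer", "B", 110), ("engineer", "B", 102),
    ("engineer", "C", 500),
]
flagged = find_anomalous_cohorts(submissions, {"engineer": 102})
```

With the inferred median equal to the observed median, the range is not shifted and only the ("engineer", "C") combination falls outside it. Supplying a different inferred median moves the whole range, which changes which combinations are marked.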
