Systems and methods for partitioning sets of features for a Bayesian classifier

  • US 10,163,056 B2
  • Filed: 05/23/2016
  • Issued: 12/25/2018
  • Est. Priority Date: 08/29/2014
  • Status: Active Grant
First Claim

1. A method of building a partition list of feature subsets having probabilistic interdependence among features in the feature subsets for use with a classifier to detect fraudulent user registrations, the method including:

  • accessing an input set including an input tuple comprising feature-values assigned to features, wherein the features are of user registration data records and wherein the feature-values of the input tuple are values from a user registration data record;

    identifying, from the input tuple, input subtuples comprising unique feature subsets;

    accessing a tuple instance count data structure stored in memory that provides counts of tuples in a data set;

    computing class entropy scores for the identified input subtuples that have at least a threshold support count of instances in the tuple instance count data structure, wherein the class entropy scores are based on class labels of the input subtuples, and wherein a class label for an input subtuple has a class value that indicates either a fraudulent user registration or a non-fraudulent user registration;

    building the partition list including:

    ordering at least some of the scored input subtuples by non-decreasing class entropy score; and

    traversing the ordered input subtuples, including:

    adding a feature subset of a current ordered input subtuple to the partition list, and pruning, from subsequent ordered input subtuples, input subtuples including features that overlap with features of the feature subset corresponding to the current ordered input subtuple;

    storing the partition list in a memory, whereby it becomes available to use with the classifier; and

    using the partition list with the classifier to classify additional user registration data records as either fraudulent or non-fraudulent.
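
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (`subtuples`, `count_tuples`, `class_entropy`, `build_partition_list`), the `min_support` and `max_size` parameters, the feature names, and the toy records are all assumptions made for illustration. The sketch assumes the tuple instance count data structure is a dictionary mapping each subtuple to per-class counts, and it assumes Shannon entropy over those class counts as the class entropy score. Pruning is realized by skipping any later subtuple whose features overlap a feature subset already added. The final classification step is not shown.

```python
from collections import Counter
from itertools import combinations
from math import log2


def subtuples(record, max_size=2):
    """Yield subtuples of a record as tuples of sorted (feature, value) pairs."""
    items = sorted(record.items())
    for size in range(1, max_size + 1):
        yield from combinations(items, size)


def count_tuples(records, labels, max_size=2):
    """Assumed count structure: subtuple -> Counter of class labels."""
    counts = {}
    for record, label in zip(records, labels):
        for sub in subtuples(record, max_size):
            counts.setdefault(sub, Counter())[label] += 1
    return counts


def class_entropy(label_counts):
    """Shannon entropy (in bits) of the class labels seen with a subtuple."""
    total = sum(label_counts.values())
    return -sum((c / total) * log2(c / total) for c in label_counts.values() if c)


def build_partition_list(input_tuple, counts, min_support=2, max_size=2):
    """Greedily pick non-overlapping feature subsets in non-decreasing entropy order."""
    scored = []
    for sub in subtuples(input_tuple, max_size):
        label_counts = counts.get(sub)
        # Score only subtuples that meet the threshold support count.
        if label_counts and sum(label_counts.values()) >= min_support:
            scored.append((class_entropy(label_counts), sub))
    scored.sort(key=lambda pair: pair[0])  # stable sort, lowest entropy first

    partition, used = [], set()
    for _, sub in scored:
        features = frozenset(name for name, _ in sub)
        if features & used:
            continue  # pruned: overlaps a feature subset already in the list
        partition.append(features)
        used |= features
    return partition


# Hypothetical registration records with hypothetical feature names.
records = [
    {"domain": "x.com", "country": "US", "agent": "bot"},
    {"domain": "x.com", "country": "US", "agent": "web"},
    {"domain": "x.com", "country": "CA", "agent": "web"},
    {"domain": "y.org", "country": "US", "agent": "web"},
    {"domain": "y.org", "country": "US", "agent": "bot"},
]
labels = ["fraud", "fraud", "ok", "ok", "ok"]

counts = count_tuples(records, labels)
new_registration = {"domain": "x.com", "country": "US", "agent": "web"}
partition = build_partition_list(new_registration, counts)
print(partition)
```

In this toy data the pair (country, domain) occurs only with fraudulent labels, so it has the lowest class entropy and is added first. Every other subtuple that mentions country or domain is then pruned, leaving the agent feature as its own subset.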
