Method for generating synthetic data sets at scale with non-redundant partitioning

  • US 10,891,311 B2
  • Filed: 10/14/2016
  • Issued: 01/12/2021
  • Est. Priority Date: 10/14/2016
  • Status: Active Grant
First Claim

1. A method comprising:

  • receiving, by a clustering module, a plurality of data sets, wherein each data set of the plurality of data sets includes a plurality of attributes;

    partitioning, by the clustering module, the plurality of data sets into a plurality of clustered data sets including at least a first clustered data set and a second clustered data set, wherein each data set of the plurality of data sets is partitioned into one of the plurality of clustered data sets;

    assigning, by a training module, a respective stochastic model to each respective clustered data set of the plurality of clustered data sets including:

    assigning a first stochastic model to the first clustered data set, and assigning a second stochastic model to the second clustered data set;

    selecting, by a first machine including a first memory and one or more processors in communication with the first memory, the first clustered data set and the first stochastic model;

    selecting, by a second machine that is different from the first machine, the second machine including a second memory and one or more processors in communication with the second memory, the second clustered data set and the second stochastic model;

    generating, by the first machine with the first stochastic model, a first synthetic data set, wherein the first synthetic data set has generated data for each one of the plurality of attributes;

    generating, by the second machine with the second stochastic model, a second synthetic data set, wherein the second synthetic data set has generated data for each one of the plurality of attributes; and

    testing at least one of an application and a database using each of the first synthetic data set and the second synthetic data set.
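
The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the attribute names, the threshold-based partitioning rule, the per-attribute Gaussian used as the stochastic model, and the function `application_under_test` are all assumptions chosen for illustration. The two "machines" are stood in for by two independent generator calls with separate random seeds; in a real deployment each call would run on a different host.

```python
import random
import statistics
from collections import defaultdict

ATTRIBUTES = ("age", "balance")  # hypothetical attribute names


def partition(records, key_attr, threshold):
    """Clustering step (assumed rule): each record lands in exactly one cluster."""
    clusters = defaultdict(list)
    for rec in records:
        label = "first" if rec[key_attr] < threshold else "second"
        clusters[label].append(rec)
    return dict(clusters)


def fit_model(cluster):
    """Training step (assumed model): one independent Gaussian per attribute."""
    model = {}
    for attr in ATTRIBUTES:
        values = [rec[attr] for rec in cluster]
        model[attr] = (statistics.mean(values), statistics.pstdev(values))
    return model


def generate(model, count, seed):
    """Generation step: draw records that carry a value for every attribute."""
    rng = random.Random(seed)  # separate seed per simulated machine
    return [
        {attr: rng.gauss(mu, sigma) for attr, (mu, sigma) in model.items()}
        for _ in range(count)
    ]


def application_under_test(rows):
    """Hypothetical consumer: accepts rows only if all attributes are present."""
    return all(set(row) == set(ATTRIBUTES) for row in rows)


records = [
    {"age": 23, "balance": 120.0},
    {"age": 31, "balance": 450.5},
    {"age": 38, "balance": 300.0},
    {"age": 52, "balance": 9800.0},
    {"age": 61, "balance": 15200.0},
    {"age": 47, "balance": 7600.0},
]

clusters = partition(records, key_attr="age", threshold=40)
models = {label: fit_model(rows) for label, rows in clusters.items()}

# Each simulated machine selects one cluster's model and generates from it.
first_synthetic = generate(models["first"], count=5, seed=1)
second_synthetic = generate(models["second"], count=5, seed=2)

# Testing step: both synthetic data sets are fed to the consumer.
tests_passed = all(
    application_under_test(rows) for rows in (first_synthetic, second_synthetic)
)
```

Because every input record is placed in exactly one cluster, the per-cluster models are trained on disjoint data, so the two generators can run independently without coordinating with each other.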
