System, method, and computer-accessible medium for evaluating multi-dimensional synthetic data using integrated variants analysis

  • US 10,635,939 B2
  • Filed: 10/04/2018
  • Issued: 04/28/2020
  • Est. Priority Date: 07/06/2018
  • Status: Active Grant
First Claim

1. A non-transitory computer-accessible medium having stored thereon computer-executable instructions for evaluating at least one synthetic dataset, wherein, when a computer arrangement executes the instructions, the computer arrangement is configured to perform procedures comprising:

    receiving at least one original dataset;

    receiving the at least one synthetic dataset;

    training at least one model using the at least one original dataset and the at least one synthetic dataset;

    generating a statistical correlation score based on the at least one synthetic dataset and the at least one original dataset;

    generating an evaluation score by evaluating the at least one synthetic dataset based on the training of the at least one model, wherein the evaluation score includes (i) a statistical correlation score, (ii) a data similarity score, and (iii) a data quality score;

    determining a region for the at least one synthetic dataset based on the evaluation score, wherein the region includes one of (i) a normal region where the at least one synthetic dataset is unlikely to contain synthetic data that is similar to original data within the at least one original dataset, (ii) a warning region where the at least one synthetic dataset potentially contains the synthetic data that is similar to the original data, or (iii) a red flag region where the at least one synthetic dataset is likely to contain the synthetic data that is similar to the original data; and

    generating a suggestion based on the evaluation score and the determined region, wherein the suggestion includes one of (i) indicating that the at least one synthetic dataset is adequate or (ii) warning that the at least one synthetic dataset potentially contains information similar to the at least one original dataset.
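
The last two procedures of the claim (determining a region from the evaluation score, then generating a suggestion) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the names `EvaluationScore`, `Region`, `determine_region`, and `generate_suggestion`, the choice to normalize each score to the range 0 to 1, the choice to let the data similarity score alone select the region, and the two threshold values. The claim itself does not say how the three component scores are combined or where the region boundaries lie.

```python
from dataclasses import dataclass
from enum import Enum


class Region(Enum):
    NORMAL = "normal"
    WARNING = "warning"
    RED_FLAG = "red flag"


@dataclass(frozen=True)
class EvaluationScore:
    """The three component scores named in the claim.

    Assumption: each score is normalized to [0, 1], and a higher
    data_similarity means the synthetic data is closer to the original.
    """
    statistical_correlation: float
    data_similarity: float
    data_quality: float


# Hypothetical thresholds; the claim does not specify any values.
WARNING_THRESHOLD = 0.5
RED_FLAG_THRESHOLD = 0.8


def determine_region(score: EvaluationScore) -> Region:
    """Map an evaluation score to one of the three claimed regions.

    Assumption: only the data similarity score drives the decision.
    """
    if score.data_similarity >= RED_FLAG_THRESHOLD:
        return Region.RED_FLAG
    if score.data_similarity >= WARNING_THRESHOLD:
        return Region.WARNING
    return Region.NORMAL


def generate_suggestion(score: EvaluationScore, region: Region) -> str:
    """Return one of the two suggestion types listed in the claim."""
    if region is Region.NORMAL:
        return (
            "The synthetic dataset is adequate "
            f"(similarity {score.data_similarity:.2f}, "
            f"quality {score.data_quality:.2f})."
        )
    return (
        "Warning: the synthetic dataset potentially contains information "
        f"similar to the original dataset ({region.value} region, "
        f"similarity {score.data_similarity:.2f})."
    )


if __name__ == "__main__":
    example = EvaluationScore(
        statistical_correlation=0.9, data_similarity=0.6, data_quality=0.85
    )
    region = determine_region(example)
    print(region.value)
    print(generate_suggestion(example, region))
```

In this sketch both the warning and red flag regions produce the warning-type suggestion, since the claim lists only two suggestion types; a real implementation might distinguish them further.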
