×

System for identifying anomalies by automatically generating and analyzing a structure

  • US 10,140,319 B2
  • Filed: 07/26/2016
  • Issued: 11/27/2018
  • Est. Priority Date: 07/26/2016
  • Status: Active Grant
First Claim
Patent Images

1. A system for automatically analyzing a data model and identifying anomalies within the data model, the system comprising:

  • a memory device comprising computer-readable program code;

    a communication device; and

    a processing device operatively coupled to the memory device and the communication device, wherein the processing device is configured to execute the computer-readable program code to;

    based on input received from a graphical user interface, retrieve data comprising a plurality of elements and a plurality of tables, the data retrieved from a database;

    identify elements within the plurality of tables as facts or dimensions by analyzing the elements and the plurality of tables based on a data type corresponding to the elements, a column name corresponding to the elements, and one or more constraints corresponding to the elements within the plurality of tables, wherein identifying the elements comprises;

    identifying when the elements comprise a qualitative non-numerical data type and identifying the elements as the dimensions, and identifying when the elements have a quantitative numerical type and identifying the elements as the facts;

    identifying when the column name of the elements meet a list of terms stored as associated with the facts or the dimensions and identifying the elements as the facts or the dimensions based on the lists of terms; and

    identifying when the one or more constraints comprise rules that limit the elements to entries that are the facts or the dimensions based on the rules and identifying the elements as the facts or the dimensions based on the rules;

    identify each of the tables associated with the data as either a fact table or a dimension table based on identifying the elements within the tables as the facts or the dimensions, wherein the plurality of tables comprises at least one fact table and at least one dimension table;

    determine that a set of elements within a single column of a first dimension table is a primary key for the dimension table by determining that each element of the set is unique to one another;

    identify a reference key corresponding to the primary key;

    based on identifying the reference key corresponding to the primary key, determine at least one relationship between at least two tables;

    generate the data model of the data retrieved from the database;

    identify data anomalies from the at least one relationship, wherein data anomalies are ragged hierarchies, many to many relationships, and double counting;

    generate a data analysis report comprising;

    generating a list of columns of the tables associated with the data model, a list of other tables that share at least one column with the at least one dimension table, one or more lists of data anomalies, and a list of parents and children, wherein a child is a column associated with the reference key and a parent is a column associated with the primary key;

    compiling the list of the columns, the list of other tables, the one or more lists of data anomalies, and the list of parents and children; and

    present the data analysis report to a user.

View all claims
  • 1 Assignment
Timeline View
Assignment View
    ×
    ×