Methods and systems for assessing data quality

  • US 10,248,672 B2
  • Filed: 09/19/2011
  • Issued: 04/02/2019
  • Est. Priority Date: 09/19/2011
  • Status: Active Grant
First Claim

1. A method, comprising:

  • selecting, by a microprocessor, a group of proposed critical data elements from a plurality of proposed critical data elements consisting at least in part of type of account, original balance, origination date, number of deposits, and number of loans based at least in part on ranking each of the plurality of proposed critical data elements according to weighted criteria consisting at least in part of ease of access to each proposed critical data element, regulatory risk associated with each proposed critical data element, financial risk associated with each proposed critical data element, and reputation risk associated with each proposed critical data element;

    collecting, by the microprocessor, samples of data for each of the proposed critical data elements in said group of proposed critical data elements from a database storing a population of data elements representing attributes of each of a plurality of different financial transactions;

    identifying, by the microprocessor, a portion of said group of proposed critical data elements based at least in part on a ranking of respective degrees of correlation between said data samples for each of the proposed critical data elements in said group of proposed critical data elements;

    generating, by the microprocessor, a plurality of different, overlapping sets of data quality rules at least in part in terms of data completeness and data validity for each of the proposed critical data elements in said portion of said group of proposed critical data elements, each set of data quality rules comprising a different number of data quality rules for the same proposed critical data elements in said portion of said group of proposed critical data elements;

    identifying, by the microprocessor, one of the plurality of different, overlapping sets of data quality rules for monitoring a quality of data in said database based at least in part on a difference between a value for each of said sets of data quality rules as a function of accuracy or completeness of data in the database and a sum of a cost of creating each set of data quality rules as a function of number, complexity, and interdependency of rules in each of said sets of data quality rules;

    monitoring, by the microprocessor, the quality of data within said database using said identified one of the plurality of different, overlapping sets of data quality rules;

    identifying, by the microprocessor, critical data elements that produce a pre-defined high number of outliers in said data within said database based on said monitoring the quality of data in said database indicative of a likelihood that a process is out of control; and

    identifying, by the microprocessor, causes for the pre-defined high number of outliers produced by said critical data elements in said data within said database.
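
Three of the claimed steps can be illustrated with a minimal Python sketch: ranking proposed critical data elements by weighted criteria, picking the rule set with the largest difference between value and cost, and flagging elements whose outlier count reaches a pre-defined threshold. This is not the patented implementation. The weights, scores, rule-set figures, threshold, and all function and class names below are hypothetical, and the cost of a rule set is collapsed into a single number rather than being derived from the number, complexity, and interdependency of its rules.

```python
from dataclasses import dataclass

# Hypothetical weights for the four criteria named in the claim.
WEIGHTS = {
    "ease_of_access": 0.1,
    "regulatory_risk": 0.4,
    "financial_risk": 0.3,
    "reputation_risk": 0.2,
}


def rank_elements(scores, weights=WEIGHTS):
    """Return element names ordered by descending weighted score."""
    def total(name):
        return sum(weights[c] * scores[name][c] for c in weights)
    return sorted(scores, key=total, reverse=True)


@dataclass
class RuleSet:
    name: str
    value: float  # assumed benefit from improved accuracy or completeness
    cost: float   # assumed total cost of creating the rules


def select_rule_set(rule_sets):
    """Pick the rule set with the largest value minus cost."""
    return max(rule_sets, key=lambda rs: rs.value - rs.cost)


def high_outlier_elements(outlier_counts, threshold):
    """Return elements whose outlier count meets the threshold."""
    return [name for name, n in outlier_counts.items() if n >= threshold]


# Illustrative inputs only; none of these numbers come from the patent.
scores = {
    "type_of_account": {"ease_of_access": 5, "regulatory_risk": 2,
                        "financial_risk": 2, "reputation_risk": 1},
    "original_balance": {"ease_of_access": 3, "regulatory_risk": 4,
                         "financial_risk": 5, "reputation_risk": 3},
    "origination_date": {"ease_of_access": 4, "regulatory_risk": 3,
                         "financial_risk": 2, "reputation_risk": 2},
}
ranking = rank_elements(scores)

rule_sets = [
    RuleSet("small", value=10.0, cost=3.0),
    RuleSet("medium", value=16.0, cost=6.0),
    RuleSet("large", value=18.0, cost=12.0),
]
chosen = select_rule_set(rule_sets)

flagged = high_outlier_elements(
    {"original_balance": 12, "origination_date": 3}, threshold=10
)

print(ranking)
print(chosen.name)
print(flagged)
```

With these made-up inputs, the medium rule set is chosen because adding more rules raises cost faster than value, which is the trade-off the claim describes between overlapping rule sets of different sizes.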
