Techniques for application data scrubbing, reporting, and analysis

  • US 8,838,652 B2
  • Filed: 03/18/2008
  • Issued: 09/16/2014
  • Est. Priority Date: 03/18/2008
  • Status: Expired due to Fees
First Claim

1. A machine-implemented method for executing on a machine, comprising:

  • acquiring, by the machine, a first schema for a first data source and a second schema for a second data source;

    using, by the machine, the first and second schemas to parse both data sources based on syntax and structure defined in the first and second schemas to detect data types and patterns for the data types in both the data sources;

    matching, by the machine, some first patterns associated with the first data source to other second patterns associated with the second data source in response to matching rules, the matching rules provide a link between the patterns detected in the first data source and the second data source, the matching rules obtained from a meta schema that ties the first schema to the second schema and the matching rules are acquired in response to a predefined policy that associates patterns or data types between the two schemas and the matching rules permit a first data type in the first data source to be mapped to a second data type in the second data source even when the first data type is different from the second data type;

    generating, by the machine, a report that identifies the matched first patterns of the first data source to the second patterns of the second source and the report includes metrics for the first data source and the second data source, the metrics including pattern variations for both of the data types, frequency of a particular pattern for a particular one of the data types that occurs within one of the data sources, identifying data source entries where sub data types are missing under a parent data type when required to present in accordance with one of the data source schemas; and

    iterating, by the machine and in response to interaction with a data analyst, the method processing based on modifications supplied by the data analyst for the report and the matching rules based on the metrics to produce a revised report for each iteration and on a last iteration producing a master data source that conforms to enterprise data policies and a final revised report that reports on a state of the first data source and the second data source that comprise the master data source.
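The parsing, matching, and reporting steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Every name in it (`pattern_of`, `profile`, `missing_children`, `build_report`, the sample records, and the `phone`/`tel` rule) is hypothetical, and the digit-and-letter "shape" signature is only one assumed way to detect patterns. The analyst-driven iteration step is not modeled; under this sketch it would amount to editing `rules` and calling `build_report` again.

```python
from collections import Counter

def pattern_of(value: str) -> str:
    """Reduce a value to a shape signature: digits -> 9, letters -> A, others kept."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)

def profile(records, field):
    """Count how often each pattern occurs for one field of a data source."""
    return Counter(pattern_of(str(r[field])) for r in records if field in r)

def missing_children(records, parent, required):
    """Indexes of entries whose parent field lacks a required sub-field."""
    return [i for i, r in enumerate(records)
            if any(child not in r.get(parent, {}) for child in required)]

def build_report(first, second, matching_rules, parent, required):
    """Pair fields according to the rules and collect per-source metrics."""
    report = {"matches": [], "missing": {}}
    for first_field, second_field in matching_rules:
        p1 = profile(first, first_field)
        p2 = profile(second, second_field)
        report["matches"].append({
            "first_field": first_field,
            "second_field": second_field,
            "shared_patterns": sorted(set(p1) & set(p2)),
            "first_frequencies": dict(p1),    # pattern variations and counts
            "second_frequencies": dict(p2),
        })
    report["missing"]["first"] = missing_children(first, parent, required)
    report["missing"]["second"] = missing_children(second, parent, required)
    return report

# Hypothetical sample data: "phone" is a string, "tel" is sometimes an integer,
# so the rule links fields whose data types differ.
first = [
    {"phone": "555-1234", "address": {"city": "Provo", "zip": "84604"}},
    {"phone": "5551234", "address": {"city": "Orem"}},
]
second = [
    {"tel": 5559876, "address": {"city": "Lehi", "zip": "84043"}},
    {"tel": "555-9876", "address": {"city": "Sandy", "zip": "84070"}},
]
rules = [("phone", "tel")]  # stands in for rules drawn from a meta schema

report = build_report(first, second, rules, parent="address", required=["city", "zip"])
```

Here the second entry of `first` is flagged because its `address` lacks the required `zip` sub-field, and the per-field frequency tables show two pattern variations for the phone number in each source.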

  • 9 Assignments