
Validating code of an extract, transform and load (ETL) tool

  • US 9,547,702 B2
  • Filed: 11/30/2015
  • Issued: 01/17/2017
  • Est. Priority Date: 07/15/2014
  • Status: Active Grant
First Claim

1. A method of validating code of an extract, transform and load (ETL) tool, the method comprising the steps of:

  • responsive to a receipt of naming, coding, and performance standards for the code of the ETL tool and an export of the code of the ETL tool to a job definition file, a computer parsing the code of the ETL tool in the job definition file;

    the computer determining violations of the naming, coding, and performance standards in part by determining the parsed code of the ETL tool does not match the naming, coding, and performance standards;

    the computer generating a report which identifies the violations;

    based at least in part on a review of the report and a rework of the code of the ETL tool to comply with the naming, coding and performance standards and responsive to an export of the reworked code of the ETL tool to another job definition file, the computer parsing the reworked code of the ETL tool in the other job definition file, determining that the parsed reworked code of the ETL tool does not include the violations of the naming, coding and performance standards, and generating a second report that indicates that the reworked code of the ETL tool does not include the violations;

    the computer receiving maximum numbers of aggregator stages of a job included in the code of the ETL tool, transformer stages of the job, occurrences of repartitioning of data sets in the job, sort stages of the job, database read/write operations of the job, and sequential file read/write operations of the job;

    the computer receiving a minimum ratio of a number of stages of the job to a number of stages of the job that are annotated;

    the computer receiving minimum sizes of a transaction for any insert, update or delete operation of the job and an array employed for any insert, update or delete operation of the job; and

    based on aggregator stages of the job exceeding the maximum number of aggregator stages, transformer stages of the job exceeding the maximum number of transformer stages of the job, occurrences of repartitioning of data sets in the job exceeding the maximum number of occurrences of repartitioning of data sets in the job, sort stages of the job exceeding the maximum number of sort stages, database read/write operations of the job exceeding the maximum number of database read/write operations, sequential file read/write operations of the job exceeding the maximum number of sequential file read/write operations, a ratio of the number of stages of the job to the number of stages of the job that are annotated being less than the minimum ratio of the number of stages to the number of stages that are annotated, a size of a transaction for an insert, update or delete operation of the job being less than the minimum size of the transaction, or a size of an array employed for an insert, update or delete operation of the job being less than the minimum size of the array, the computer determining a violation of a performance standard included in the naming, coding, and performance standards.
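
The performance-standard test recited in the last four steps of claim 1 can be illustrated with a short Python sketch. This is only one possible reading of the claim language, not the patented implementation. The class names `PerformanceLimits` and `JobMetrics`, the function `performance_violations`, and all field names are hypothetical; the claim does not name them. The sketch also assumes that the counts have already been extracted from the parsed job definition file, and that the annotation-ratio check is skipped when no stage is annotated, a case the claim does not address.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PerformanceLimits:
    """Thresholds received by the computer (hypothetical field names)."""
    max_aggregator_stages: int
    max_transformer_stages: int
    max_repartitions: int
    max_sort_stages: int
    max_db_ops: int
    max_seq_file_ops: int
    min_stage_to_annotated_ratio: float
    min_transaction_size: int
    min_array_size: int


@dataclass
class JobMetrics:
    """Counts assumed to come from parsing one job in the job definition file."""
    aggregator_stages: int
    transformer_stages: int
    repartitions: int
    sort_stages: int
    db_ops: int
    seq_file_ops: int
    total_stages: int
    annotated_stages: int
    # One entry per insert, update or delete operation of the job.
    transaction_sizes: List[int] = field(default_factory=list)
    array_sizes: List[int] = field(default_factory=list)


def performance_violations(job: JobMetrics, limits: PerformanceLimits) -> List[str]:
    """Return one message per condition of the claim that the job meets."""
    violations = []

    # Counts that must not exceed their maximum.
    maxima = [
        ("aggregator stages", job.aggregator_stages, limits.max_aggregator_stages),
        ("transformer stages", job.transformer_stages, limits.max_transformer_stages),
        ("repartitioning occurrences", job.repartitions, limits.max_repartitions),
        ("sort stages", job.sort_stages, limits.max_sort_stages),
        ("database read/write operations", job.db_ops, limits.max_db_ops),
        ("sequential file read/write operations", job.seq_file_ops, limits.max_seq_file_ops),
    ]
    for label, count, maximum in maxima:
        if count > maximum:
            violations.append(f"{label}: {count} exceeds maximum {maximum}")

    # Ratio of all stages to annotated stages, as worded in the claim.
    # Assumption: the check is skipped when no stage is annotated.
    if job.annotated_stages > 0:
        ratio = job.total_stages / job.annotated_stages
        if ratio < limits.min_stage_to_annotated_ratio:
            violations.append(
                f"stage-to-annotated ratio: {ratio:.2f} is below minimum "
                f"{limits.min_stage_to_annotated_ratio}"
            )

    # Sizes that must not fall below their minimum.
    for size in job.transaction_sizes:
        if size < limits.min_transaction_size:
            violations.append(
                f"transaction size: {size} is below minimum {limits.min_transaction_size}"
            )
    for size in job.array_sizes:
        if size < limits.min_array_size:
            violations.append(
                f"array size: {size} is below minimum {limits.min_array_size}"
            )

    return violations
```

Because the claim joins its conditions with "or", any single non-empty result is enough to determine a violation of a performance standard. Returning the individual messages rather than a boolean is a design choice of this sketch, made so that the result could feed the violation report described in the earlier steps.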
