PLUGGABLE FAULT DETECTION TESTS FOR DATA PIPELINES

  • US 20170220403A1
  • Filed: 10/07/2015
  • Published: 08/03/2017
  • Est. Priority Date: 09/14/2015
  • Status: Active Grant
First Claim
1. A method for detecting faults related to a pipeline of a data pipeline system, the method comprising:

  • a fault detection system receiving a plugin comprising a) one or more instructions representing a test to perform on data processed by the data pipeline system and b) one or more configuration points;

    wherein the data pipeline system receives source data from one or more data sources and applies one or more transformations to the source data to produce transformed data before storage of the transformed data in one or more data sinks;

    the fault detection system receiving, via a first graphical user interface, one or more settings corresponding to the one or more configuration points;

    the fault detection system receiving test data from the data pipeline system, wherein the test data comprises a sample of the transformed data after the one or more transformations;

    the fault detection system determining to run the test defined by the plugin on the data pipeline system including executing the one or more instructions of the plugin based on the one or more settings for the one or more configuration points and the test data, wherein a result of executing the one or more instructions includes at least a test result status indicator;

    wherein the transformed data comprises tabular data;

    wherein the sample comprises a portion of the tabular data;

    wherein the test result status indicator is based, at least in part, on the result of executing the one or more instructions including determining:

    (a) whether the sample contains a correct number of columns according to a schema for the transformed data,
    (b) whether data in each column of the sample adheres to a data type of the column as specified in a schema for the transformed data,
    (c) whether data in each column of the sample improperly contains NULL values according to a schema for the transformed data, or
    any combination of (a), (b), or (c); and

    the fault detection system causing display of a second graphical user interface that visibly presents at least the test result status indicator.
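
The three schema checks recited in the claim can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `Column`, `run_schema_test`, and the setting keys `check_column_count`, `check_types`, and `check_nulls` are hypothetical, the sample is assumed to be a list of row tuples, and the settings dictionary stands in for the configuration points that the claim says are supplied through a graphical user interface.

```python
from dataclasses import dataclass


@dataclass
class Column:
    """One column of a hypothetical schema for the transformed data."""
    name: str
    dtype: type
    nullable: bool = True


def run_schema_test(sample, schema, settings):
    """Run checks (a), (b), and (c) on a sample of tabular rows.

    `settings` plays the role of the plugin's configuration points; each
    check is enabled unless its key is explicitly set to False.
    Returns a (status, messages) pair, where status is "PASS" or "FAIL".
    """
    messages = []
    for i, row in enumerate(sample):
        # (a) correct number of columns according to the schema
        if settings.get("check_column_count", True) and len(row) != len(schema):
            messages.append(
                f"row {i}: expected {len(schema)} columns, got {len(row)}"
            )
            continue  # per-column checks are not meaningful for this row
        for col, value in zip(schema, row):
            if value is None:
                # (c) NULL where the schema does not allow it
                if settings.get("check_nulls", True) and not col.nullable:
                    messages.append(f"row {i}: column '{col.name}' is NULL")
                continue
            # (b) value adheres to the column's declared data type
            # (note: isinstance treats bool as an int in Python)
            if settings.get("check_types", True) and not isinstance(value, col.dtype):
                messages.append(
                    f"row {i}: column '{col.name}' expected "
                    f"{col.dtype.__name__}, got {type(value).__name__}"
                )
    # The status string stands in for the test result status indicator.
    status = "FAIL" if messages else "PASS"
    return status, messages


# Usage: a two-column schema and a sample with one fault of each kind.
schema = [Column("id", int, nullable=False), Column("name", str)]
sample = [(1, "a", "extra"), (None, "b"), ("x", "c")]
status, messages = run_schema_test(sample, schema, {})
```

In this sketch, disabling a check through the settings (for example `{"check_nulls": False}`) changes which faults are reported without changing the test's instructions, which is one plausible reading of how settings for configuration points parameterize a plugin.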

  • 8 Assignments