Data unit test-based data management system
Abstract
An improved unit test framework that validates large datasets generated by a data management system is described herein. Typical unit test frameworks validate functions. However, the improved unit test framework validates the underlying data. For example, after each step of a data transformation process implemented by the data management system, the data management system can execute a data unit test that loads data sets into memory, checks a set of preconditions, and applies unit test logic to the loaded data sets. In some embodiments, the data management system executes the data unit tests asynchronously with the data transformation processes, and the data unit tests therefore do not interfere with the data transformation processes. Rather, the data management system generates and transmits a notification when any step of the data transformation process fails a particular data unit test.
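The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `run_pipeline` and `_check`, the use of a thread pool, and the sample steps and tests are all assumptions made for illustration. Each step's output is copied and tested in the background, and a failing test only produces a notification rather than stopping the transformation.

```python
from concurrent.futures import ThreadPoolExecutor


def _check(step_name, rows, test, notify):
    """Apply one data unit test to a snapshot and notify on failure (hypothetical helper)."""
    if test is not None and not test(rows):
        notify(f"data set from step '{step_name}' failed its data unit test")


def run_pipeline(steps, tests, notify, data):
    """Run each transformation step; test each step's output asynchronously."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        for step_name, step in steps:
            data = step(data)
            snapshot = list(data)  # copy so later steps cannot change what is tested
            pool.submit(_check, step_name, snapshot, tests.get(step_name), notify)
    # Leaving the `with` block waits for all submitted tests to finish.
    return data


# Illustrative two-step transformation with one data unit test per step.
steps = [
    ("parse", lambda rows: [int(r) for r in rows]),
    ("double", lambda rows: [2 * r for r in rows]),
]
tests = {
    "parse": lambda rows: all(r >= 0 for r in rows),  # no negative values expected
    "double": lambda rows: len(rows) == 3,            # row count is preserved
}
messages = []
result = run_pipeline(steps, tests, messages.append, ["1", "-2", "3"])
```

In this run the "parse" output contains a negative value, so its test fails and a notification is recorded, while the "double" step still runs to completion.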
16 Claims
1. A method for a data unit test framework, the method comprising:
applying first data unit test instructions to a first data set generated as a first step in a data transformation process to test the first data set, wherein applying first data unit test instructions further comprises:
defining, according to the first data unit test instructions, a first variable for one or more elements of the first data set;
evaluating, according to the first data unit test instructions, that a precondition of a first value for the first variable is satisfied; and
executing a matcher on the first value;
determining, according to the matcher, that the first value fails a first condition in relation to an expected value for the first variable;
in response to determining that the first value fails the first condition, transmitting a first notification indicating that the first data set failed a first data unit test; and
applying second data unit test instructions to a second data set generated as a second step in the data transformation process to test the second data set, wherein applying second data unit test instructions further comprises:
defining, according to the second data unit test instructions, a second variable for one or more elements of the second data set; and
determining, according to the second data unit test instructions, that a plurality of second values for the second variable fails a second condition in relation to an expected threshold for the plurality of second values; and
in response to determining that the plurality of second values for the second variable fails the second condition, transmitting a second notification indicating that the second data set failed a second data unit test,
wherein the method is performed by one or more computer hardware processors.
Dependent claims: 2, 3, 4, 5, 6, 7
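As a rough illustration of the elements recited in claim 1 (variable, precondition, matcher, expected value, notification), the following Python sketch may help. The class `DataUnitTest`, its field names, and the sample data sets are hypothetical and are not taken from the specification. The second test reuses the same class to compare a plurality of values against an expected threshold.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class DataUnitTest:
    """Hypothetical container for one data unit test."""
    name: str
    variable: Callable[[list], Any]       # derives the variable from elements of the data set
    precondition: Callable[[Any], bool]   # must hold before the matcher is executed
    matcher: Callable[[Any, Any], bool]   # compares the value with the expected value
    expected: Any

    def run(self, data_set: list, notify: Callable[[str], None]) -> Optional[bool]:
        value = self.variable(data_set)
        if not self.precondition(value):
            return None  # assumption: an unmet precondition skips the test
        passed = self.matcher(value, self.expected)
        if not passed:
            notify(f"{self.name}: data set failed (got {value!r}, expected {self.expected!r})")
        return passed


notifications = []

# First step output: test a single value (the row count) against an expected value.
first_data_set = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": -5.0}]
row_count_test = DataUnitTest(
    name="row_count",
    variable=len,
    precondition=lambda n: n > 0,
    matcher=lambda value, expected: value == expected,
    expected=3,
)
first_result = row_count_test.run(first_data_set, notifications.append)

# Second step output: test a plurality of values against an expected threshold
# (here, the maximum number of negative amounts allowed).
second_data_set = [{"id": 1, "amount": 20.0}, {"id": 2, "amount": -10.0}]
negative_test = DataUnitTest(
    name="negative_amounts",
    variable=lambda rows: [row["amount"] for row in rows],
    precondition=lambda values: len(values) > 0,
    matcher=lambda values, limit: sum(v < 0 for v in values) <= limit,
    expected=0,
)
second_result = negative_test.run(second_data_set, notifications.append)
```

Both tests fail in this example, so two notifications are collected, one per data set.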
8. A system comprising:
at least one processor; and
a storage device configured to store computer-executable instructions, the computer-executable instructions, when executed by the at least one processor, cause the system to at least:
define a first variable for one or more elements of a first data set generated as a first step in a data transformation process to test the first data set, wherein the first variable corresponds to a plurality of data values of the first data set;
filter the plurality of data values based on a first precondition to obtain a subset of the plurality of data values;
execute a matcher on the subset of the plurality of data values;
determine that a threshold percentage of the subset of the plurality of data values satisfy a first condition;
define a second variable for one or more elements of a second data set generated as a second step in the data transformation process to test the second data set, wherein the second variable corresponds to a second value of the second data set;
determine that the second value does not satisfy a second condition; and
in response to the determination that the second value does not satisfy the second condition, generate and transmit a first notification indicating that the second data set failed a second data unit test.
Dependent claims: 9, 10, 11, 12, 13, 14, 15
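The filter, matcher, and threshold-percentage steps recited in claim 8 could look roughly like the Python sketch below. The function names, the sample values, and the choice to treat an empty filtered subset as a failure are assumptions for illustration only.

```python
def fraction_matching(values, precondition, matcher):
    """Filter values by the precondition, then return the share that the matcher accepts."""
    subset = [v for v in values if precondition(v)]
    if not subset:
        return None  # nothing left to test after filtering
    return sum(1 for v in subset if matcher(v)) / len(subset)


def passes_threshold(values, precondition, matcher, threshold):
    """True when at least `threshold` (0.0 to 1.0) of the filtered subset satisfies the matcher."""
    fraction = fraction_matching(values, precondition, matcher)
    # Assumption: an empty subset counts as not passing.
    return fraction is not None and fraction >= threshold


# Illustrative column of values: missing entries are filtered out by the precondition.
emails = ["a@example.com", None, "bad", "b@example.org", "c@example.net"]
share = fraction_matching(emails, lambda v: v is not None, lambda v: "@" in v)
ok_at_70 = passes_threshold(emails, lambda v: v is not None, lambda v: "@" in v, 0.70)
ok_at_90 = passes_threshold(emails, lambda v: v is not None, lambda v: "@" in v, 0.90)
```

Here four values survive the filter and three of them satisfy the matcher, so the data set passes a 70% threshold but not a 90% threshold.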
16. One or more non-transitory, computer-readable storage media storing computer-executable instructions, which if performed by one or more processors, cause the one or more processors to at least:
define a first variable for one or more elements of a first data set generated as a first step in a data transformation process to test the first data set, wherein the first variable corresponds to a plurality of data values of the first data set;
filter the plurality of data values based on a first precondition to obtain a subset of the plurality of data values;
execute a matcher on the subset of the plurality of data values;
determine that a threshold percentage of the subset of the plurality of data values satisfy a first condition;
define a second variable for one or more elements of a second data set generated as a second step in the data transformation process to test the second data set, wherein the second variable corresponds to a second value of the second data set;
determine that the second value does not satisfy a second condition; and
generate and transmit a first notification indicating that the second data set failed a second data unit test.
Specification