Method, system, program, and data structure for cleaning a database table
First Claim
1. A computer implemented method for performing a clean operation on an input table having an input table name, comprising:
- receiving user input identifying the input table name and at least one rule definition, wherein each rule definition indicates a find criteria, a replacement value, an input data column in the input table and an optional output table where cleaned data from the input table is to be placed, wherein each rule definition is associated with a rule table including multiple find criteria, a corresponding replacement value for each find criteria, and a sort-key column name for ordering the multiple find criteria;
in response to the user input, generating an Application Program Interface (API) call that includes the identified input table name and the at least one rule definition; and
for each rule definition included in the API call, searching the input data column for any fields that match the find criteria;
determining whether the rule definition included in the API call specifies an output table;
in response to determining that the rule definition in the API call specifies an output table, inserting the replacement value in the specified output table; and
in response to determining that the rule definition does not specify an output table, directly inserting the replacement value in the fields in the input data column that match the find criteria, wherein subsequent applications of additional rule definitions applied to the same input data column operate on replacement values inserted in the input data column in previously applied rule definitions.
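The branch on the optional output table can be illustrated with a minimal Python sketch. This is not the patented implementation: the `RuleDefinition` class, the `clean` function, the in-memory dictionary of tables, and the use of exact equality as the matching test are all assumptions made for illustration, as is the choice to stop at the first matching find criteria for each field.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class RuleDefinition:
    """Hypothetical stand-in for one rule definition carried in the API call."""
    input_column: str
    # Rule table rows as (sort_key, find_criteria, replacement_value).
    rule_table: List[Tuple[int, str, str]]
    # None means "no output table": replacements go into the input column.
    output_table: Optional[str] = None


def clean(tables: Dict[str, List[dict]], input_table: str,
          rules: List[RuleDefinition]) -> None:
    """Apply each rule definition in turn to the named input table."""
    rows = tables[input_table]
    for rule in rules:
        # Order the find criteria by the rule table's sort key.
        ordered = sorted(rule.rule_table, key=lambda entry: entry[0])
        for row in rows:
            for _, find, replacement in ordered:
                # Assumption: a field "matches" when it equals the find value.
                if row[rule.input_column] == find:
                    if rule.output_table is not None:
                        # Output table specified: input data is left untouched.
                        tables.setdefault(rule.output_table, []).append(
                            {rule.input_column: replacement})
                    else:
                        # No output table: overwrite the matching field.
                        row[rule.input_column] = replacement
                    break  # Assumption: first matching criteria wins.
```

In this sketch, a rule with `output_table=None` mutates the input rows, so a later rule on the same column sees the replaced values, while a rule that names an output table appends the replacement there and leaves the input rows as they were.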
Abstract
Disclosed is a method, system, program, and data structure for performing a clean operation on an input table. The input table to clean is indicated in an input data table name. At least one rule definition is processed to clean the input table. Each rule definition indicates a find criteria, a replacement value, and an input data column in the input table. For each rule definition, the input data column is searched for any fields that match the find criteria. The replacement value for the particular rule definition is inserted in the fields in the input data column that match the find criteria. Subsequent applications of additional rule definitions applied to the same input data column operate on replacement values inserted in the input data column during previously applied rule definitions.
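Because each rule operates on the values left by the rules before it, the order in which rules are applied changes the result. The short Python sketch below shows this with a hypothetical `apply_rules` helper; the rule format (a list of find/replacement pairs) and exact-equality matching are assumptions for illustration, not details taken from the patent.

```python
def apply_rules(column, rules):
    """Apply (find, replacement) pairs in order; each rule sees earlier results."""
    values = list(column)  # Work on a copy so the caller's list is unchanged.
    for find, replacement in rules:
        values = [replacement if value == find else value for value in values]
    return values


column = ["St", "Street", "Ave"]

# "St" becomes "Street" first, so the second rule also rewrites it.
first = apply_rules(column, [("St", "Street"), ("Street", "STREET")])

# Reversed order: the original "St" is replaced only after the other rule ran.
second = apply_rules(column, [("Street", "STREET"), ("St", "Street")])

print(first)   # ['STREET', 'STREET', 'Ave']
print(second)  # ['Street', 'STREET', 'Ave']
```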
42 Citations
27 Claims
1. A computer implemented method for performing a clean operation on an input table having an input table name, comprising:
receiving user input identifying the input table name and at least one rule definition, wherein each rule definition indicates a find criteria, a replacement value, an input data column in the input table and an optional output table where cleaned data from the input table is to be placed, wherein each rule definition is associated with a rule table including multiple find criteria, a corresponding replacement value for each find criteria, and a sort-key column name for ordering the multiple find criteria;
in response to the user input, generating an Application Program Interface (API) call that includes the identified input table name and the at least one rule definition; and
for each rule definition included in the API call, searching the input data column for any fields that match the find criteria;
determining whether the rule definition included in the API call specifies an output table;
in response to determining that the rule definition in the API call specifies an output table, inserting the replacement value in the specified output table; and
in response to determining that the rule definition does not specify an output table, directly inserting the replacement value in the fields in the input data column that match the find criteria, wherein subsequent applications of additional rule definitions applied to the same input data column operate on replacement values inserted in the input data column in previously applied rule definitions.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A computer implemented system for performing a clean operation on an input table having an input table name, comprising:
a processor;
a clean transform program, wherein the clean transform program performs:
receiving user input identifying the input table name and at least one rule definition, wherein each rule definition indicates a find criteria, a replacement value, an input data column in the input table and an optional output table where cleaned data from the input table is to be placed, wherein each rule definition is associated with a rule table including multiple find criteria, a corresponding replacement value for each find criteria, and a sort-key column name for ordering the multiple find criteria;
in response to the user input, generating an Application Program Interface (API) call that includes the received input table name and the at least one rule definition; and
for each rule definition included in the API call, searching the input data column for any fields that match the find criteria;
determining whether the rule definition included in the API call specifies an output table;
in response to determining that the rule definition in the API call specifies an output table, inserting the replacement value in the specified output table;
in response to determining that the rule definition does not specify an output table, directly inserting the replacement value in the fields in the input data column that match the find criteria, wherein subsequent applications of additional rule definitions applied to the same input data column operate on replacement values inserted in the input data column in previously applied rule definitions.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
19. An article of manufacture for performing a clean operation on an input table in a database having an input table name, the article of manufacture comprising computer readable media including at least one computer program embedded therein that causes a computer to perform:
receiving user input identifying the input table name and at least one rule definition, wherein each rule definition indicates a find criteria, a replacement value, an input data column in the input table and an optional output table where cleaned data from the input table is to be placed, wherein each rule definition is associated with a rule table including multiple find criteria, a corresponding replacement value for each find criteria, and a sort-key column name for ordering the multiple find criteria;
in response to the user input, generating an Application Program Interface (API) call that includes the identified input table name and the at least one rule definition; and
for each rule definition included in the API call, searching the input data column for any fields that match the find criteria;
determining whether the rule definition included in the API call specifies an output table;
in response to determining that the rule definition in the API call specifies an output table, inserting the replacement value in the specified output table; and
in response to determining that the rule definition does not specify an output table, directly inserting the replacement value in the fields in the input data column that match the find criteria, wherein subsequent applications of additional rule definitions applied to the same input data column operate on replacement values inserted in the input data column in previously applied rule definitions.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27
Specification