System and method of data cleansing using rule based formatting

  • US 8,150,814 B2
  • Filed: 04/07/2009
  • Issued: 04/03/2012
  • Est. Priority Date: 04/07/2009
  • Status: Active Grant
First Claim

1. A computer-implemented method for data cleansing using rule based formatting, comprising:

  • obtaining a first input data from a first data source and a second input data from a second data source, wherein said first input data is tokenized according to a data dictionary, wherein said second input data is tokenized according to said data dictionary;

    parsing, by a rule-based parsing module implemented by a hardware server, said first input data and said second input data using a predefined parsing rule including an option operator, wherein the option operator indicates that a particular index defined in the predefined parsing rule is optional;

    obtaining a formatting rule, wherein said formatting rule includes one or more formatting rule components including at least one conditional format operator, wherein the at least one conditional format operator indicates whether to include a particular string literal in an output data based on the existence of a particular token;

    including a first token in a first output data if a first formatting rule component in the formatting rule is a first valid index to said first tokenized input data, wherein said first token is associated with said first valid index, and including a first string literal in said first output data if said first formatting rule component in the formatting rule is a string literal;

    including a second token in a second output data if said first formatting rule component in the formatting rule is a second valid index to said second tokenized input data, wherein said second token is associated with said second valid index and including a second string literal in said second output data if said first formatting rule component in the formatting rule is the string literal;

    formatting, by a formatting module implemented by the hardware server, said first output data and said second output data according to the formatting rule; and

    outputting said first output data and said second output data having been formatted.
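
The formatting steps of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the class names `Index`, `Literal`, and `Conditional`, the function `apply_formatting_rule`, and the sample name data are all assumptions made for illustration. The sketch starts from input that is already tokenized and does not model the parsing step or the option operator. It applies one formatting rule to two tokenized inputs, including a token when a rule component is a valid index, a string literal when the component is a literal, and a conditional literal only when a particular token exists.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Index:
    """Hypothetical rule component: include the token at this index."""
    position: int


@dataclass(frozen=True)
class Literal:
    """Hypothetical rule component: always include this string literal."""
    text: str


@dataclass(frozen=True)
class Conditional:
    """Hypothetical conditional format operator: include `text` only if
    a token exists at `position`."""
    position: int
    text: str


Component = Union[Index, Literal, Conditional]


def apply_formatting_rule(rule: List[Component], tokens: Dict[int, str]) -> str:
    """Build output data by walking the rule components in order."""
    parts = []
    for component in rule:
        if isinstance(component, Index):
            # Include the token only when the index is valid for this input.
            if component.position in tokens:
                parts.append(tokens[component.position])
        elif isinstance(component, Literal):
            parts.append(component.text)
        elif isinstance(component, Conditional):
            # Include the literal only when the referenced token exists.
            if component.position in tokens:
                parts.append(component.text)
    return "".join(parts)


# One rule, applied to tokenized input from two assumed data sources.
rule = [Index(3), Literal(", "), Index(1), Conditional(2, " "), Index(2)]
first_tokens = {1: "John", 2: "Q", 3: "Smith"}  # all three tokens present
second_tokens = {1: "Jane", 3: "Doe"}           # token at index 2 absent

first_output = apply_formatting_rule(rule, first_tokens)
second_output = apply_formatting_rule(rule, second_tokens)
print(first_output)
print(second_output)
```

In this sketch the conditional component keeps the second output free of a trailing space when the token at index 2 is missing, which is the kind of behavior the conditional format operator in the claim describes.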
