Normalizing ingested data sets based on fuzzy comparisons to known data sets

  • US 9,529,863 B1
  • Filed: 12/21/2015
  • Issued: 12/27/2016
  • Est. Priority Date: 12/21/2015
  • Status: Active Grant
First Claim

1. A method for ingesting data for a data model using a network computer that employs one or more processors to execute instructions that perform actions, comprising:

    providing one or more raw data sets to an ingestion engine, wherein each raw data set includes one or more raw records;

    providing one or more ingestion rules associated with one or more confidence scores and one or more known data sets based on a type of the one or more raw records;

    employing the ingestion engine to iteratively execute the one or more ingestion rules, performing further actions, including:

    providing a comparison of one or more portions of the one or more raw records to the one or more known data sets;

    transforming contents of the one or more raw records into one or more model record values based on the comparison to the one or more known data sets;

    storing the one or more model record values in one or more model records;

    providing a score value that indicates a confidence level that the one or more model records are correct based on the one or more confidence scores; and

    storing an association of the one or more ingestion rules used to transform the raw record contents into the model record values stored in the one or more model records; and

    when the score value that indicates the confidence level of the one or more model records is less than a threshold value, performing further actions, including:

    providing a user-interface to interactively edit the one or more raw records or the one or more ingestion rules, wherein the edited one or more ingestion rules produce an increase change or a decrease change in the one or more confidence scores, wherein the one or more changed confidence scores are employed to provide the score value; and

    storing the one or more model records in a data store, wherein the one or more model records are added to the data model.
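
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`IngestionRule`, `ModelRecord`, `ingest`, the `threshold` default) is hypothetical, the standard-library `difflib.SequenceMatcher` stands in for whatever fuzzy comparison is actually used, and combining rule confidence with match similarity by taking the minimum product is an assumption, since the claim does not say how the score value is computed. The interactive editing step is reduced to a `needs_review` flag.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class IngestionRule:
    """Hypothetical rule: normalizes one field against a known data set."""
    name: str
    field_name: str      # raw-record field this rule reads
    known_values: list   # known data set to compare against
    confidence: float    # confidence score attached to the rule (0..1)

    def apply(self, raw_record):
        """Return (closest known value, similarity ratio) for the raw field."""
        raw = str(raw_record.get(self.field_name, "")).strip().lower()
        best, best_ratio = None, 0.0
        for known in self.known_values:
            ratio = SequenceMatcher(None, raw, known.lower()).ratio()
            if ratio > best_ratio:
                best, best_ratio = known, ratio
        return best, best_ratio


@dataclass
class ModelRecord:
    values: dict = field(default_factory=dict)
    rules_used: list = field(default_factory=list)  # association of rules to values
    score: float = 1.0


def ingest(raw_record, rules, threshold=0.8):
    """Run each rule in turn and flag the record if its score is below threshold."""
    record = ModelRecord()
    for rule in rules:
        value, similarity = rule.apply(raw_record)
        record.values[rule.field_name] = value
        record.rules_used.append(rule.name)
        # Assumed scoring: the weakest (rule confidence x similarity) wins.
        record.score = min(record.score, rule.confidence * similarity)
    needs_review = record.score < threshold
    return record, needs_review


# Example: a misspelled state name is normalized against a known list.
rules = [
    IngestionRule("state-name", "state",
                  ["Washington", "Oregon", "California"], confidence=0.95),
]
record, needs_review = ingest({"state": "Washingtn"}, rules)
unmatched, unmatched_review = ingest({"state": "xyz"}, rules)
```

In this sketch a record whose score falls below the threshold would be routed to a person who edits the raw record or adjusts a rule's `confidence`, after which `ingest` is run again; only records that pass would be written to the data store.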

  • 2 Assignments