Data warehouse compatibility

  • US 9,460,188 B2
  • Filed: 06/03/2013
  • Issued: 10/04/2016
  • Est. Priority Date: 06/03/2013
  • Status: Active Grant
First Claim

1. A system for establishing compatibility between an open-source data warehouse and a proprietary data warehouse, the system comprising:

    a distribution processing module which receives a set of unprocessed, raw data points;

    a distributed file system which stores the set of data points according to a first data format;

    a first data warehouse which executes an extract, transform, and load (ETL) operation on the set of data points, the ETL operation comprising an extract process which selects from the set of data points a subset of data points for transformation and loading into a second data warehouse that uses a second data format; and

    a workflow scheduler which schedules execution of the ETL operation according to a workflow implemented as a directed acyclic graph comprising a plurality of action nodes that each specify an action to perform for the ETL operation wherein the plurality of action nodes comprise at least one action node that specifies a map and reduce process to perform on the subset of data points;

    wherein the ETL operation further comprises one or more transformation processes, each transformation process comprising transforming individual data points in the subset of data points to obtain a set of transformed data points having the second data format;

    wherein the one or more transformation processes comprise a group rank process, the group rank process comprising ranking a data subset that comprises a plurality of data points from the set of data points, the ranking based on a plurality of metrics, and the group rank process further comprising storing the data subset in a ranked data table comprising a first column corresponding to a first metric of the plurality of metrics, a second column corresponding to a second metric of the plurality of metrics, and a third column corresponding to a rank; and

    wherein the ETL operation further comprises a load process comprising loading the set of transformed data points into the second data warehouse.
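As an informal illustration only, and not the patented implementation, the claimed flow can be sketched in a few lines of Python: a workflow expressed as a directed acyclic graph of action nodes (extract, group rank, load), with the group rank step producing a table whose columns are a first metric, a second metric, and a rank. Everything named here is hypothetical: the field names `group`, `clicks`, `revenue`, and `valid`, the in-memory list standing in for the second data warehouse, and the reading of "group rank" as ranking within each group. The sketch adds a leading group-key column for readability and uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+) as a stand-in for the workflow scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical raw data points; all field names are illustrative assumptions.
RAW = [
    {"group": "a", "clicks": 10, "revenue": 5.0, "valid": True},
    {"group": "a", "clicks": 10, "revenue": 7.5, "valid": True},
    {"group": "a", "clicks": 3, "revenue": 9.0, "valid": True},
    {"group": "b", "clicks": 8, "revenue": 1.0, "valid": True},
    {"group": "b", "clicks": 2, "revenue": 4.0, "valid": False},
]


def extract(state):
    # Extract process: select the subset of data points to transform and load.
    state["subset"] = [p for p in state["raw"] if p["valid"]]


def group_rank(state):
    # Transformation: rank each group's points by two metrics, highest first
    # (clicks, then revenue as the tie-breaker).
    groups = {}
    for p in state["subset"]:
        groups.setdefault(p["group"], []).append(p)
    table = []
    for name in sorted(groups):
        ordered = sorted(groups[name], key=lambda p: (-p["clicks"], -p["revenue"]))
        for rank, p in enumerate(ordered, start=1):
            # Row layout: group key, first metric, second metric, rank.
            table.append((name, p["clicks"], p["revenue"], rank))
    state["ranked"] = table


def load(state):
    # Load process: append the ranked rows to the stand-in target warehouse.
    state["warehouse"].extend(state["ranked"])


# Action nodes of the workflow, and each node's predecessors in the DAG.
ACTIONS = {"extract": extract, "group_rank": group_rank, "load": load}
DEPENDS_ON = {"extract": set(), "group_rank": {"extract"}, "load": {"group_rank"}}


def run_workflow(raw):
    # Run every action node in an order that respects the DAG's dependencies.
    state = {"raw": raw, "warehouse": []}
    for node in TopologicalSorter(DEPENDS_ON).static_order():
        ACTIONS[node](state)
    return state["warehouse"]


ranked_table = run_workflow(RAW)
for row in ranked_table:
    print(row)
```

In a real deployment the action nodes would be scheduled jobs (for example, a map and reduce job) rather than in-process functions; the sketch only shows how the DAG ordering and the ranked-table layout fit together.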
