
Populating a data warehouse using a pipeline approach

  • US 6,721,749 B1
  • Filed: 07/06/2000
  • Issued: 04/13/2004
  • Est. Priority Date: 07/06/2000
  • Status: Active Grant
First Claim

1. A data collection and warehousing system comprising one or more computers that implement a processing pipeline, wherein the pipeline receives individual log files from a plurality of different servers on a periodic basis and passes the received files through a sequence of operations, the operations comprising:

    parsing the log files to generate, for each data file, (a) a fact file containing one or more primary key IDs and metrics for eventual use in a data warehouse, and (b) a dimension file containing one or more primary key IDs and strings for eventual use in the data warehouse;

    parsing the fact files to generate, for each fact file, a plurality of fact tables corresponding to different fact tables of the data warehouse, each fact table containing one or more primary key IDs and corresponding metrics;

    parsing the dimension files to generate, for each dimension file, a plurality of dimension tables corresponding to different dimension tables of the data warehouse, each dimension table containing one or more primary key IDs and dimension strings;

    merging tables corresponding to the same data warehouse table to generate fact and dimension tables that each correspond to a single one of the data warehouse tables; and

    loading the merged tables into the data warehouse tables.
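
The five operations recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything concrete here is an assumption made for illustration: the comma-separated log format (`page,referrer,hits`), the use of CRC32 as the primary key ID, the table names, the helper function names, and the use of nested dictionaries in place of a real data warehouse.

```python
import zlib
from collections import defaultdict


def key_id(text):
    """Derive a stable integer ID from a string (CRC32 is an assumed scheme)."""
    return zlib.crc32(text.encode("utf-8"))


def parse_log(lines):
    """Operation 1: one log file -> (fact file, dimension file)."""
    facts, dims = [], []
    for line in lines:
        page, referrer, hits = line.strip().split(",")
        # Fact rows carry only key IDs and metrics.
        facts.append((key_id(page), key_id(referrer), int(hits)))
        # Dimension rows carry key IDs and their strings.
        dims.append(("page", key_id(page), page))
        dims.append(("referrer", key_id(referrer), referrer))
    return facts, dims


def split_facts(facts):
    """Operation 2: one fact file -> several fact tables (key ID -> metric)."""
    tables = {"page_hits": defaultdict(int), "referrer_hits": defaultdict(int)}
    for page_id, referrer_id, hits in facts:
        tables["page_hits"][page_id] += hits
        tables["referrer_hits"][referrer_id] += hits
    return tables


def split_dims(dims):
    """Operation 3: one dimension file -> several dimension tables (key ID -> string)."""
    tables = {"page": {}, "referrer": {}}
    for kind, kid, text in dims:
        tables[kind][kid] = text
    return tables


def merge(per_file_tables, combine):
    """Operation 4: merge per-file tables that target the same warehouse table."""
    merged = {}
    for tables in per_file_tables:
        for name, rows in tables.items():
            target = merged.setdefault(name, {})
            for kid, value in rows.items():
                target[kid] = combine(target[kid], value) if kid in target else value
    return merged


def run_pipeline(log_files, warehouse):
    """Pass a batch of log files through all five operations."""
    parsed = [parse_log(lines) for lines in log_files]
    fact_tables = merge([split_facts(f) for f, _ in parsed], lambda a, b: a + b)
    dim_tables = merge([split_dims(d) for _, d in parsed], lambda a, b: a)
    # Operation 5: load the merged tables (here, into a plain dict;
    # existing rows with the same key are simply overwritten).
    for name, rows in {**fact_tables, **dim_tables}.items():
        warehouse.setdefault(name, {}).update(rows)
    return warehouse


# Two hypothetical log files, as if received from two different servers.
logs = [["/home,google,3", "/about,google,1"], ["/home,bing,2"]]
warehouse = run_pipeline(logs, {})
print(warehouse["page_hits"][key_id("/home")])  # 5
```

Merging before loading means each warehouse table is written once per batch rather than once per log file, which is the apparent motivation for the merge operation in the claim.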
