System and method for integrating and accessing multiple data sources within a data warehouse architecture
First Claim
1. A method for maintaining a data warehouse, comprising:
- identifying a data source of interest;
- updating a metadata to reflect information available from said source, wherein said metadata comprises domain specific knowledge obtained by analyzing said data source, and wherein said metadata represents at least one abstract concept, at least one database description, at least one transformation and at least one mapping;
- automatically generating a mediator based on said metadata, wherein said mediator comprises data management code, wherein said code defines a translation library and a mediator class; and
- writing a wrapper for said source which calls said mediator, wherein said method is applied to data warehousing applications in the domain of functional genomics and proteomics.
Abstract
A system and method are disclosed for integrating and accessing multiple data sources within a data warehouse architecture. The metadata formed by the present method provide a way to declaratively present domain specific knowledge, obtained by analyzing data sources, in a consistent and usable way. Four types of information are represented by the metadata: abstract concepts, databases, transformations and mappings. A mediator generator automatically generates data management computer code based on the metadata. The resulting code defines a translation library and a mediator class. The translation library provides a data representation for domain specific knowledge represented in a data warehouse, including “get” and “set” methods for attributes; the “get” methods call transformation methods to derive the value of an attribute if it is missing. The mediator class defines methods that take “distinguished” high-level objects as input, traverse their data structures, and enter information into the data warehouse.
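As an illustration of the two generated pieces described in the abstract, the following is a minimal hand-written sketch, not the patented implementation. It assumes a hypothetical "protein" concept that contains "chain" records; the names `Record`, `Mediator`, `SCHEMA`, `DERIVATIONS` and `derive_length`, the sample values, and the in-memory dictionary standing in for the warehouse are all assumptions made for the example.

```python
# Hypothetical transformation method: derive a chain's length from its sequence.
def derive_length(record):
    sequence = record.get("sequence")
    return None if sequence is None else len(sequence)


# Assumed attribute lists per concept, and the transformations that can
# fill in a missing attribute. Neither comes from the patent text.
SCHEMA = {"protein": ["accession"], "chain": ["sequence", "length"]}
DERIVATIONS = {"chain": {"length": derive_length}}


class Record:
    """Translation-library style data representation with get/set methods."""

    def __init__(self, concept, **values):
        self.concept = concept
        self.children = []          # nested records reachable from this one
        self._values = dict(values)

    def set(self, name, value):
        self._values[name] = value

    def get(self, name):
        value = self._values.get(name)
        if value is None:
            # Attribute is missing: call a transformation method, if one exists.
            derive = DERIVATIONS.get(self.concept, {}).get(name)
            if derive is not None:
                value = derive(self)
                self._values[name] = value
        return value


class Mediator:
    """Takes a high-level object, traverses it, and enters rows into a warehouse."""

    def __init__(self):
        self.warehouse = {}         # concept name -> list of rows (stand-in only)

    def enter(self, record):
        row = {name: record.get(name) for name in SCHEMA[record.concept]}
        self.warehouse.setdefault(record.concept, []).append(row)
        for child in record.children:
            self.enter(child)


# Usage with made-up sample values.
protein = Record("protein", accession="X0001")
chain = Record("chain", sequence="MKTA")
protein.children.append(chain)

mediator = Mediator()
mediator.enter(protein)
```

In this sketch the chain's `length` is never set explicitly; the `get` call made while the mediator builds the row derives it from the sequence.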
28 Claims
1. A method for maintaining a data warehouse, comprising:
- identifying a data source of interest;
- updating a metadata to reflect information available from said source, wherein said metadata comprises domain specific knowledge obtained by analyzing said data source, and wherein said metadata represents at least one abstract concept, at least one database description, at least one transformation and at least one mapping;
- automatically generating a mediator based on said metadata, wherein said mediator comprises data management code, wherein said code defines a translation library and a mediator class; and
- writing a wrapper for said source which calls said mediator, wherein said method is applied to data warehousing applications in the domain of functional genomics and proteomics.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. A method for maintaining a data warehouse, comprising:
- identifying a data source of interest;
- updating a metadata to reflect information available from said source, wherein said metadata comprises domain specific knowledge obtained by analyzing said data source, and wherein said metadata represents at least one abstract concept, at least one database description, at least one transformation and at least one mapping;
- automatically generating a mediator based on said metadata, wherein said mediator comprises data management code, wherein said code defines a translation library and a mediator class; and
- writing a wrapper for said source which calls said mediator, wherein said method is applied to data warehousing applications in the domain of protein sequence and structure analysis.
17. A computer-useable medium embodying computer program code for maintaining a data warehouse by executing the steps of:
- identifying a data source of interest;
- updating a metadata to reflect information available from said source, wherein said metadata comprises domain specific knowledge obtained by analyzing said data source, and wherein said metadata represents at least one abstract concept, at least one database description, at least one transformation and at least one mapping;
- automatically generating a mediator based on said metadata, wherein said mediator comprises data management code, wherein said code defines a translation library and a mediator class; and
- writing a wrapper for said source which calls said mediator, wherein said method is applied to data warehousing applications in the domain of functional genomics and proteomics.
Dependent claims: 18, 19, 20, 21, 22, 23, 24, 25, 26
27. A computer-useable medium embodying computer program code for maintaining a data warehouse by executing the steps of:
- identifying a data source of interest;
- updating a metadata to reflect information available from said source, wherein said metadata comprises domain specific knowledge obtained by analyzing said data source, and wherein said metadata represents at least one abstract concept, at least one database description, at least one transformation and at least one mapping;
- automatically generating a mediator based on said metadata, wherein said mediator comprises data management code, wherein said code defines a translation library and a mediator class; and
- writing a wrapper for said source which calls said mediator, wherein said method is applied to data warehousing applications in the domain of protein sequence and structure analysis.
28. A method for maintaining a data warehouse, comprising:
- identifying a data source of interest;
- updating a metadata to reflect information available from said source, wherein said metadata comprises domain specific knowledge obtained by analyzing said data source, and wherein said metadata represents at least one abstract concept, at least one database description, at least one transformation and at least one mapping;
- automatically generating a mediator based on said metadata, wherein said mediator comprises data management code, wherein said code defines a translation library and a mediator class; and
- writing a wrapper for said source which calls said mediator, wherein said method is applied to data warehousing applications in the domain of astrophysics and climate modeling.
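The four steps recited in the independent claims (identify a source, update the metadata, generate a mediator, write a wrapper) can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the patented generator: the metadata layout, the source name `example_source`, its `ID` and `SEQ` fields, the function names, and the use of generated Python source text loaded with `exec` are all hypothetical choices made for the example.

```python
# Steps 1 and 2 (assumed layout): metadata describing one identified source.
# It holds the four kinds of information named in the claims.
METADATA = {
    # abstract concepts and their attributes
    "concepts": {"protein": ["accession", "sequence", "length"]},
    # database descriptions
    "databases": {"example_source": {"format": "flat-file"}},
    # transformations: expression used when an attribute is missing
    "transformations": {
        ("protein", "length"):
            "None if self._sequence is None else len(self._sequence)",
    },
    # mappings: source field -> (concept, attribute)
    "mappings": {
        "example_source": {"ID": ("protein", "accession"),
                           "SEQ": ("protein", "sequence")},
    },
}


def generate_mediator_source(metadata):
    """Step 3: emit source code for a translation library and a mediator class."""
    lines = []
    for concept, attributes in metadata["concepts"].items():
        lines.append(f"class {concept.capitalize()}:")
        lines.append("    def __init__(self):")
        for attr in attributes:
            lines.append(f"        self._{attr} = None")
        for attr in attributes:
            lines.append(f"    def set_{attr}(self, value):")
            lines.append(f"        self._{attr} = value")
            lines.append(f"    def get_{attr}(self):")
            expr = metadata["transformations"].get((concept, attr))
            if expr:
                # Generated getter derives the value when it is missing.
                lines.append(f"        if self._{attr} is None:")
                lines.append(f"            self._{attr} = {expr}")
            lines.append(f"        return self._{attr}")
    lines.append("class Mediator:")
    lines.append("    def __init__(self, warehouse):")
    lines.append("        self.warehouse = warehouse")
    for concept, attributes in metadata["concepts"].items():
        row = ", ".join(f"'{a}': obj.get_{a}()" for a in attributes)
        lines.append(f"    def enter_{concept}(self, obj):")
        lines.append(
            f"        self.warehouse.setdefault('{concept}', []).append({{{row}}})"
        )
    return "\n".join(lines) + "\n"


def wrap_example_source(raw_records, namespace, warehouse):
    """Step 4: hand-written wrapper that parses the source and calls the mediator."""
    mapping = METADATA["mappings"]["example_source"]
    mediator = namespace["Mediator"](warehouse)
    for raw in raw_records:
        obj = namespace["Protein"]()
        for source_field, (_concept, attr) in mapping.items():
            getattr(obj, f"set_{attr}")(raw[source_field])
        mediator.enter_protein(obj)


# Load the generated code, then run the wrapper on one made-up record.
namespace = {}
exec(generate_mediator_source(METADATA), namespace)
warehouse = {}
wrap_example_source([{"ID": "X0001", "SEQ": "MKTA"}], namespace, warehouse)
```

Under this arrangement, adding a new source mainly means extending the metadata and writing a new wrapper; the translation library and mediator class are regenerated rather than edited by hand.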
Specification