Business information warehouse toolkit and language for warehousing simplification and automation
Abstract
A method for use with an information (or data) warehouse comprises managing the information warehouse with instructions in a declarative language. The instructions specify information warehouse-level tasks to be done without specifying certain details of how the tasks are to be implemented, for example, using databases and text indexers. The details are hidden from the user and include, for example, in an information warehouse having a FACT table that joins two or more dimension tables, details of database level operations when structured data are being handled, including database command line utilities, database drivers, and structured query language (SQL) statements; and details of text-indexing engines when unstructured data are being handled. The information warehouse is managed in a dynamic way in which different tasks—such as data loading tasks and information warehouse construction tasks—may be interleaved (i.e., there is no particular order in which the different tasks must be completed).
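As a rough illustration of the idea that a declarative instruction can hide the underlying SQL, the following Python sketch translates a made-up `CREATE DIMENSION` statement into a table definition and runs it against an in-memory SQLite database. The statement syntax, the `dim_` table prefix, and the function names are assumptions for illustration only; the source does not give the grammar of the information warehouse language.

```python
import re
import sqlite3

# Hypothetical statement form (not taken from the patent):
#   CREATE DIMENSION <name> (<column> <type>, ...)
_CREATE_DIM = re.compile(
    r"CREATE\s+DIMENSION\s+(\w+)\s*\((.*)\)\s*$",
    re.IGNORECASE | re.DOTALL,
)


def dimension_to_sql(statement: str) -> str:
    """Translate one hypothetical dimension-level command into SQL DDL."""
    match = _CREATE_DIM.match(statement.strip())
    if match is None:
        raise ValueError(f"unsupported statement: {statement!r}")
    name, columns = match.group(1), match.group(2).strip()
    # The user never writes this SQL; it is generated on their behalf.
    return f"CREATE TABLE dim_{name} (id INTEGER PRIMARY KEY, {columns})"


def run_script(conn: sqlite3.Connection, script: str) -> None:
    """Execute every semicolon-separated statement of a script."""
    for statement in script.split(";"):
        if statement.strip():
            conn.execute(dimension_to_sql(statement))
```

A script such as `CREATE DIMENSION product (name TEXT, category TEXT); CREATE DIMENSION region (name TEXT)` would create two tables without the user writing any SQL. Text indexing for unstructured data is not modeled here.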
26 Claims
1. A method for information warehouse construction in memory of a hardware computer processor, comprising:
accepting input from a user at a front-end graphical user interface;
generating a plurality of commands employing declarative statements in an information warehouse language and performing loading or copying of data using:
information warehouse (IW) level commands that when implemented provide support for both structured and unstructured data,
indexing engine level commands configured to perform tasks specified by the commands from a user interface,
source mapping commands configured to perform at least one of: a data format transformation, source-to-target complex schema matching, and complex data value mapping,
data loading commands configured to perform at least one of: a data format transformation, source-to-target complex schema matching, and complex data value mapping, and
dimension level commands that when implemented provide support for both structured and unstructured data and include at least one of: creating a dimension, altering the dimension, and dropping the dimension; and
constructing an information warehouse based on a script comprising the IW level commands and the dimension level commands according to the language specification for information warehouse construction; and
providing error correction for data loaded from a source file into the information warehouse using:
an abort command that uses a pending log to return the information warehouse to a consistent break point prior to a data load into the information warehouse so that operational details of maintaining data integrity for both the structured and the unstructured data are hidden from a user of the abort command, such that a load command automatically resumes loading data into the information warehouse from a last consistent breakpoint of a data load, while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the load command;
an undo command, operating with regard to the source file, that uses a checkpoint log to automatically return the information warehouse to a state of the information warehouse prior to loading of the source file, while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the undo command; and
a redo command, operating with regard to the source file, that uses the checkpoint log to automatically return the information warehouse to a state of the information warehouse after loading of the source file, wherein the loading of the source file has previously been undone, and while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the redo command.
Dependent claims: 2, 3, 4
5. A method for data loading in an information warehouse stored in memory of a hardware computer processor, comprising:
accepting input from a user at a front-end graphical user interface;
generating a plurality of commands employing declarative statements in an information warehouse language and performing loading or copying of the data using:
information warehouse (IW) level commands that when implemented provide support for both structured and unstructured data,
indexing engine level commands,
dimension level commands that when implemented include at least one of: creating a dimension, altering a dimension, and dropping the dimension, and provide support for both structured and unstructured data,
source mapping commands configured to perform at least one of: performing a data format transformation, performing source-to-target complex schema matching, and performing complex data value matching, and
data loading commands configured to perform at least one of: performing a data format transformation, performing source-to-target complex schema matching, and performing complex data value matching;
loading data into an information warehouse based on a script including both the source mapping commands and the data loading commands according to the language specification for data loading; and
providing error correction for data loaded from a source file into the information warehouse according to at least one of: the information warehouse level commands, the dimension level commands, the source mapping commands and the data loading commands,
an abort command that uses a pending log to return the information warehouse to a consistent break point prior to a data load into the information warehouse so that operational details of maintaining data integrity for both the structured and the unstructured data are hidden from a user of the abort command, such that a load command automatically resumes loading data into the information warehouse from a last consistent breakpoint of a data load, while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the load command;
an undo command, operating with regard to the source file, that uses a checkpoint log to automatically return the information warehouse to a state of the information warehouse prior to loading of the source file, while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the undo command; and
a redo command, operating with regard to the source file, that uses the checkpoint log to automatically return the information warehouse to a state of the information warehouse after loading of the source file, wherein the loading of the source file has previously been undone, and while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the redo command.
Dependent claims: 6, 7
8. A method for data maintenance of an information warehouse stored in memory of a hardware computer processor, comprising:
providing a language specification for information warehouse (IW) maintenance in the information warehouse;
accepting input from a user at a front-end graphical user interface;
generating a plurality of commands employing declarative statements in an information warehouse language and performing loading or copying by:
implementing indexing engine level commands configured to perform tasks specified by the commands from an interface;
implementing dimension level commands that when implemented provide support for both structured and unstructured data and include at least one of: creating a dimension, altering the dimension, and dropping the dimension;
implementing IW level commands that when implemented provide support for both structured and unstructured data;
implementing source mapping commands configured to perform at least one of: a data transformation, source-to-target complex schema matching, and complex data value mapping; and
implementing data loading commands configured to perform at least one of: a data format transformation, source-to-target complex schema matching, and complex data value mapping; and
performing an IW maintenance operation on the information warehouse based on a script including at least one of failure recovery commands, error correction commands according to the language specification for IW maintenance,
an abort command that uses a pending log to return the information warehouse to a consistent break point prior to a data load into the information warehouse so that operational details of maintaining data integrity for both structured and unstructured data are hidden from a user of the abort command, such that a load command automatically resumes loading data into the information warehouse from a last consistent breakpoint of a data load, while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the load command,
an undo command, operating with regard to a source file, that uses a checkpoint log to automatically return the information warehouse to a state of the information warehouse prior to loading of the source file, while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the undo command, and
a redo command, operating with regard to the source file, that uses the checkpoint log to automatically return the information warehouse to a state of the information warehouse after loading of the source file, wherein the loading of the source file has previously been undone, and while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the redo command.
Dependent claims: 9
10. An information warehouse system comprising:
a hardware computer processor including memory;
a front-end graphical user interface that accepts inputs from a user and generates a plurality of commands employing declarative statements in an information warehouse language;
an information warehouse language processor stored on the memory that receives a plurality of user interface commands from the user interface and performs loading using at least one of:
system level commands;
database level commands that when implemented provide support for both structured and unstructured data;
indexing engine level commands configured to perform tasks specified by the commands from the interface;
source mapping commands configured to perform at least one of: a data transformation, source-to-target complex schema matching, and complex data value mapping;
data loading commands configured to perform at least one of: a data format transformation, source-to-target complex schema matching, and complex data value mapping; and
dimension level commands that when implemented provide support for both structured and unstructured data and include at least one of: creating a dimension, altering the dimension, and dropping the dimension;
a warehouse builder that:
receives the at least one of: the system level commands, the database level commands, and the indexing engine level commands;
performs tasks specified by the at least one of system level, database level, and indexing engine level commands;
builds an information warehouse; and
provides error correction for data loaded from a source file into the information warehouse using:
an abort command that uses a pending log to return the information warehouse to a consistent break point prior to a data load into the information warehouse so that operational details of maintaining data integrity for both the structured and the unstructured data are hidden from a user of the abort command, such that a load command automatically resumes loading data into the information warehouse from a last consistent breakpoint of a data load, while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the load command;
an undo command, operating with regard to the source file, that uses a checkpoint log to automatically return the information warehouse to a state of the information warehouse prior to loading of the source file, while maintaining data integrity for both structured and unstructured data by performing system level tasks that are hidden from a user of the undo command; and
a redo command, operating with regard to the source file, that uses the checkpoint log to automatically return the information warehouse to a state of the information warehouse after loading of the source file, wherein the loading of the source file has previously been undone, and while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the redo command.
Dependent claims: 11, 12, 13, 14, 15, 16
17. A computer program product for use with an information warehouse, the computer program product comprising a non-transitory computer readable medium including a computer executable program, wherein the computer executable program when executed on a computer causes the computer to:
accept commands of an information warehouse language using declarative statements provided by a script;
parse the commands from the script;
generate data description language commands from the parsed commands including information warehouse level commands that when implemented provide support for both structured and unstructured data;
execute the data description language commands to construct an information warehouse;
perform loading or copying of the structured and unstructured data, using:
indexing engine level commands configured to perform tasks specified by commands from an interface;
source mapping commands configured to perform at least one of: a data format transformation, source-to-target complex schema matching, and complex data value mapping;
data loading commands configured to perform at least one of: a data format transformation, source-to-target complex schema matching, and complex data value mapping;
dimension level commands that when implemented include at least one of: creating a dimension, altering the dimension, and dropping the dimension, and provide support for both structured and unstructured data; and
provide error correction for data loaded from a source file into the information warehouse using:
an abort command that uses a pending log to return the information warehouse to a consistent break point prior to a data load into the information warehouse so that operational details of maintaining data integrity for both the structured and the unstructured data are hidden from a user of the abort command, such that a load command automatically resumes loading data into the information warehouse from a last consistent breakpoint of a data load, while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the load command;
an undo command, operating with regard to a source file, that uses a checkpoint log to automatically return the information warehouse to a state of the information warehouse prior to loading of the source file, while maintaining data integrity for both structured and unstructured data by performing system level tasks that are hidden from a user of the undo command; and
a redo command, operating with regard to the source file, that uses the checkpoint log to automatically return the information warehouse to a state of the information warehouse after loading of the source file, wherein the loading of the source file has previously been undone, and while maintaining data integrity for both structured and unstructured data by performing system level tasks that are hidden from a user of the redo command.
Dependent claims: 18, 19, 20, 21
22. A computer program product for use with an information warehouse, the computer program product comprising a non-transitory computer readable medium including a computer executable program, wherein the computer readable program when executed on a computer causes the computer to:
accept commands of an information warehouse language using declarative statements provided by a script;
parse the commands from the script;
extract desired data from a source file according to the parsed commands;
transform the extracted data into a target format according to the parsed commands;
load the transformed data into information warehouse tables, file system locations according to the parsed commands;
perform loading or copying of the transformed data, using:
information warehouse level commands that provide support for both structured and unstructured data,
indexing engine level commands configured to perform tasks specified by the commands from an interface;
source mapping commands configured to perform at least one of: a data format transformation, source-to-target complex schema matching, and complex data value mapping;
data loading commands configured to perform at least one of: a data format transformation, source-to-target complex schema matching, and complex data value mapping; and
dimension level commands that include at least one of: creating a dimension, altering the dimension, and dropping the dimension and provide support for both structured and unstructured data; and
provide error correction for data loaded from a source file into the information warehouse using:
an abort command that uses a pending log to return the information warehouse to a consistent break point prior to a data load into the information warehouse so that operational details of maintaining data integrity for both the structured and the unstructured data are hidden from a user of the abort command, such that a load command automatically resumes loading data into the information warehouse from a last consistent breakpoint of a data load, while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the load command;
an undo command, operating with regard to the source file, that uses a checkpoint log to automatically return the information warehouse to a state of the information warehouse prior to loading of the source file, while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the undo command; and
a redo command, operating with regard to the source file, that uses the checkpoint log to automatically return the information warehouse to a state of the information warehouse after loading of the source file, wherein the loading of the source file has previously been undone, and while maintaining data integrity for both the structured and the unstructured data by performing system level tasks that are hidden from a user of the redo command.
Dependent claims: 23, 24, 25, 26
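The abort, undo, and redo commands recited in the independent claims can be pictured with a small in-memory model. The Python sketch below is only an assumption about how a pending log and a checkpoint log might behave. The class `ToyWarehouse`, its fields, and the `fail_after` switch are hypothetical names introduced here, and the sketch models neither automatic resumption of a load nor unstructured data.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWarehouse:
    """Hypothetical in-memory stand-in for an information warehouse."""
    rows: list = field(default_factory=list)            # (source, record) pairs
    pending_log: list = field(default_factory=list)     # entries of the load in progress
    checkpoint_log: dict = field(default_factory=dict)  # source -> records of a finished load
    undone: set = field(default_factory=set)            # sources whose load was undone

    def load(self, source, records, fail_after=None):
        """Load records; fail_after simulates a crash after that many records."""
        for i, record in enumerate(records):
            if fail_after is not None and i == fail_after:
                raise RuntimeError("simulated failure during load")
            self.pending_log.append((source, record))
            self.rows.append((source, record))
        # Load finished: checkpoint it and clear the pending log.
        self.checkpoint_log[source] = list(records)
        self.pending_log.clear()

    def abort(self):
        """Drop the rows of an interrupted load, as recorded in the pending log."""
        del self.rows[len(self.rows) - len(self.pending_log):]
        self.pending_log.clear()

    def undo(self, source):
        """Return to the state before the given source file was loaded."""
        if source not in self.checkpoint_log or source in self.undone:
            raise KeyError(source)
        self.rows = [row for row in self.rows if row[0] != source]
        self.undone.add(source)

    def redo(self, source):
        """Reapply a load that was previously undone, using the checkpoint log."""
        if source not in self.undone:
            raise KeyError(source)
        self.rows.extend((source, r) for r in self.checkpoint_log[source])
        self.undone.remove(source)
```

In this toy model the caller only names the command and the source file; the bookkeeping that restores a consistent state stays inside the class, which loosely mirrors the claims' requirement that system level tasks be hidden from the user.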
Specification