Method and apparatus for loading data into a database in a multiprocessor environment
Abstract
This invention provides a method and apparatus for loading data having a predetermined order of data records from a source into a database using parallel processing. Using one or more reading agents, data records are read from the source in which they are stored. These data records are stored in groups of records. Each group of stored records is tagged with a sequence identifier corresponding to the predetermined order of the data. Apparatus is provided for formatting the data records in the groups using a plurality of formatting agents in parallel. The formatted records are stored in formatted record groups. The sequence identifier of each group is written to the respective formatted record group. A record identification apparatus is provided for assigning page locations for the records using the sequence identifiers to maintain the predetermined order.
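The flow described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the function names (`read_groups`, `format_group`, `load`), the use of a thread pool as the "formatting agents", the placeholder formatting step, and the fixed number of records per page are all hypothetical choices made for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def read_groups(records, group_size):
    """Reading agent: yield (sequence_id, group) pairs in source order."""
    for seq, start in enumerate(range(0, len(records), group_size)):
        yield seq, records[start:start + group_size]


def format_group(seq, group):
    """Formatting agent: convert a group and carry its sequence id along.

    strip().upper() is only a placeholder for a real storage-format conversion.
    """
    return seq, [rec.strip().upper() for rec in group]


def load(records, group_size=2, records_per_page=3, workers=4):
    """Return (page, slot, record) triples in the original source order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(format_group, seq, group)
                   for seq, group in read_groups(records, group_size)]
        # Groups may finish formatting in any order.
        formatted = [f.result() for f in as_completed(futures)]

    # The sequence identifiers restore the predetermined order.
    formatted.sort(key=lambda item: item[0])
    ordered = [rec for _, group in formatted for rec in group]

    # Assumed fixed-capacity pages: record i goes to page i // n, slot i % n.
    return [divmod(i, records_per_page) + (rec,) for i, rec in enumerate(ordered)]
```

Because each formatted group keeps the sequence identifier of the group it came from, the final page assignment does not depend on which formatting agent finished first.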
74 Citations
13 Claims
1. A method of loading data having a predetermined order of data records from a source into a database using parallel processing comprising:
using one or more reading agents, reading data records from the source in which said data records are stored;
storing said records in groups of records;
tagging each group with a sequence identifier corresponding to said predetermined order of data;
formatting the data records in said groups using a plurality of formatting agents in parallel;
storing said formatted records in formatted record groups;
writing said sequence identifier of each said group to the respective formatted record group;
assigning page locations for said records using said sequence identifiers to maintain said predetermined order.
Dependent claims: 2, 3, 4.
5. For a multiprocessor data processing system capable of parallel processing, a method of loading data having a predetermined order of data records into a database using parallel processing comprising:
using one or more reading agents, reading data records from the source in which said data records are stored;
storing said records in memory buffers;
tagging each buffer with a sequence identifier corresponding to said predetermined order of data;
transferring control of each said buffer to the control of one of a plurality of formatting agents;
converting said records of data into a suitable database storage format;
storing said formatted records in formatted record buffers;
writing said sequence identifier of each said buffer to the respective said formatted record buffer;
forwarding said formatted record buffers to a record identifier agent;
assigning a page location for each record in said formatted record buffers using said sequence identifiers to maintain said predetermined order.
Dependent claims: 6, 7, 8.
9. For a parallel processing system, a method of loading data records having a pre-selected order from a source location, in parallel, into a database, while maintaining said order, comprising:
reading a record from said source location;
storing said record in memory buffers of predetermined size;
tagging each buffer with a sequence number corresponding to said pre-selected order of said record in its source location;
transferring control of said buffers to a plurality of formatter agents which operate in parallel;
converting each record to a suitable database storage format;
storing said formatted records in formatted record buffers;
writing a sequence identification number of each said buffer to said corresponding formatted record buffer;
transferring control of said formatted record buffers to a RIDer agent;
for each formatted record buffer being received in sequence assigning a page location to each record;
transferring control of said formatted data buffer to a plurality of writer agents;
assembling formatted records into pages;
writing said pages to said assigned storage locations.
Dependent claims: 10.
11. Apparatus for loading data records having a pre-selected order using a parallel data processing system to load said data into the data storage of a database using said data processing system while maintaining said pre-selected data order, comprising:
at least one reader agent to read said data;
buffer storage means for storing said data in buffers;
tagging means for tagging each buffer containing said data with a sequence number corresponding to the pre-selected order of the data;
a plurality of formatter agents for converting said data records read by said at least one reader agent to a format suitable for said database;
means for storing said formatted records in formatted record buffers, with the sequence numbers of the buffers containing said data;
a RIDer agent for assigning page locations for each said formatted data records;
means for assembling said formatted records into pages;
a plurality of writer agents for writing each said pages to said assigned storage locations.
12. A computer program product comprising:
a computer usable medium having computer readable program code means embodied therein for causing a computer to load user data, the computer program product comprising;
computer readable program code means for causing a computer to effect apparatus for loading data records having a pre-selected order using a parallel data processing system to load said data into the data storage of a database using said data processing system while maintaining said pre-selected data order, comprising;
computer readable program code means for causing a computer to effect at least one reader agent to read said data;
computer readable program code means for causing a computer to effect buffer storage means for storing said data in buffers;
computer readable program code means for causing a computer to effect tagging means for tagging each buffer containing said data with a sequence number corresponding to the pre-selected order of the data;
computer readable program code means for causing a computer to effect a plurality of formatter agents for converting said data records read by said at least one reader agent to a format suitable for said database;
computer readable program code means for causing a computer to effect means for storing said formatted records in formatted record buffers, with the sequence numbers of the buffers containing said data;
computer readable program code means for causing a computer to effect a (RIDer) agent for assigning page locations for each said formatted data records;
computer readable program code means for causing a computer to effect means for assembling said formatted records into pages, and;
computer readable program code means for causing a computer to effect a plurality of writer agents for writing each said pages to said assigned storage locations.
13. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for loading data having a predetermined order of data records from a source into a database using parallel processing said method steps comprising:
using one or more reading agents, reading data records from the source in which they are stored;
storing said records in groups of records;
tagging each group with a sequence identifier corresponding to said predetermined order of data;
formatting the data records in said groups using a plurality of formatting agents in parallel;
storing said formatted records in formatted record groups;
writing said sequence identifier of each said group to the respective formatted record group;
assigning page locations for said records using said sequence identifiers to maintain said predetermined order.
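Several of the claims above recite a record identifier (RIDer) agent that assigns page locations to formatted buffers taken in sequence, followed by assembly of the records into pages. One way this step might look is sketched below in Python. Everything in it is an assumption made for illustration: the class name `Rider`, the min-heap used to hold buffers that arrive early, the byte-based page capacity, and the helper `assemble_pages` do not come from the claims.

```python
import heapq


class Rider:
    """Hypothetical record-identifier agent.

    Accepts formatted buffers in any arrival order, but assigns page
    locations strictly in sequence-number order.
    """

    def __init__(self, page_size):
        self.page_size = page_size  # assumed page capacity in bytes
        self.next_seq = 0           # next sequence number allowed through
        self.pending = []           # min-heap of (seq, records) that arrived early
        self.page = 0               # page currently being filled
        self.used = 0               # bytes already assigned on that page
        self.slot = 0               # next free slot on that page
        self.assigned = []          # (page, slot, record) in source order

    def receive(self, seq, records):
        """Take one formatted buffer; release every buffer that is now in order."""
        heapq.heappush(self.pending, (seq, records))
        while self.pending and self.pending[0][0] == self.next_seq:
            _, ready = heapq.heappop(self.pending)
            for rec in ready:
                self._place(rec)
            self.next_seq += 1

    def _place(self, rec):
        # Start a new page when the record does not fit on the current one.
        if self.used > 0 and self.used + len(rec) > self.page_size:
            self.page += 1
            self.used = 0
            self.slot = 0
        self.assigned.append((self.page, self.slot, rec))
        self.used += len(rec)
        self.slot += 1


def assemble_pages(assigned):
    """Concatenate records page by page, as a writer agent might before writing."""
    pages = {}
    for page, _slot, rec in assigned:
        pages[page] = pages.get(page, b"") + rec
    return pages
```

In this sketch a buffer tagged with sequence number 1 that arrives before buffer 0 simply waits in the heap, so the page locations come out in source order however the formatter agents interleave. Once locations are fixed, the pages are independent of one another and could be handed to several writer agents in parallel.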
Specification