Component model for batch computing in a distributed object environment
Abstract
A batch component model is provided within a distributed object environment. The batch component is designed to capture the iterative logic of a batch program as it reads from one or more input streams, invokes operations on other business component functions, and generates output to one or more output streams. Deployment descriptors express declarative policies for the component that will influence how the component is managed including the streams it uses, business components it depends on, how processing costs are accounted, and the resource demands the job will put on the system. Input streams and output streams are encapsulated in objects that hide the actual source of input and output data so that the component can be redeployed in different execution environments to different physical data sources without requiring the program to be changed. A batch container enforces the deployment policies declared for the batch component.
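The division of labor described in the abstract can be illustrated with a minimal Python sketch. Everything named here (`DeploymentDescriptor`, `BatchContainer`, `UppercaseOrders`, and every field name) is a hypothetical stand-in and not the patent's implementation. The sketch only assumes that the component sees logical stream names, while the container binds them to physical sources and applies the declared policies.

```python
from dataclasses import dataclass, field
from itertools import islice
from typing import Callable, Dict, Iterable, List


@dataclass
class DeploymentDescriptor:
    """Declarative policies for one batch component (field names are assumptions)."""
    input_streams: List[str]
    output_streams: List[str]
    depends_on: List[str] = field(default_factory=list)
    accounting_code: str = "unassigned"   # how processing cost is attributed
    max_records: int = 1000               # stand-in for a declared resource demand


class UppercaseOrders:
    """Batch component: iterative logic only, no knowledge of physical sources."""

    def process(self, inputs, outputs, services):
        normalize = services["normalize"]
        for record in inputs["orders"]:
            outputs["report"].append(normalize(record))


class BatchContainer:
    """Binds logical streams to physical sources and enforces the descriptor."""

    def __init__(self, sources: Dict[str, Iterable[str]], services: Dict[str, Callable]):
        self.sources = sources
        self.services = services
        self.usage: Dict[str, int] = {}

    def run(self, component, descriptor: DeploymentDescriptor):
        # Only declared streams and dependencies are made visible to the component.
        inputs = {name: islice(iter(self.sources[name]), descriptor.max_records)
                  for name in descriptor.input_streams}
        outputs = {name: [] for name in descriptor.output_streams}
        services = {name: self.services[name] for name in descriptor.depends_on}
        component.process(inputs, outputs, services)
        # Account the records written against the declared accounting code.
        written = sum(len(records) for records in outputs.values())
        code = descriptor.accounting_code
        self.usage[code] = self.usage.get(code, 0) + written
        return outputs


descriptor = DeploymentDescriptor(
    input_streams=["orders"], output_streams=["report"],
    depends_on=["normalize"], accounting_code="dept-42", max_records=2)

# The same component is deployed against two different physical sources.
in_memory = BatchContainer({"orders": ["a1", "b2", "c3"]}, {"normalize": str.upper})
from_generator = BatchContainer({"orders": (f"x{i}" for i in range(5))},
                                {"normalize": str.upper})

result_a = in_memory.run(UppercaseOrders(), descriptor)
result_b = from_generator.run(UppercaseOrders(), descriptor)
```

In this sketch the component code is identical in both runs; only the container's bindings change, which is the redeployment property the abstract describes.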
59 Claims
1. A computer-implemented method of batch processing in a batch component model within a distributed object environment, the computer-implemented method comprising:
instantiating a batch component by a processor of a data processing system, for use with a batch job within the distributed object environment;
initializing the batch component with a set of deployment descriptors and an instance of a batch container to form a contractual relationship between the batch component and the batch container, wherein the set of deployment descriptors is a set of declarative policies for the batch component;
wrapping the contractual relationship between the batch component and the batch container to form an adapter, wherein the adapter isolates the batch component from different implementations of the batch container;
dynamically computing by the batch container, for each use of a checkpoint interval, a size of the checkpoint interval for the batch job based on the set of deployment descriptors and other processing workloads;
managing operation of the batch component in the batch component model by the batch container in accordance with the set of deployment descriptors and the other processing workloads; and
committing, by the batch container on the processor, at an end of the checkpoint interval, checkpoint cursors and data of the batch job that are updated during the batch processing to a storage of the data processing system, wherein context information, including the size of the checkpoint interval and resource dependencies, is persisted and passed to downstream batch containers.
Dependent claims: 2-11
12. A computer-implemented method of batch processing with a batch container of a batch component model in a distributed object environment, the computer-implemented method comprising:
forming a contractual relationship between a batch component and the batch container;
wrapping the contractual relationship between the batch component and the batch container to form an adapter, wherein the adapter isolates the batch component from different implementations of the batch container;
computing, by the batch container on a processor of a data processing system, for each use of a checkpoint interval, a size of the checkpoint interval for a batch component of a batch job based on a set of deployment descriptors and other processing workloads, wherein the set of deployment descriptors is a set of declarative policies for the batch component;
running the batch component within the checkpoint interval, by the processor; and
upon completion of the checkpoint interval, committing by the batch container, data updated during the checkpoint interval and at least one checkpoint cursor to a storage of the data processing system, wherein context information, including the size of the checkpoint interval and resource dependencies, is persisted and passed to downstream batch containers.
Dependent claims: 13-21
22. A computer-implemented method of batch processing a batch job in a batch component model using a batch container, the computer-implemented method comprising:
dividing, by the batch container on a processor of a data processing system, an input stream into a plurality of batch job partitions of the input stream, to form a contractual relationship between each batch component and the batch container;
wrapping the contractual relationship between each batch component and the batch container to form an adapter, wherein the adapter isolates each batch component from different implementations of the batch container;
computing, by the batch container, on the processor, for each use of a checkpoint interval, a size of the checkpoint interval for a batch component of a batch job using a set of deployment descriptors and other processing workloads, wherein the set of deployment descriptors is a set of declarative policies for the batch component, and wherein context information, including the size of the checkpoint interval and resource dependencies, is persisted and passed to downstream batch containers;
running, through a distribution of a scheduler, a plurality of instances of a batch component, by the batch container, within the checkpoint interval, wherein each instance of the batch component within the plurality of instances of the batch component operates on a respective batch job partition of the input stream to produce a separate output stream; and
assembling the separate output streams produced by each respective batch job partition.
Dependent claims: 23-25
26. An apparatus for batch processing in a batch component model within a distributed object environment, the apparatus comprising:
a system bus;
a local memory in communication with the system bus, wherein the local memory contains computer executable instructions;
a processor connected to the system bus, wherein the processor executes the computer executable instructions to:
instantiate a batch component on a processor of a data processing system, for use with a batch job within the distributed object;
initialize the batch component with a set of deployment descriptors and an instance of a batch container environment to form a contractual relationship between the batch component and the batch container, wherein the set of deployment descriptors is a set of declarative policies for the batch component;
wrap the contractual relationship between the batch component and the batch container to form an adapter, wherein the adapter isolates the batch component from different implementations of the batch container;
dynamically compute by the batch container a size of a checkpoint interval for the batch job using the set of deployment descriptors and other processing workloads, for each use of the checkpoint interval;
manage operation of the batch component in the batch component model within the distributed object environment in accordance with the set of deployment descriptors and the other processing workloads by the batch container; and
commit by the batch container checkpoint cursors and data of the batch job that are updated during the batch processing to a storage of the data processing system on completion of the checkpoint interval, wherein context information, including the size of the checkpoint interval and resource dependencies is persisted and passed to downstream batch containers.
Dependent claims: 27-56
57. A computer program product, for batch processing in a batch component model within a distributed object environment, the computer program product comprising:
a computer readable recordable type media having computer executable instructions stored thereon, the computer executable instructions comprising:
computer executable instructions for instantiating a batch component for use with a batch job within the distributed object environment;
computer executable instructions for initializing the batch component with a set of deployment descriptors and an instance of a batch container to form a contractual relationship between the batch component and the batch container, wherein the set of deployment descriptors is a set of declarative policies for the batch component;
computer executable instructions for wrapping the contractual relationship between the batch component and the batch container to form an adapter, wherein the adapter isolates the batch component from different implementations of the batch container;
computer executable instructions for dynamically computing, by the batch container, a size of a checkpoint interval for the batch job using the set of deployment descriptors and other processing workloads, for each use of the checkpoint interval;
computer executable instructions for managing operation of the batch component within the distributed object environment by the batch container in accordance with the set of deployment descriptors and the other processing workloads; and
computer executable instructions for committing by the batch container on completion of the checkpoint interval, checkpoint cursors and data of the batch job that are updated during the batch processing to a storage of a data processing system, wherein context information including the size of the checkpoint interval and resource dependencies is persisted and passed to downstream batch containers.
Dependent claims: 58, 59
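The checkpointing and partitioning steps recited in the independent claims can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the claimed implementation: the names `CheckpointPolicy`, `CheckpointStore`, `interval_size`, `run_job`, and `run_partitioned` are hypothetical, the linear sizing formula is an assumption (the claims do not give one), and a thread pool stands in for the scheduler that distributes component instances.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CheckpointPolicy:
    """Hypothetical descriptor fields bounding the checkpoint interval (in records)."""
    min_records: int = 5
    max_records: int = 15


@dataclass
class CheckpointStore:
    """Stands in for durable storage holding cursor, data and context."""
    cursor: int = 0
    data: List[int] = field(default_factory=list)
    context: Dict[str, object] = field(default_factory=dict)

    def commit(self, cursor, pending, context):
        # Cursor, updated data and context are committed together.
        self.data.extend(pending)
        self.cursor = cursor
        self.context = dict(context)


def interval_size(policy, other_load):
    """Assumed formula: shrink the interval linearly as competing load (0.0-1.0) rises."""
    load = min(max(other_load, 0.0), 1.0)
    span = policy.max_records - policy.min_records
    return policy.min_records + int(span * (1.0 - load))


def run_job(records, transform, policy, store, load_probe, resources):
    """Process records from the last committed cursor; return the interval sizes used."""
    sizes = []
    cursor = store.cursor
    while cursor < len(records):
        size = interval_size(policy, load_probe())  # recomputed for every interval
        chunk = records[cursor:cursor + size]
        pending = [transform(r) for r in chunk]
        cursor += len(chunk)
        # Context a downstream container could read after this job.
        store.commit(cursor, pending, {"checkpoint_interval": size,
                                       "resource_dependencies": resources})
        sizes.append(size)
    return sizes


def run_partitioned(records, transform, partitions):
    """Split the input, run one worker per partition, reassemble outputs in order."""
    size = max(1, -(-len(records) // partitions))  # ceiling division
    parts = [records[i:i + size] for i in range(0, len(records), size)]
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        outputs = list(pool.map(lambda part: [transform(r) for r in part], parts))
    return [r for out in outputs for r in out]


records = list(range(25))
loads = iter([0.0, 0.5, 1.0])            # simulated competing workload samples
store = CheckpointStore()
sizes = run_job(records, lambda r: r * 2, CheckpointPolicy(), store,
                lambda: next(loads), ["orders_db"])
assembled = run_partitioned(records, lambda r: r * 2, partitions=4)
```

With the simulated load samples above, the first interval covers 15 records and the second covers the remaining 10, so the interval size differs from one use to the next. A restart would resume from `store.cursor` rather than from the first record.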
Specification