System for distributed data processing with auto-recovery
First Claim
1. A data processing system for distributed data processing, the data processing system comprising:
a memory device with computer-readable program code stored thereon;
a communication device;
a processing device operatively coupled to the memory device and the communication device, wherein the processing device is configured to execute the computer-readable program code to:
access a master queue of data processing work comprising a plurality of data processing jobs stored in a long term memory cache;
select at least one of the plurality of data processing jobs from the master queue of data processing work;
divide the at least one data processing job into a plurality of data processing items;
allocate each of the plurality of data processing items to a different one of a distributed network comprising a plurality of distributed user systems to ensure maximum efficiency in processing the at least one data processing job;
actively synchronize some or all the plurality of data processing items among the plurality of distributed user systems, the actively synchronizing comprising:
repeatedly or periodically saving results of the data processing; and
processing the data processing items at a smallest block level allowed by each of the distributed user systems, thereby maximizing efficiency of automatic recovery of completed data processing work.
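The claimed steps (divide a job into items, allocate each item to a different user system, save results repeatedly, and work at the smallest block each system allows) can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the patented implementation: `Worker`, `divide`, `allocate`, `process_item`, the in-memory `checkpoint` dictionary, the `fail_after` crash switch, and the doubling "work" are all hypothetical names and stand-ins that do not appear in the claim.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical stand-in for one distributed user system."""
    name: str
    block_size: int  # smallest block of records this system processes at once


def divide(job, n_items):
    """Split a job (a list of records) into at most n_items contiguous items."""
    size = max(1, -(-len(job) // n_items))  # ceiling division, at least 1
    return [job[i:i + size] for i in range(0, len(job), size)]


def allocate(items, workers):
    """Give each item to a different worker (assumes enough workers exist)."""
    if len(items) > len(workers):
        raise ValueError("need at least one worker per item")
    return {item_id: workers[item_id] for item_id in range(len(items))}


def process_item(item_id, item, worker, checkpoint, fail_after=None):
    """Process one item block by block, saving results after every block.

    checkpoint maps item_id -> list of saved results and stands in for a
    persistent store. fail_after simulates a crash once that many blocks
    have been processed during this call.
    """
    done = checkpoint.setdefault(item_id, [])
    blocks = 0
    while len(done) < len(item):
        if fail_after is not None and blocks == fail_after:
            raise RuntimeError(f"{worker.name} failed")
        start = len(done)  # resume right after the last saved result
        block = item[start:start + worker.block_size]
        done.extend(record * 2 for record in block)  # placeholder "work"
        blocks += 1
    return done


job = list(range(10))
workers = [Worker("A", block_size=2), Worker("B", block_size=1)]
items = divide(job, len(workers))
assignment = allocate(items, workers)
checkpoint = {}

try:
    # Worker A crashes after finishing one block of item 0.
    process_item(0, items[0], assignment[0], checkpoint, fail_after=1)
except RuntimeError:
    pass

# Recovery: the completed block is kept, only the remainder is redone.
process_item(0, items[0], assignment[0], checkpoint)
process_item(1, items[1], assignment[1], checkpoint)
```

In this sketch a smaller `block_size` means less work is lost when a system fails, which is the trade-off the final claim element points at: finer-grained saves make recovery cheaper at the cost of more frequent writes.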
Abstract
Embodiments enable distributed data processing with automatic caching at multiple system levels by accessing a master queue of data processing work comprising a plurality of data processing jobs stored in a long term memory cache; selecting at least one of the plurality of data processing jobs from the master queue of data processing work; pushing the selected data processing jobs to an interface layer including (i) accessing the selected data processing jobs from the long term memory cache; and (ii) saving the selected data processing jobs in an interface layer cache of data processing work; and pushing at least a portion of the selected data processing jobs to a memory cache of a first user system for minimizing latency in user data processing of the pushed data processing jobs.
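The multi-level caching described in the abstract can be pictured as jobs moving from the long term memory cache to an interface layer cache and then to a user system's memory cache. The sketch below is only one possible reading: the class `TieredQueue`, its method names, and the choice to move jobs between levels (rather than copy them) are assumptions made for illustration, not details stated in the abstract.

```python
from collections import deque


class TieredQueue:
    """Hypothetical three-level cache: long term, interface layer, user system."""

    def __init__(self, jobs):
        self.long_term = deque(jobs)  # master queue in the long term memory cache
        self.interface = deque()      # interface layer cache
        self.user = deque()           # memory cache of one user system

    @staticmethod
    def _move(source, target, count):
        """Move up to count jobs from source to target; return how many moved."""
        moved = 0
        while source and moved < count:
            target.append(source.popleft())
            moved += 1
        return moved

    def push_to_interface(self, count):
        """Select jobs from the master queue and save them in the interface layer."""
        return self._move(self.long_term, self.interface, count)

    def push_to_user(self, count):
        """Push a portion of the interface-layer jobs to the user system's cache."""
        return self._move(self.interface, self.user, count)

    def next_job(self):
        """Serve the user from the nearest cache; None if it is empty."""
        return self.user.popleft() if self.user else None


queue = TieredQueue(["job-1", "job-2", "job-3", "job-4"])
queue.push_to_interface(3)  # three jobs now sit in the interface layer
queue.push_to_user(2)       # two of them are staged next to the user
first = queue.next_job()    # served from the user system's own cache
```

Staging work close to the user ahead of time is what keeps `next_job` from reaching back to the long term store, which is the latency benefit the abstract describes.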
14 Claims
1. A data processing system for distributed data processing, the data processing system comprising:
a memory device with computer-readable program code stored thereon;
a communication device;
a processing device operatively coupled to the memory device and the communication device, wherein the processing device is configured to execute the computer-readable program code to:
access a master queue of data processing work comprising a plurality of data processing jobs stored in a long term memory cache;
select at least one of the plurality of data processing jobs from the master queue of data processing work;
divide the at least one data processing job into a plurality of data processing items;
allocate each of the plurality of data processing items to a different one of a distributed network comprising a plurality of distributed user systems to ensure maximum efficiency in processing the at least one data processing job;
actively synchronize some or all the plurality of data processing items among the plurality of distributed user systems, the actively synchronizing comprising:
repeatedly or periodically saving results of the data processing; and
processing the data processing items at a smallest block level allowed by each of the distributed user systems, thereby maximizing efficiency of automatic recovery of completed data processing work.
Dependent claims: 2, 3, 4, 5, 6
7. A computer program product for distributed data processing, the computer program product comprising at least one non-transitory computer-readable medium having computer-readable program code portions embodied therein, the computer-readable program code portions comprising:
an executable portion configured for accessing a master queue of data processing work comprising a plurality of data processing jobs stored in a long term memory cache;
an executable portion configured for selecting at least one of the plurality of data processing jobs from the master queue of data processing work;
an executable portion configured for dividing the at least one data processing job into a plurality of data processing items;
an executable portion configured for allocating each of the plurality of data processing items to a different one of a distributed network comprising a plurality of distributed user systems to ensure maximum efficiency in processing the at least one data processing job;
an executable portion configured for actively synchronizing some or all the plurality of data processing items among the plurality of distributed user systems, the actively synchronizing comprising:
repeatedly or periodically saving results of the data processing; and
processing the data processing items at a smallest block level allowed by each of the distributed user systems, thereby maximizing efficiency of automatic recovery of completed data processing work.
Dependent claims: 8, 9, 10, 11, 12
13. A computer-implemented method for distributed data processing, the method comprising:
accessing a master queue of data processing work comprising a plurality of data processing jobs stored in a long term memory cache;
selecting at least one of the plurality of data processing jobs from the master queue of data processing work;
dividing the at least one data processing job into a plurality of data processing items;
allocating each of the plurality of data processing items to a different one of a distributed network comprising a plurality of distributed user systems to ensure maximum efficiency in processing the at least one data processing job; and
actively synchronizing some or all the plurality of data processing items among the plurality of distributed user systems, the actively synchronizing comprising:
repeatedly or periodically saving results of the data processing; and
processing the data processing items at a smallest block level allowed by each of the distributed user systems, thereby maximizing efficiency of automatic recovery of completed data processing work.
Dependent claims: 14
Specification