APPARATUS, SYSTEMS, AND METHODS FOR BATCH AND REALTIME DATA PROCESSING
Abstract
A traditional data processing system is configured to process input data either in batch or in real-time. On one hand, a batch data processing system is limiting because the batch data processing often cannot take into account any data received during the batch data processing. On the other hand, a real-time data processing system is limiting because the real-time system often cannot scale. The real-time data processing system is often limited to dealing with primitive data types and/or a small amount of data. Therefore, it is desirable to address the limitations of the batch data processing system and the real-time data processing system by combining the benefits of the batch data processing system and the real-time data processing system into a single data processing system.
23 Claims
1. A computing system for generating a summary data of a set of data, the computing system comprising:
one or more processors configured to run one or more modules stored in non-tangible computer readable medium, wherein the one or more modules are operable to:
receive a first set of data and a second set of data, wherein the first set of data comprises a larger number of data items compared to the second set of data;
process the first set of data to format the first set of data into a first structured set of data;
generate a first summary data using the first structured set of data by operating rules for summarizing the first structured set of data, and store the first summary data in a data store;
process the second set of data to format the second set of data into a second structured set of data;
generate a second summary data based on the first structured set of data and the second structured set of data by operating rules for summarizing the first structured set of data and the second structured set of data;
determine a difference between the first summary data and the second summary data; and
update the data store based on the difference between the first summary data and the second summary data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A method for generating a summary data of a set of data, the method comprising:
receiving, at an input module operating on a processor of a computing system, a first set of data and a second set of data, wherein the first set of data comprises a larger number of data items compared to the second set of data;
processing, at a first input processing module of the computing system, the first set of data to format the first set of data into a first structured set of data;
generating, at a first summary generation module of the computing system, a first summary data using the first structured set of data by operating rules for summarizing the first structured set of data;
maintaining the first summary data in a data store in the computing system;
processing, at a second input processing module of the computing system, the second set of data to format the second set of data into a second structured set of data;
generating, at a second summary generation module of the computing system, a second summary data using the first structured set of data and the second structured set of data by operating rules for summarizing the first structured set of data and the second structured set of data;
determining, at a difference generation module of the computing system, a difference between the first summary data and the second summary data; and
updating, by the computing system, the data store based on the difference between the first summary data and the second summary data.
Dependent claims: 15, 16, 17, 18, 19
20. A computer program product, tangibly embodied in a non-transitory computer-readable storage medium, the computer program product including instructions operable to cause a data processing system to:
receive a first set of data and a second set of data, wherein the first set of data comprises a larger number of data items compared to the second set of data;
process the first set of data to format the first set of data into a first structured set of data;
generate a first summary data using the first structured set of data by operating rules for summarizing the first structured set of data, and store the first summary data in a data store;
process the second set of data to format the second set of data into a second structured set of data;
generate a second summary data using the first structured set of data and the second structured set of data by operating rules for summarizing the first structured set of data and the second structured set of data;
determine a difference between the first summary data and the second summary data; and
update the data store based on the difference between the first summary data and the second summary data.
Dependent claims: 21, 22, 23
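The flow recited in the independent claims (format a large batch set, summarize it and store the result, format a smaller set, summarize both sets together, then update the store from the difference) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the `"key,value"` input format, the per-key sum used as the summarization rule, the in-memory dict used as the data store, and the function names `structure`, `summarize`, `diff_summaries`, and `run`.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, object]
Summary = Dict[str, int]


def structure(raw_records: Iterable[str]) -> List[Record]:
    """Format raw 'key,value' lines into structured records (assumed input format)."""
    structured = []
    for line in raw_records:
        key, value = line.strip().split(",")
        structured.append({"key": key.strip(), "value": int(value)})
    return structured


def summarize(records: Iterable[Record]) -> Summary:
    """Hypothetical summarization rule: sum the values for each key."""
    totals: Summary = {}
    for record in records:
        key = str(record["key"])
        totals[key] = totals.get(key, 0) + int(record["value"])
    return totals


def diff_summaries(first: Summary, second: Summary) -> Summary:
    """Return the entries of `second` that are new or changed relative to `first`."""
    return {k: v for k, v in second.items() if first.get(k) != v}


def run(batch_raw: Iterable[str], realtime_raw: Iterable[str]) -> Tuple[Summary, Summary]:
    data_store: Summary = {}  # stand-in for the claimed data store

    # Larger first set: format, summarize, and store the first summary.
    first_structured = structure(batch_raw)
    first_summary = summarize(first_structured)
    data_store.update(first_summary)

    # Smaller second set: format, then summarize it together with the first set.
    second_structured = structure(realtime_raw)
    second_summary = summarize(first_structured + second_structured)

    # Only the difference between the two summaries is written to the store.
    delta = diff_summaries(first_summary, second_summary)
    data_store.update(delta)
    return data_store, delta


if __name__ == "__main__":
    store, delta = run(["a,1", "b,2", "a,3"], ["b,5", "c,7"])
    print(store)  # {'a': 4, 'b': 7, 'c': 7}
    print(delta)  # {'b': 7, 'c': 7}
```

In this sketch the difference only carries new or changed keys, so the entry for `a` is never rewritten; a real system might also need to handle deletions, which the sketch does not attempt.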
Specification