DISTRIBUTED COMPUTING SYSTEM FOR LARGE-SCALE DATA HANDLING
Abstract
A method for processing data on a distributed computing environment is provided. Input data that is to be processed may be stored on an input storage module. Mapper code can be loaded onto a map module and executed. The mapper code can load a mapper executable file onto the map module from a central storage unit and instantiate the mapper executable file. The mapper code can then pass the input data to the mapper executable file. The mapper executable file can generate mapped data based on the input data and pass the mapped data back to the mapper code.
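The flow in the abstract (load the executable from central storage, instantiate it, pass input data in, receive mapped data back) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the central storage module is modeled as a plain directory, the mapper executable file is a tiny Python script named `mapper_exec.py`, and data is exchanged over standard input and output.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the mapper executable file: a tiny script that
# reads lines on stdin and emits "first_field<TAB>1" on stdout.
MAPPER_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    fields = line.split()\n"
    "    if fields:\n"
    "        print(fields[0] + '\\t1')\n"
)


def load_executable(central_storage: Path, name: str, node_dir: Path) -> Path:
    """Copy the named file from central storage onto the local map node."""
    local_copy = node_dir / name
    shutil.copy(central_storage / name, local_copy)
    return local_copy


def run_mapper_code(central_storage: Path, node_dir: Path, split: str) -> list[str]:
    """Wrapper 'mapper code': load, instantiate, pass data in, read data back."""
    executable = load_executable(central_storage, "mapper_exec.py", node_dir)
    # Instantiate the executable as a child process and pass it the split.
    result = subprocess.run(
        [sys.executable, str(executable)],
        input=split,
        capture_output=True,
        text=True,
        check=True,
    )
    # The mapped data returns to the wrapper through the child's stdout.
    return result.stdout.splitlines()


def demo() -> list[str]:
    """Set up a throwaway central store and map node, then run one split."""
    with tempfile.TemporaryDirectory() as tmp:
        central = Path(tmp) / "central"
        node = Path(tmp) / "node"
        central.mkdir()
        node.mkdir()
        (central / "mapper_exec.py").write_text(MAPPER_SOURCE)
        return run_mapper_code(central, node, "US page1\nDE page2\nUS page3\n")


if __name__ == "__main__":
    print(demo())
```

In this sketch the wrapper stays generic: swapping the file kept in central storage changes the mapping behavior without changing the code that is loaded onto the map node.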
23 Claims
1. A system for processing data on a distributed computing environment, the system comprising:
an input data storage module containing input data from a weblog;
a map module in communication with the input data storage module to receive a split of the input data and configured to execute mapper code for manipulating the input data to generate mapped data;
a reduce module in communication with the map module to receive the mapped data, the reduce module being configured to execute reducer code for analyzing the mapped data and generate result data;
a result data storage module in communication with the reduce module to receive the result data from the reduce module;
a master module for coordinating the selection, set-up, and data flow of the map module and the reduce module, the master module loading the mapper code onto the mapper module and the reducer code onto the reducer module; and
a central storage module containing a mapper executable file and a reducer executable file, wherein the mapper code accesses the central storage module and loads the mapper executable file onto the mapper module and the reducer code loads the reducer executable file onto the reducer module.
Dependent claims: 2, 3, 4, 5, 6
7. A method for processing data on a distributed computing environment, the method comprising:
storing input data from a weblog on an input storage module;
loading mapper code onto a map module through a master module;
executing the mapper code on the map module;
loading a mapper executable file onto the map module from a central storage module;
instantiating the mapper executable file on the map module;
retrieving a split of the input data from the input storage module;
passing the input data from the mapper code to the mapper executable file;
manipulating the input data to generate mapped data;
passing the mapped data from the mapper executable file to the mapper code;
loading reducer code onto a reduce module through a master module;
executing the reducer code on the reduce module;
loading a reducer executable file onto the reduce module from a central storage module;
instantiating the reducer executable file on the reduce module;
receiving the mapped data from the map module;
passing the input data from the reducer code to the reducer executable file;
manipulating the mapped data to generate result data;
passing the result data from the reducer executable file to the reducer code; and
storing the result data from the reducer on a result storage module.
Dependent claims: 8, 9, 10
11. A method for processing data on a distributed computing environment, the method comprising:
storing input data on an input storage module;
loading mapper code onto a map module;
executing the mapper code on the map module;
loading a mapper executable file onto the map module from a central storage module through the mapper code;
instantiating the mapper executable file on the map module;
retrieving a split of the input data from the input storage module;
passing the input data from the mapper code to the mapper executable file;
manipulating the input data to generate mapped data; and
passing the mapped data from the mapper executable file to the mapper code.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
20. A computer readable medium having stored therein instructions executable by a programmed processor for ranking results, the computer readable medium comprising instructions for:
storing input data from a weblog on an input storage module;
loading mapper code onto a map module;
executing the mapper code on the map module;
loading a mapper executable file onto the map module from a central storage module using a fetch instruction in the mapper code;
instantiating the mapper executable file on the map module;
retrieving a split of the input data from the input storage module;
passing the input data from the mapper code to the mapper executable file;
manipulating the input data to generate mapped data;
passing the mapped data from the mapper executable file to the mapper code;
loading reducer code onto a reduce module;
executing the reducer code on the reduce module;
loading a reducer executable file onto the reduce module from a central storage module using a fetch instruction in the reducer code;
instantiating the reducer executable file on the reduce module;
receiving the mapped data from the map module;
passing the input data from the reducer code to the reducer executable file;
manipulating the mapped data to generate result data;
passing the result data from the reducer executable file to the reducer code; and
storing the result data from the reducer on a result storage module.
Dependent claims: 21
22. The method according to claim 22, wherein the mapped data is impression\geographic region data.
23. The method according to claim 23, wherein the result data is statistical data regarding a geographical region.
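As a rough illustration of the data flow the claims recite (weblog input, splits, mapped impression and geographic region data, per-region result statistics), the following sketch runs the map and reduce steps in a single process. The log line layout (`<region> <event>`), the function names, and the split size are hypothetical and are not taken from the patent; a real deployment would place each step on separate modules.

```python
from collections import defaultdict


def map_split(split):
    """Emit a (region, 1) pair for each impression record in one split."""
    for line in split:
        parts = line.split()
        # Assumed layout: "<region> <event>"; only impressions are counted.
        if len(parts) >= 2 and parts[1] == "impression":
            yield parts[0], 1


def reduce_mapped(mapped):
    """Sum the mapped counts per region to produce the result data."""
    totals = defaultdict(int)
    for region, count in mapped:
        totals[region] += count
    return dict(totals)


def run_job(log_lines, split_size=2):
    """Cut the input into splits, map each split, then reduce everything."""
    splits = [
        log_lines[i:i + split_size]
        for i in range(0, len(log_lines), split_size)
    ]
    mapped = [pair for split in splits for pair in map_split(split)]
    return reduce_mapped(mapped)


if __name__ == "__main__":
    weblog = ["US impression", "DE click", "US impression", "DE impression"]
    print(run_job(weblog))
```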
Specification