Method and process for enabling distributing cache data sources for query processing and distributed disk caching of large data and analysis requests
First Claim
1. A method for distributed processing of data, comprising the steps of:
acquiring, by an adaptor neuron executing on a processor, data from a data source;
creating, by the adaptor neuron on a first computer, a self-describing disk cache file based at least in part upon the data acquired from the data source;
creating, by the adaptor neuron, a copy of the adaptor neuron and a copy of the self-describing disk cache file on a second computer;
permitting access to the self-describing disk cache file through the adaptor neuron by a first participating application that is configured to perform a first data processing operation;
permitting access to the copy of the self-describing disk cache file through the copy of the adaptor neuron by a second participating action that is configured to perform a second data processing operation;
maintaining, by the adaptor neuron and the copy of the adaptor neuron, a memory tree map of change history to both the self-describing disk cache file and the copy of the self-describing disk cache file; and
retrieving, by the copy of the adaptor neuron, in response to a request for information in the copy of the self-describing disk cache file at a point in the change history, the information from the copy of the self-describing disk cache file at memory location in the memory tree map corresponding to the point in the change history.
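The steps of claim 1 can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the class `AdaptorCache` stands in for the adaptor neuron, a JSON header line stands in for the "self-describing" property of the cache file, a second directory stands in for the second computer, and a sorted version index searched with `bisect` stands in for the memory tree map of change history. Propagation of later changes between the original and the copy is not modelled.

```python
import bisect
import json
import shutil
import tempfile
from pathlib import Path


class AdaptorCache:
    """Hypothetical stand-in for an adaptor neuron and its disk cache file."""

    def __init__(self, path, columns=None):
        self.path = Path(path)
        self.versions = []  # sorted points in the change history
        self.offsets = []   # byte offset of the record written at each point
        if columns is not None:
            # Self-describing: the first line of the file records its schema.
            with open(self.path, "wb") as f:
                f.write(json.dumps({"columns": columns}).encode() + b"\n")

    def schema(self):
        """Read the column list back from the file itself."""
        with open(self.path, "rb") as f:
            return json.loads(f.readline())["columns"]

    def write(self, version, row):
        """Append a record and index its location under the given version."""
        if self.versions and version <= self.versions[-1]:
            raise ValueError("versions must be strictly increasing")
        offset = self.path.stat().st_size  # the record starts at current end
        with open(self.path, "ab") as f:
            f.write(json.dumps(row).encode() + b"\n")
        self.versions.append(version)
        self.offsets.append(offset)

    def read_at(self, point):
        """Return the record in effect at a point in the change history."""
        i = bisect.bisect_right(self.versions, point) - 1
        if i < 0:
            raise KeyError(f"no record at or before version {point}")
        with open(self.path, "rb") as f:
            f.seek(self.offsets[i])
            return json.loads(f.readline())

    def replicate(self, target_dir):
        """Copy the cache file and the index to a second location."""
        target = Path(target_dir) / self.path.name
        shutil.copy2(self.path, target)
        replica = AdaptorCache(target)
        replica.versions = list(self.versions)
        replica.offsets = list(self.offsets)
        return replica


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        node1, node2 = Path(tmp, "node1"), Path(tmp, "node2")
        node1.mkdir()
        node2.mkdir()
        cache = AdaptorCache(node1 / "orders.cache", columns=["id", "total"])
        cache.write(1, {"id": 7, "total": 10})
        cache.write(5, {"id": 7, "total": 25})
        replica = cache.replicate(node2)
        print(replica.schema(), replica.read_at(3))
```

In this sketch a second application reads through `replica` rather than the original object, and `read_at` resolves a requested history point to the most recent record written at or before that point, mirroring the claim's retrieval step.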
Abstract
A method and system for large data and distributed disk cache processing in a Pneuron platform 100. The system and method include three specific interoperable but distributed functions: the adapter/cache Pneuron 14 and distributed disk files 34, a dynamic memory mapping tree 50, and distributed disk file cleanup 28. The system accommodates large data processing and can access and acquire information from large data sets 102a, 102b and rapidly distribute that information to subsequent Pneurons 104 for processing. The system can also store large result sets, handle sequential as well as asynchronous parallel processing, address large unstructured data such as web logs, email, and web pages, and handle failures in large-block processing.
8 Claims
Specification