Model-based self-optimizing distributed information management
Abstract
Disclosed are a method, information processing system, and computer readable medium for managing data collection in a distributed processing system. The method includes dynamically collecting at least one statistical query pattern associated with a selected group of information processing nodes. The statistical query pattern is dynamically collected from a plurality of information processing nodes in a distributed processing system. At least one operating attribute distribution associated with an operating attribute that has been queried for the selected group is dynamically monitored. The selected group is dynamically configured, based on the query pattern and the operating attribute distribution, to periodically push a set of attributes associated with each information processing node in the selected group.
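As an illustration only, the selection step summarized in the abstract can be sketched as follows. The sketch is one possible reading, not the patented method: the function name `select_push_config`, the representation of a query as an `(attribute, min_value)` pair, the `coverage` parameter, and the rule "push from nodes whose current value meets the lowest queried threshold" are all assumptions introduced here.

```python
from collections import Counter

def select_push_config(query_log, node_attrs, coverage=0.5):
    """Choose which attributes and which nodes should push (hypothetical helper).

    query_log:  list of (attribute, min_value) range queries seen from clients.
    node_attrs: {node_id: {attribute: current_value}} as currently monitored.
    coverage:   assumed target; stop once more than this fraction of queries is covered.
    """
    if not query_log:
        return set(), set()

    # Statistical query pattern: how often each attribute is queried.
    freq = Counter(attr for attr, _ in query_log)
    total = len(query_log)

    # Add the most frequently queried attributes until a majority is covered.
    push_attrs, covered = set(), 0
    for attr, count in freq.most_common():
        if covered / total > coverage:
            break
        push_attrs.add(attr)
        covered += count

    # Lowest threshold that clients asked for, per pushed attribute.
    thresholds = {
        a: min(v for attr, v in query_log if attr == a) for a in push_attrs
    }

    # Attribute distribution: keep nodes whose current value can satisfy
    # at least one of the commonly queried thresholds.
    push_nodes = {
        node
        for node, attrs in node_attrs.items()
        if any(attrs.get(a, float("-inf")) >= t for a, t in thresholds.items())
    }
    return push_attrs, push_nodes
```

Under these assumptions, nodes whose values rarely satisfy the common queries are left out of the push set, which is the intuition behind combining the query pattern with the attribute distribution.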
20 Claims
1. A method for managing data collection in a distributed processing system, the method on an information processing system comprising:
dynamically collecting at least one statistical query pattern associated with a plurality of queries received from a plurality of client nodes in a distributed processing system;
dynamically monitoring at least one operating attribute distribution across a plurality of overlay nodes, wherein the at least one operating attribute distribution is associated with an operating attribute that has been queried by at least one of the client nodes for the plurality of overlay nodes in the distributed processing system, wherein an overlay node performs one or more data stream processing functions, and wherein an operating attribute is a distributed resource consumable by the at least one of the client nodes;
dynamically, and without user intervention, selecting a first set of overlay nodes from the plurality of overlay nodes based on the at least one statistical query pattern and the at least one operating attribute distribution; and
dynamically configuring without user intervention, based on the query pattern and the operating attribute distribution, the first group of overlay nodes to periodically push a first set of operating attributes associated with each overlay node in the selected group to a managing node associated with at least the first group of overlay nodes, wherein the first group of overlay nodes and the first set of operating attributes are selected so that a majority of queries received by client nodes are resolved by the first set of operating attributes that have been pushed, and wherein on-demand pull operations are performed on a second group of overlay nodes within the distributed processing system to acquire a second set of operating attributes to resolve queries received from client nodes in which the first set of operating attributes that have been pushed have failed to resolve.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
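The push-then-pull resolution recited at the end of claim 1 can be illustrated with a small sketch. This is a hedged example rather than the claimed implementation: the class `ManagingNode`, its methods, the `pull_fn` callback, and the threshold-style query are hypothetical names and simplifications chosen for the illustration.

```python
class ManagingNode:
    """Answers queries from pushed attributes first, then falls back to pulls."""

    def __init__(self, pull_fn):
        self.cache = {}         # {node_id: {attr: value}}, filled by periodic pushes
        self.pull_fn = pull_fn  # assumed callback: (node_id, attr) -> current value
        self.pulls = 0          # number of on-demand pulls issued so far

    def receive_push(self, node_id, attrs):
        # Called when a node in the first (push) group reports its attributes.
        self.cache.setdefault(node_id, {}).update(attrs)

    def resolve(self, attr, min_value, wanted, all_nodes):
        # Step 1: try to resolve the query from pushed data only.
        hits = [
            n for n, attrs in self.cache.items()
            if attrs.get(attr, float("-inf")) >= min_value
        ]
        if len(hits) >= wanted:
            return hits[:wanted]

        # Step 2: pushed data was not enough, so pull from the remaining nodes.
        for n in all_nodes:
            if len(hits) >= wanted:
                break
            if attr in self.cache.get(n, {}):
                continue  # pushed value already checked in step 1
            self.pulls += 1
            if self.pull_fn(n, attr) >= min_value:
                hits.append(n)
        return hits


# Usage with made-up node data: n1 and n3 push, n2 and n4 are only pulled.
live = {"n1": {"cpu": 70}, "n2": {"cpu": 30}, "n3": {"cpu": 45}, "n4": {"cpu": 90}}
mgr = ManagingNode(lambda node, attr: live[node][attr])
mgr.receive_push("n1", {"cpu": 70})
mgr.receive_push("n3", {"cpu": 45})
common = mgr.resolve("cpu", 40, 2, list(live))  # served from pushed data, no pulls
rare = mgr.resolve("cpu", 80, 1, list(live))    # needs on-demand pulls
```

If the push set is chosen well, most calls return in step 1 and the pull counter grows only for the uncommon queries.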
12. An information processing system for managing data collection in a distributed processing system, the information processing system comprising:
a memory;
a processor communicatively coupled to the memory; and
an information management system communicatively coupled to the memory and the processor, the information management system for:
dynamically collecting at least one statistical query pattern associated with a plurality of queries received from a plurality of client nodes in a distributed processing system;
dynamically monitoring at least one operating attribute distribution across a plurality of overlay nodes, wherein the at least one operating attribute distribution is associated with an operating attribute that has been queried by at least one of the client nodes for the plurality of overlay nodes in the distributed processing system, wherein an overlay node performs one or more data stream processing functions, and wherein the at least one operating attribute is a distributed resource consumable by the at least one of the client nodes;
dynamically, and without user intervention, selecting a first set of overlay nodes from the plurality of overlay nodes based on the at least one statistical query pattern and the at least one operating attribute distribution; and
dynamically configuring, based on the query pattern and the operating attribute distribution, the first group of overlay nodes to periodically push a first set of operating attributes associated with each overlay node in the selected group to a managing node associated with at least the first group of overlay nodes, wherein the first group of overlay nodes and the first set of operating attributes are selected so that a majority of queries received by client nodes are resolved by the first set of operating attributes that have been pushed, and wherein on-demand pull operations are performed on a second group of overlay nodes within the distributed processing system to acquire a second set of operating attributes to resolve queries received from client nodes in which the first set of operating attributes that have been pushed have failed to resolve.
Dependent claims: 13, 14, 15
16. A tangible computer readable medium for managing data collection in a distributed processing system, the computer readable medium comprising instructions for:
dynamically monitoring at least one operating attribute distribution across a plurality of overlay nodes, wherein the at least one operating attribute distribution is associated with an operating attribute that has been queried by at least one of a plurality of client nodes for the plurality of overlay nodes in the distributed processing system, wherein an overlay node performs one or more data stream processing functions, and wherein the at least one operating attribute is a distributed resource consumable by the at least one of the client nodes;
dynamically, and without user intervention, selecting a first set of overlay nodes from the plurality of overlay nodes based on the at least one statistical query pattern and the at least one operating attribute distribution; and
dynamically configuring, based on the query pattern and the operating attribute distribution, the first group of overlay nodes to periodically push a first set of operating attributes associated with each overlay node in the selected group to a managing node associated with at least the first group of overlay nodes, wherein the first group of overlay nodes and the first set of operating attributes are selected so that a majority of queries received by client nodes are resolved by the first set of operating attributes that have been pushed, and wherein on-demand pull operations are performed on a second group of overlay nodes within the distributed processing system to acquire a second set of operating attributes to resolve queries received from client nodes in which the first set of operating attributes that have been pushed have failed to resolve.
Dependent claims: 17, 18, 19, 20
Specification