Dispatcher for adaptive data collection
First Claim
1. A computer-implemented method, comprising:
receiving a data collection task;
identifying a group of data collection endpoints associated with the data collection task;
receiving, by a computing device, data collection statistics related to a server associated with the group of data collection endpoints;
generating a forecast performance of the server;
determining a real-time performance of the server by collecting metadata of real-time data retrieval and calculating a weight factor with a timestamp for the data collection task, wherein the weight factor is associated with a first performance of the server during the data collection task;
determining that one or more of the forecast performance or the real-time performance of the server satisfy a threshold level of performance;
sending a data request associated with the data collection task to the server associated with the group of data collection endpoints;
continuing to receive data collection statistics related to the server associated with the group of data collection endpoints;
determining an updated forecast performance of the server including calculating a second weight factor for a second data collection task, wherein the second weight factor is associated with a second performance of the server during the second data collection task, wherein the updated forecast performance of the server is calculated using a series of historical weight factors correlated to historical server performance, wherein the series of historical weight factors include the first weight factor and the second weight factor, wherein the historical server performance includes the first performance of the server and the second performance of the server;
determining, in view of the second weight factor, that the updated forecast performance of the server does not satisfy the threshold level of performance; and
refraining from sending a future data request to the server.
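The steps of the first claim can be read as a small decision loop: score each task with a timestamped weight factor, forecast from the series of historical weight factors, and stop sending requests once the forecast drops below a threshold. The following is a minimal sketch under stated assumptions, not the patented implementation. The claim does not define how a weight factor or forecast is computed, so the completeness ratio, the recency-weighted mean, the `decay` parameter, the 0.6 threshold, and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field
import time


@dataclass
class WeightFactor:
    value: float      # assumed score in [0, 1]; higher means better performance
    timestamp: float  # when the task's retrieval metadata was collected


@dataclass
class ServerRecord:
    history: list = field(default_factory=list)  # series of historical weight factors

    def record_task(self, rows_received, rows_requested, timestamp=None):
        """Assumed weight factor: fraction of requested rows actually received."""
        value = rows_received / rows_requested if rows_requested else 0.0
        stamp = time.time() if timestamp is None else timestamp
        wf = WeightFactor(min(value, 1.0), stamp)
        self.history.append(wf)
        return wf

    def forecast(self, decay=0.5):
        """Assumed forecast: recency-weighted mean of the historical weight factors."""
        if not self.history:
            return None
        num = den = 0.0
        for age, wf in enumerate(reversed(self.history)):
            w = decay ** age  # the newest factor gets weight 1, older ones less
            num += w * wf.value
            den += w
        return num / den


def should_dispatch(record, real_time, threshold=0.6):
    """Send a request if the forecast or the real-time performance meets the threshold."""
    fc = record.forecast()
    return (fc is not None and fc >= threshold) or real_time >= threshold


# Usage: a good first task keeps the server eligible; a poor second task
# drags the updated forecast below the threshold, so requests stop.
server = ServerRecord()
first = server.record_task(95, 100, timestamp=1.0)
send_after_first = should_dispatch(server, first.value)
second = server.record_task(10, 100, timestamp=2.0)
send_after_second = should_dispatch(server, second.value)
```

In this sketch the second weight factor dominates the updated forecast because of the decay weighting, which mirrors the claim's "in view of the second weight factor" language; a different weighting scheme would change when the server is dropped.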
Abstract
This disclosure describes systems, methods, and computer-readable media for optimizing data collection in a distributed environment by leveraging real-time and historical data collection performance statistics and server performance data. In some configurations, a computing device can be initially configured for data collection. In such configurations, the initial configuration can include preferred target servers for a particular task. The computing device can request batches of data from the preferred target servers and process the information through a buffer. Techniques and technologies described herein collect the batches of data from servers as well as corresponding data collection statistics (e.g., server performance per task, server historical performance, etc.) and server performance data (e.g., server status).
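The batch-and-buffer collection described in the abstract could look roughly like the sketch below, which records one statistics entry per batch alongside the data itself. This is an illustration under assumptions: `fetch_batch` and `process` are hypothetical callables standing in for the transport and the downstream consumer, and the statistics fields and the flush-when-full buffer policy are not specified by the source.

```python
import time
from collections import deque


def collect_batches(fetch_batch, process, server, num_batches, buffer_size=2):
    """Request batches from one server, buffer them, and log per-batch statistics."""
    buffer = deque()
    stats = []  # one data collection statistics entry per requested batch
    for index in range(num_batches):
        start = time.perf_counter()
        try:
            batch = fetch_batch(server, index)
            ok = True
        except Exception:
            batch, ok = [], False  # a failed batch still produces statistics
        stats.append({
            "server": server,
            "batch": index,
            "ok": ok,
            "records": len(batch),
            "seconds": time.perf_counter() - start,
        })
        if ok:
            buffer.append(batch)
        if len(buffer) >= buffer_size:  # assumed policy: flush when full
            while buffer:
                process(buffer.popleft())
    while buffer:  # flush whatever remains at the end
        process(buffer.popleft())
    return stats


# Usage with a fake server whose second batch fails.
def fake_fetch(server, index):
    if index == 1:
        raise TimeoutError("simulated slow server")
    return [f"{server}-row-{index}-{i}" for i in range(3)]


processed = []
collection_stats = collect_batches(fake_fetch, processed.append, "server-a", 3)
```

The statistics list is what an analyzer would consume to judge the server; the processed batches go to the collector's normal pipeline.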
20 Claims
1. A computer-implemented method, comprising the steps recited under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 19
7. A device comprising:
an adaptive dispatcher configured to:
receive a data collection task;
identify a group of data collection endpoints associated with the data collection task;
request data from a server associated with the group of data collection endpoints based on data collection task;
an historical collection analyzer module configured to:
receive data collection statistics;
receive a server status;
generate a forecast performance of the server; and
send the forecast performance of the server to the adaptive dispatcher; and
a real-time collection analyzer module configured to:
receive the data collection statistics;
determine a real-time performance of the server by collecting metadata of real-time data retrieval and calculating a weight factor with a timestamp for the data collection task, wherein the weight factor is associated with a first performance of the server during the data collection task;
send the real-time performance of the server to the adaptive dispatcher, wherein the adaptive dispatcher is configured to send a data request associated with the data collection task to the server associated with the group of data collection endpoints based at least in part on one or more of the forecast performance of the server or the real-time performance of the server;
continue to receive data collection statistics related to the server associated with the group of data collection endpoints; and
send the real-time performance of the server to the adaptive dispatcher, wherein the adaptive dispatcher is configured to:
determine an updated forecast performance of the server by calculating a second weight factor for a second data collection task, wherein the second weight factor is associated with a second performance of the server during the second data collection task, wherein the updated forecast performance of the server is calculated using a series of historical weight factors correlated to historical server performance, wherein the series of historical weight factors include the first weight factor and the second weight factor, wherein the historical server performance includes the first performance of the server and the second performance of the server;
determine, in view of the second weight factor, that the updated forecast performance of the server does not satisfy the threshold level of performance; and
refrain from sending a future data request to the server.
Dependent claims: 8, 9, 10, 11, 12, 20
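The device claim splits the work across three cooperating components: two analyzers that push performance values, and a dispatcher that decides whether to send. One way to picture that message flow is the sketch below. It is a simplified illustration, not the claimed design: the method names, the records-per-second weight factor, the `target_rate` parameter, and the plain-mean forecast are all assumptions, and for brevity the historical analyzer computes the updated forecast that the claim assigns to the dispatcher.

```python
class AdaptiveDispatcher:
    def __init__(self, threshold):
        self.threshold = threshold
        self.forecast = {}   # server -> latest forecast performance
        self.real_time = {}  # server -> latest real-time performance

    def on_forecast(self, server, value):
        self.forecast[server] = value

    def on_real_time(self, server, value):
        self.real_time[server] = value

    def may_request(self, server):
        """True if the forecast or the real-time performance meets the threshold."""
        scores = (self.forecast.get(server), self.real_time.get(server))
        return any(s is not None and s >= self.threshold for s in scores)


class HistoricalCollectionAnalyzer:
    def __init__(self, dispatcher):
        self.dispatcher = dispatcher
        self.weights = {}  # server -> series of historical weight factors

    def receive(self, server, weight_factor, server_up=True):
        series = self.weights.setdefault(server, [])
        series.append(weight_factor)
        # Assumed forecast: plain mean, forced to 0.0 when the server status is down.
        forecast = sum(series) / len(series) if server_up else 0.0
        self.dispatcher.on_forecast(server, forecast)
        return forecast


class RealTimeCollectionAnalyzer:
    def __init__(self, dispatcher, target_rate):
        self.dispatcher = dispatcher
        self.target_rate = target_rate  # hypothetical records/second goal
        self.log = []                   # (timestamp, server, weight factor)

    def receive(self, server, records, seconds, timestamp):
        # Assumed weight factor: achieved rate relative to the target, capped at 1.0.
        rate = records / seconds if seconds > 0 else 0.0
        weight = min(rate / self.target_rate, 1.0)
        self.log.append((timestamp, server, weight))
        self.dispatcher.on_real_time(server, weight)
        return weight


# Usage: wire the modules together and feed two tasks for one server.
dispatcher = AdaptiveDispatcher(threshold=0.5)
historical = HistoricalCollectionAnalyzer(dispatcher)
real_time = RealTimeCollectionAnalyzer(dispatcher, target_rate=100)

w1 = real_time.receive("s1", records=90, seconds=1.0, timestamp=1)
historical.receive("s1", w1)
allowed_after_first = dispatcher.may_request("s1")

w2 = real_time.receive("s1", records=5, seconds=1.0, timestamp=2)
historical.receive("s1", w2)
allowed_after_second = dispatcher.may_request("s1")
```

Keeping the decision in the dispatcher and the scoring in the analyzers means either analyzer can be replaced without touching the dispatch logic.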
13. A data collection system, comprising:
a processor; and
a non-transitory computer-readable medium coupled to the processor and having instructions stored thereon that, when executed by the processor, cause the processor to perform operations comprising:
receive a data collection task;
identify a group of data collection endpoints associated with the data collection task;
receive data collection statistics from a server associated with the group of data collection endpoints;
generate a forecast performance of the server based at least in part on the data collection statistics;
determine a real-time performance of the server based at least in part on the data collection statistics by collecting metadata of real-time data retrieval and calculating a weight factor with a timestamp for the data collection task, wherein the weight factor is associated with a first performance of the server during the data collection task;
determine that at least one of the forecast performance or the real-time performance of the server meet a threshold performance level;
send a data request associated with the data collection task to the server associated with the group of data collection endpoints based at least in part on the forecast performance of the server or the real-time performance of the server;
continue to receive data collection statistics related to the server associated with the group of data collection endpoints;
determine an updated forecast performance of the server by calculating a second weight factor for a second data collection task, wherein the second weight factor is associated with a second performance of the server during the second data collection task, wherein the updated forecast performance of the server is calculated using a series of historical weight factors correlated to historical server performance, wherein the series of historical weight factors include the first weight factor and the second weight factor, wherein the historical server performance includes the first performance of the server and the second performance of the server;
determine, in view of the second weight factor, that the updated forecast performance of the server does not satisfy the threshold level of performance; and
refrain from sending a future data request to the server.
Dependent claims: 14, 15, 16, 17, 18
Specification