Dynamic scheduling of tasks for collecting and processing data from external sources
First Claim
1. A computer-implemented method, comprising:
identifying, by a scheduler, a plurality of jobs, wherein each job in the plurality of jobs comprises collecting data from one or more external sources;
transmitting, at a direction of the scheduler, credentials to each forwarder in a set of multiple forwarders, wherein each forwarder stores the credentials;
generating, by the scheduler, a configuration token for each job, wherein the configuration token includes identification of data and a target source from which the data is to be collected;
selecting, by the scheduler, for each job a particular forwarder and assigning the job to the particular forwarder selected, wherein assigning the job to the particular forwarder takes into account information received from the particular forwarder on completion of previously assigned jobs;
transmitting, by the scheduler, the configuration token to the particular forwarder;
using, by the particular forwarder, the configuration token and the stored credentials in combination as needed to execute the job;
establishing, by the particular forwarder, communication with the target source using the credentials;
collecting, by the particular forwarder, the data from the target source using the configuration token and then forwarding the collected data to a particular indexer in a plurality of indexers to be indexed; and
receiving, by the scheduler, a communication from each particular forwarder wherein the communication is indicative of whether the assigned job has been completed.
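The steps of the claim can be illustrated with a minimal sketch. It is not the patented implementation: the class names (`ConfigToken`, `Forwarder`, `Scheduler`), the `run_jobs` helper, the report dictionary layout, the keying of credentials by target source, and the selection policy (fewest outstanding jobs, then fewest reported failures) are all assumptions made for illustration. Actual collection from the target source and forwarding to an indexer are omitted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ConfigToken:
    """Hypothetical token: which data to collect and from which target source."""
    job_id: int
    data_id: str
    target_source: str


@dataclass
class Forwarder:
    """Hypothetical forwarder holding credentials that apply across jobs."""
    name: str
    credentials: dict = field(default_factory=dict)
    pending: int = 0   # jobs assigned but not yet reported on
    failed: int = 0    # jobs reported as not completed

    def execute(self, token: ConfigToken) -> dict:
        # The token says what to collect; the stored credentials say how to
        # authenticate. Real collection and forwarding to an indexer are omitted.
        completed = token.target_source in self.credentials
        return {"job_id": token.job_id, "forwarder": self.name, "completed": completed}


class Scheduler:
    def __init__(self, forwarders):
        self.forwarders = {f.name: f for f in forwarders}

    def distribute_credentials(self, credentials: dict) -> None:
        # Every forwarder stores its own copy of the credentials.
        for forwarder in self.forwarders.values():
            forwarder.credentials = dict(credentials)

    def assign(self, token: ConfigToken) -> Forwarder:
        # Assumed policy: fewest outstanding jobs first, then fewest failures.
        chosen = min(self.forwarders.values(), key=lambda f: (f.pending, f.failed))
        chosen.pending += 1
        return chosen

    def record_report(self, report: dict) -> None:
        # Completion reports feed back into later assignment decisions.
        forwarder = self.forwarders[report["forwarder"]]
        forwarder.pending -= 1
        if not report["completed"]:
            forwarder.failed += 1


def run_jobs(scheduler: Scheduler, tokens) -> list:
    """Assign every token, execute each job, then report back to the scheduler."""
    assignments = [(scheduler.assign(token), token) for token in tokens]
    reports = [forwarder.execute(token) for forwarder, token in assignments]
    for report in reports:
        scheduler.record_report(report)
    return reports
```

In this sketch a forwarder that reports a failed job is deprioritized for later assignments, which is one possible reading of "takes into account information received from the particular forwarder on completion of previously assigned jobs".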
Abstract
A scheduler manages execution of a plurality of data-collection jobs, assigns individual jobs to specific forwarders in a set of forwarders, and generates and transmits tokens (e.g., pairs of data-collection tasks and target sources) to assigned forwarders. The forwarder uses the tokens, along with stored information applicable across jobs, to collect data from the target source and forward it on to an indexer for processing. For example, the indexer can then break a data stream into discrete events, extract a timestamp from each event and index (e.g., store) the event based on the timestamp. The scheduler can monitor forwarders' job performance, such that it can use the performance to influence subsequent job assignments. Thus, data-collection jobs can be efficiently assigned to and executed by a group of forwarders, where the group can potentially be diverse and dynamic in size.
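The indexer behavior mentioned in the abstract (break a stream into events, extract a timestamp from each, index by timestamp) might look roughly like the following sketch. The event format is an assumption: each event is taken to begin with an ISO-8601-style timestamp at the start of a line, and continuation lines belong to the preceding event. The function names are hypothetical and the "index" is simply a list kept sorted by timestamp.

```python
import re
from bisect import insort
from datetime import datetime

# Assumed event boundary: a timestamp such as 2024-01-02T03:04:05 at line start.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})", re.MULTILINE)


def break_into_events(stream: str) -> list:
    """Split a raw stream into events, one per leading timestamp.

    Any text before the first timestamp is dropped in this sketch.
    """
    starts = [match.start() for match in TIMESTAMP.finditer(stream)]
    bounds = starts + [len(stream)]
    return [stream[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def extract_timestamp(event: str) -> datetime:
    """Parse the timestamp that opens an event."""
    match = TIMESTAMP.match(event)
    if match is None:
        raise ValueError("event does not start with a timestamp")
    return datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")


def index_events(stream: str) -> list:
    """Return (timestamp, event) pairs kept in timestamp order."""
    index = []
    for event in break_into_events(stream):
        insort(index, (extract_timestamp(event), event))
    return index
```

Keeping the pairs sorted makes time-range lookups straightforward; a production indexer would use a persistent store rather than an in-memory list.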
26 Claims
1. A computer-implemented method, comprising:
identifying, by a scheduler, a plurality of jobs, wherein each job in the plurality of jobs comprises collecting data from one or more external sources;
transmitting, at a direction of the scheduler, credentials to each forwarder in a set of multiple forwarders, wherein each forwarder stores the credentials;
generating, by the scheduler, a configuration token for each job, wherein the configuration token includes identification of data and a target source from which the data is to be collected;
selecting, by the scheduler, for each job a particular forwarder and assigning the job to the particular forwarder selected, wherein assigning the job to the particular forwarder takes into account information received from the particular forwarder on completion of previously assigned jobs;
transmitting, by the scheduler, the configuration token to the particular forwarder;
using, by the particular forwarder, the configuration token and the stored credentials in combination as needed to execute the job;
establishing, by the particular forwarder, communication with the target source using the credentials;
collecting, by the particular forwarder, the data from the target source using the configuration token and then forwarding the collected data to a particular indexer in a plurality of indexers to be indexed; and
receiving, by the scheduler, a communication from each particular forwarder wherein the communication is indicative of whether the assigned job has been completed.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. A system, comprising:
a plurality of data processors associated with a scheduler and a set of forwarders; and
a non-transitory computer-readable storage medium containing instructions which when executed on the plurality of data processors, cause the processors to perform operations including:
identifying, by a scheduler, a plurality of jobs, wherein each job in the plurality of jobs comprises collecting data from one or more external sources;
transmitting, at a direction of the scheduler, credentials to each forwarder in a set of multiple forwarders, wherein each forwarder stores the credentials;
generating, by the scheduler, a configuration token for each job, wherein the configuration token includes identification of data and a target source from which the data is to be collected;
selecting, by the scheduler, for each job a particular forwarder and assigning the job to the particular forwarder selected, wherein assigning the job to the particular forwarder takes into account information received from the particular forwarder on completion of previously assigned jobs;
transmitting, by the scheduler, the configuration token to the particular forwarder;
using, by the particular forwarder, the configuration token and the stored credentials in combination as needed to execute the job;
establishing, by the particular forwarder, communication with the target source using the credentials;
collecting, by the particular forwarder, the data from the target source using the configuration token and then forwarding the collected data to a particular indexer in a plurality of indexers to be indexed; and
receiving, by the scheduler, a communication from each particular forwarder wherein the communication is indicative of whether the assigned job has been completed.
Dependent claims: 18, 19, 20, 21, 22, 23, 24, 25
26. A computer-program product tangibly embodied in a non-transitory machine-readable storage medium, including instructions configured to cause a plurality of data processors associated with a scheduler and a set of forwarders to:
identify, by a scheduler, a plurality of jobs, wherein each job in the plurality of jobs comprises collecting data from one or more external sources;
transmit, at a direction of the scheduler, credentials to each forwarder in a set of multiple forwarders, wherein each forwarder stores the credentials;
generate, by the scheduler, a configuration token for each job, wherein the configuration token includes identification of data and a target source from which the data is to be collected;
select, by the scheduler, for each job a particular forwarder and assigning the job to the particular forwarder selected, wherein assigning the job to the particular forwarder takes into account information received from the particular forwarder on completion of previously assigned jobs;
transmit, by the scheduler, the configuration token to the particular forwarder;
use, by the particular forwarder, the configuration token and the stored credentials in combination as needed to execute the job;
establish, by the particular forwarder, communication with the target source using the credentials;
collect, by the particular forwarder, the data from the target source using the configuration token and then forwarding the collected data to a particular indexer in a plurality of indexers to be indexed; and
receive, by the scheduler, a communication from each particular forwarder, wherein the communication is indicative of whether the assigned job has been completed.
Specification