Predictive estimation for ingestion, performance and utilization in a data indexing and query system
Abstract
Disclosed herein is a data estimation technique for a data intake and query system. The system receives user inputs indicative that a first data source is to be the subject of a storage related estimate. The system receives a first plurality of events generated by the first data source. The system indexes only a sample of the received first plurality of events, based on a sampling criterion, where the sample is fewer than all of the first plurality of events. The system generates the storage related estimate based on at least some of the first plurality of events, and causes an indication of the estimate to be output to a user.
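The flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. All names here (`parse_events`, `sample_events`, `estimate_storage_bytes`, `events_to_index`) are hypothetical, and so are the choices of one event per line, an every-nth sampling criterion, and a fixed `index_overhead` multiplier, none of which come from the source.

```python
from dataclasses import dataclass


@dataclass
class Event:
    raw: str  # one parsed event, e.g. a single log line


def parse_events(raw_data: str) -> list[Event]:
    """Parse raw data into events. Assumption: one event per non-empty line."""
    return [Event(line) for line in raw_data.splitlines() if line.strip()]


def sample_events(events: list[Event], every_nth: int) -> list[Event]:
    """Hypothetical sampling criterion: keep every n-th event."""
    if every_nth < 1:
        raise ValueError("every_nth must be >= 1")
    return events[::every_nth]


def estimate_storage_bytes(
    sample: list[Event],
    total_events_observed: int,
    observed_seconds: float,
    period_seconds: float,
    index_overhead: float = 1.5,  # assumed ratio of indexed size to raw size
) -> float:
    """Extrapolate storage needed to index all data from the source for a period."""
    if not sample or observed_seconds <= 0:
        return 0.0
    avg_bytes = sum(len(e.raw.encode("utf-8")) for e in sample) / len(sample)
    # Scale the per-event size by the observed event rate over the target period.
    return (
        avg_bytes * index_overhead * total_events_observed * period_seconds
        / observed_seconds
    )


def events_to_index(
    events: list[Event], sample: list[Event], user_approved: bool
) -> list[Event]:
    """Commit only the sample unless the user has asked to index the source."""
    return events if user_approved else sample
```

For example, 100 ten-byte events observed over 60 seconds and sampled every tenth event give a one-hour estimate of 90,000 bytes under the assumed 1.5x overhead. Only the 10 sampled events would be committed to persistent storage until the user opts in.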
29 Claims
1. A method comprising:
receiving, by a data intake and query system, user input specifying that a first data source is to be a subject of a storage related estimate;
receiving, by the data intake and query system, raw data generated by the first data source;
parsing, by the data intake and query system, the raw data generated by the first data source into a first plurality of events;
generating the storage related estimate, by the data intake and query system, based on at least some of the first plurality of events, the storage related estimate being an estimate of an amount of storage space that would be needed for the data intake and query system to index, including to persistently store, all data received by the data intake and query system from the first data source for a time period;
causing an indication of the storage related estimate to be output to a user of the data intake and query system; and
completing, by the data intake and query system, an indexing of only a sample of the received first plurality of events based on a sampling criterion, the sample being fewer than all of the first plurality of events, and not completing indexing of a remainder of the first plurality of events in the absence of a user input indicative that the first data source should be indexed, wherein completing indexing includes committing data being indexed or to be indexed to persistent storage.
28. A system comprising:
a communication device through which to communicate on a computer network; and
at least one processor operatively coupled to the communication device and configured to perform operations including
receiving user inputs specifying that a first data source is to be a subject of a storage related estimate;
receiving data generated by the first data source;
parsing the data generated by the first data source into a first plurality of events;
generating the storage related estimate, based on at least some of the first plurality of events, the storage related estimate being an estimate of an amount of storage space that would be needed for the data intake and query system to index, including to persistently store, all data received by the data intake and query system from the first data source for a time period;
causing an indication of the storage related estimate to be output to a user; and
completing an indexing of only a sample of the received first plurality of events based on a sampling criterion, and not completing indexing of a remainder of the first plurality of events in the absence of a user input indicative that the first data source should be indexed, the sample being fewer than all of the first plurality of events, wherein completing indexing includes committing data being indexed or to be indexed to persistent storage.
29. A non-transitory machine-readable storage medium for use in a processing system of a data intake and query system, the non-transitory machine-readable storage medium storing instructions, an execution of which in the processing system causes the processing system to perform operations comprising:
receiving user inputs specifying that a first data source is to be a subject of a storage related estimate;
receiving raw data generated by the first data source;
parsing the raw data generated by the first data source into a first plurality of events;
generating the storage related estimate, based on at least some of the first plurality of events, the storage related estimate being an estimate of an amount of storage space that would be needed for the data intake and query system to index, including to persistently store, all data received by the data intake and query system from the first data source for a time period;
causing an indication of the storage related estimate to be output to a user; and
completing indexing of only a sample of the received first plurality of events based on a sampling criterion, and not completing indexing of a remainder of the first plurality of events in the absence of a user input indicative that the first data source should be indexed, the sample being fewer than all of the first plurality of events, wherein completing indexing includes committing data being indexed or to be indexed to persistent storage.
Specification