SAMPLING OF EVENTS TO USE FOR DEVELOPING A FIELD-EXTRACTION RULE FOR A FIELD TO USE IN EVENT SEARCHING
Abstract
Embodiments are directed towards generating a representative sampling as a subset from a larger dataset that includes unstructured data. A graphical user interface enables a user to provide various data selection parameters, including a data source and one or more desired subset types: latest records, earliest records, diverse records, outlier records, and/or random records. Diverse and/or outlier subset types may be obtained by generating clusters from an initial selection of records obtained from the larger dataset. An iteration analysis is performed to determine whether the number of clusters and/or cluster types generated exceeds at least one threshold; when the threshold is not exceeded, additional clustering is performed on additional records. From the resultant clusters, and/or other subtype results, a subset of records is obtained as the representative sampling subset.
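The iterative clustering described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how records are clustered, so everything specific here is an assumption made for illustration: the word-token Jaccard similarity, the greedy single-pass assignment, the function names (`tokens`, `jaccard`, `add_to_clusters`, `sample_by_clustering`), and the parameters `batch_size`, `min_clusters`, `min_similarity`, and `max_batches`. The sketch clusters an initial batch of records, checks the cluster count against a threshold, and pulls additional records only while the threshold has not been reached.

```python
import re
from itertools import islice


def tokens(record):
    # Keep only alphabetic tokens so records that differ just in numbers look alike.
    return frozenset(re.findall(r"[A-Za-z_]+", record.lower()))


def jaccard(a, b):
    # Jaccard similarity of two token sets (assumed metric, not from the source).
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def add_to_clusters(clusters, records, min_similarity):
    # Greedy assignment: join the first cluster that is similar enough,
    # otherwise start a new cluster with this record as its signature.
    for record in records:
        toks = tokens(record)
        for cluster in clusters:
            if jaccard(toks, cluster["signature"]) >= min_similarity:
                cluster["members"].append(record)
                break
        else:
            clusters.append({"signature": toks, "members": [record]})


def sample_by_clustering(source, batch_size=4, min_clusters=3,
                         min_similarity=0.5, max_batches=10):
    it = iter(source)
    clusters = []
    for _ in range(max_batches):
        batch = list(islice(it, batch_size))
        if not batch:
            break  # the data source is exhausted
        add_to_clusters(clusters, batch, min_similarity)
        # Iteration analysis: stop once enough clusters have been generated.
        if len(clusters) >= min_clusters:
            break
    ranked = sorted(clusters, key=lambda c: len(c["members"]), reverse=True)
    # Diverse subset: one representative per cluster, largest clusters first.
    diverse = [c["members"][0] for c in ranked]
    # Outlier subset: records that ended up alone in their cluster.
    outliers = [c["members"][0] for c in ranked if len(c["members"]) == 1]
    return diverse, outliers
```

Calling `sample_by_clustering(records)` on any iterable of strings returns a `(diverse, outliers)` pair. A production system would likely use a more robust clustering method; the point of the sketch is only the cluster, check threshold, fetch more loop.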
23 Citations
43 Claims
1-30. (canceled)
31. A computer-implemented method, comprising:
receiving machine data at a computing device;
generating a plurality of events, wherein each event in the plurality of events includes a portion of the machine data;
associating a time with each event in the plurality of events, the time for each event extracted from the machine data included in that event;
storing the plurality of events in a data store such that they are searchable at least by their associated times;
selecting a set of events from the plurality of events using a procedure to identify diverse events, a procedure to identify outlier events, a procedure to identify events associated with relatively early times compared to other events in the plurality of events, a procedure to identify events with relatively late times compared to other events in the plurality of events, or a procedure randomly to identify events in the plurality of events;
displaying one or more events in the set of events in a graphical user interface that enables development of a field-extraction rule that specifies how to extract, from the machine data included in each of the one or more events, a value for a field that is defined for each of the one or more events, wherein each of the one or more events is searchable using the field.
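The steps recited in claim 31 can be sketched in Python as a rough illustration, not as the claimed implementation. The assumptions are: one event per line of machine data, an ISO-style timestamp embedded in each line, an in-memory list standing in for the data store, and a regular expression with one capture group standing in for the field-extraction rule. The names `make_events`, `select_events`, `apply_rule`, and `search` are hypothetical. The diverse and outlier procedures are omitted here, and the graphical user interface is out of scope.

```python
import random
import re
from datetime import datetime

# Assumed timestamp layout; real machine data may use other formats.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}")


def make_events(machine_data):
    # Generate one event per line and associate the time extracted from that line.
    events = []
    for line in machine_data.splitlines():
        match = TIMESTAMP.search(line)
        if match:
            when = datetime.strptime(match.group(), "%Y-%m-%dT%H:%M:%S")
            events.append({"time": when, "raw": line})
    # Store in time order so events are searchable by their associated times.
    events.sort(key=lambda e: e["time"])
    return events


def select_events(events, procedure, count, seed=None):
    # Select a sample using one of the time-based or random procedures.
    if procedure == "earliest":
        return events[:count]
    if procedure == "latest":
        return events[max(len(events) - count, 0):]
    if procedure == "random":
        return random.Random(seed).sample(events, min(count, len(events)))
    raise ValueError(f"unknown procedure: {procedure}")


def apply_rule(events, field, rule):
    # The rule is a regex whose first capture group is the field value.
    pattern = re.compile(rule)
    for event in events:
        match = pattern.search(event["raw"])
        if match:
            event.setdefault("fields", {})[field] = match.group(1)


def search(events, field, value):
    # Once extracted, the field can be used to search events.
    return [e for e in events if e.get("fields", {}).get(field) == value]
```

In use, a rule would be developed against the small sample returned by `select_events`, then applied to the full set with `apply_rule` so that `search` can filter on the new field.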
Specification