Batch searches in data fabric service system
First Claim
1. A computer-implemented method for performing a batch search operation across one or more worker nodes, the method comprising:
- receiving, using a first computing device, partial search results from a plurality of data sources, wherein each of the partial search results satisfies a portion of a search query received by a data intake and query system, wherein at least one first partial search result of the partial search results is received from a subset of internal data sources of the data intake and query system, wherein the at least one first partial search result comprises one or more first events in a first format, each first event corresponding to at least one second event stored in the subset of internal data sources, wherein each second event includes raw machine data associated with a timestamp and reflects activity within an information technology infrastructure, and wherein at least one second partial search result of the partial search results is received from one or more external data sources apart from the data intake and query system, wherein the at least one second partial search result includes data in a second format that is different from the first format;
- transforming the data of the at least one second partial search result into the first format, using the first computing device, to produce a commonly formatted partial search result;
- processing the commonly formatted partial search result and the at least one first partial search result to generate finalized partial search results, wherein the commonly formatted partial search result is processed based on a determination that the commonly formatted partial search result was received from the one or more external data sources, and wherein the at least one first partial search result is processed based on a determination that the at least one first partial search result was received from the one or more internal data stores; and
- communicating the finalized partial search results to a second computing device, wherein the second computing device processes the finalized partial search results with other finalized partial search results to generate finalized search results.
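The flow recited in the first claim can be illustrated with a minimal sketch. It is not the patented implementation: the `Event` and `PartialResult` types, the `ts`/`message` field names of the external rows, and the choice of sorting as the origin-specific processing step are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Event:
    """Hypothetical common ("first") format: a timestamp plus raw machine data."""
    timestamp: float
    raw: str


@dataclass
class PartialResult:
    origin: str          # "internal" or "external" (assumed labels)
    payload: list[Any]   # Event objects (internal) or dict rows (external)


def to_common_format(rows):
    """Transform external rows (assumed dict schema) into the common Event format."""
    return [Event(timestamp=float(r["ts"]), raw=str(r["message"])) for r in rows]


def worker_process(partials):
    """First computing device: produce finalized partial results."""
    finalized = []
    for p in partials:
        if p.origin == "external":
            # Processing selected because the result came from an external source:
            # transform to the common format, then (as an assumed example) sort it.
            events = to_common_format(p.payload)
            events.sort(key=lambda e: e.timestamp)
        else:
            # Internal results are already in the common format.
            events = list(p.payload)
        finalized.extend(events)
    return finalized


def combine(finalized_partials):
    """Second computing device: merge finalized partials into final results."""
    merged = [e for part in finalized_partials for e in part]
    return sorted(merged, key=lambda e: e.timestamp)


internal = PartialResult("internal", [Event(2.0, "b")])
external = PartialResult("external", [{"ts": "1.5", "message": "a"},
                                      {"ts": 0.5, "message": "z"}])
worker_output = worker_process([internal, external])
final = combine([worker_output, [Event(1.0, "c")]])
print([e.raw for e in final])  # ['z', 'c', 'a', 'b']
```

The branch on `origin` stands in for the claimed "determination" of where each partial result was received from; a real system would likely carry richer provenance than a string label.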
1 Assignment
0 Petitions
Abstract
The disclosed embodiments include a technique to obtain search results by applying transformation operations to partial search results obtained from internal and/or external data sources. Examples of transformation operations include arithmetic operations such as an average, mean, count, or the like. Examples of reporting transformations include join operations, statistics, sort, top, and head. Hence, the search results of a search query can be derived from the partial search results rather than including the actual partial search results. In this case, the ordering of the search results may be nonessential. An example of a search query that requires a transformation operation is a “batch” or “reporting” search query. The related disclosed techniques involve obtaining data stored in the big data ecosystem and returning that data or data derived from that data.
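To make the idea of deriving results from partial results concrete, the following minimal sketch computes a count and a mean from per-source summaries without ever shipping the underlying events. The `PartialStats` type and the `summarize` and `finalize` helpers are hypothetical names introduced here, not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class PartialStats:
    """Mergeable summary of one partial search result (assumed structure)."""
    count: int = 0
    total: float = 0.0

    def add(self, value):
        self.count += 1
        self.total += value

    def merge(self, other):
        # Addition is commutative, so the order of partial results does not matter.
        return PartialStats(self.count + other.count, self.total + other.total)

    @property
    def mean(self):
        return self.total / self.count if self.count else None


def summarize(values):
    """Run near each data source: reduce raw values to a partial summary."""
    stats = PartialStats()
    for v in values:
        stats.add(v)
    return stats


def finalize(partials):
    """Run on the combining node: derive final results from the summaries only."""
    acc = PartialStats()
    for p in partials:
        acc = acc.merge(p)
    return acc


result = finalize([summarize([1, 2, 3]), summarize([10])])
print(result.count, result.mean)  # 4 4.0
```

Because `merge` is order-independent, this also illustrates why the ordering of results may be nonessential for a reporting search.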
119 Citations
30 Claims
1. A computer-implemented method for performing a batch search operation across one or more worker nodes, the method comprising:
- receiving, using a first computing device, partial search results from a plurality of data sources, wherein each of the partial search results satisfies a portion of a search query received by a data intake and query system, wherein at least one first partial search result of the partial search results is received from a subset of internal data sources of the data intake and query system, wherein the at least one first partial search result comprises one or more first events in a first format, each first event corresponding to at least one second event stored in the subset of internal data sources, wherein each second event includes raw machine data associated with a timestamp and reflects activity within an information technology infrastructure, and wherein at least one second partial search result of the partial search results is received from one or more external data sources apart from the data intake and query system, wherein the at least one second partial search result includes data in a second format that is different from the first format;
- transforming the data of the at least one second partial search result into the first format, using the first computing device, to produce a commonly formatted partial search result;
- processing the commonly formatted partial search result and the at least one first partial search result to generate finalized partial search results, wherein the commonly formatted partial search result is processed based on a determination that the commonly formatted partial search result was received from the one or more external data sources, and wherein the at least one first partial search result is processed based on a determination that the at least one first partial search result was received from the one or more internal data stores; and
- communicating the finalized partial search results to a second computing device, wherein the second computing device processes the finalized partial search results with other finalized partial search results to generate finalized search results.
Dependent Claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A non-transitory computer-readable medium including instructions that, when executed by a processor included in a first computing device, cause the processor to perform the steps of:
- receiving partial search results from a plurality of data sources, wherein each of the partial search results satisfies a portion of a search query received by a data intake and query system, wherein at least one first partial search result of the partial search results is received from a subset of internal data sources of the data intake and query system, wherein the at least one first partial search result comprises one or more first events in a first format, each first event corresponding to at least one second event stored in the subset of internal data sources, wherein each second event includes raw machine data associated with a timestamp and reflects activity within an information technology infrastructure, and wherein at least one second partial search result of the partial search results is received from one or more external data sources apart from the data intake and query system, wherein the at least one second partial search result includes data in a second format that is different from the first format;
- transforming the data of the at least one second partial search result into the first format to produce a commonly formatted partial search result;
- processing the commonly formatted partial search result and the at least one first partial search result to generate finalized partial search results, wherein the commonly formatted partial search result is processed based on a determination that the commonly formatted partial search result was received from the one or more external data sources, and wherein the at least one first partial search result is processed based on a determination that the at least one first partial search result was received from the one or more internal data stores; and
- communicating the finalized partial search results to a second computing device, wherein the second computing device processes the finalized partial search results with other finalized partial search results to generate finalized search results.
Dependent Claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
21. A system, comprising:
- one or more worker nodes, wherein each worker node includes a processor and a memory that stores respective instructions, and, when each processor executes the respective instructions, wherein at least one worker node is configured to;
- receive partial search results from a plurality of data sources, wherein each of the partial search results satisfies a portion of a search query received by a data intake and query system, wherein at least one first partial search result of the partial search results is received from a subset of internal data sources of the data intake and query system, wherein the at least one first partial search result comprises one or more first events in a first format, each first event corresponding to at least one second event stored in the subset of internal data sources, wherein each second event includes raw machine data associated with a timestamp and reflects activity within an information technology infrastructure, and wherein at least one second partial search result of the partial search results is received from one or more external data sources apart from the data intake and query system, wherein the at least one second partial search result includes data in a second format that is different from the first format;
- transform the data of the at least one second partial search result into the first format, to produce commonly formatted partial search results;
- processing the commonly formatted partial search result and the at least one first partial search result to generate finalized partial search results, wherein the commonly formatted partial search result is processed based on a determination that the commonly formatted partial search result was received from the one or more external data sources, and wherein the at least one first partial search result is processed based on a determination that the at least one first partial search result was received from the one or more internal data stores; and
- communicating the finalized partial search results to a second computing device, wherein the second computing device processes the finalized partial search results with other finalized partial search results to generate finalized search results.
Dependent Claims: 22, 23, 24, 25, 26, 27, 28, 29, 30
Specification