Universal link to extract and classify log data
First Claim
1. A system, comprising:
a memory configured to store arbitrary log data; and
a processor coupled to the memory and configured to:
identify in said arbitrary log data a set of candidate data values that match a top level pattern that is common to two or more types of data value of interest;
process said candidate data values through a plurality of successive filtering stages, each stage of which includes determining which, if any, of said candidates match a more specific pattern associated more specifically with a specific one of said types of data value of interest;
classifying said candidates, if any, that match the more specific pattern as being of said corresponding specific one of said types of data value of interest; and
removing from the set of candidate data values any candidate data values so identified and classified; and
generate and store a structured data record that associates each candidate data value determined to be of a corresponding one of said types of data value of interest with said corresponding one of said types of data value of interest;
wherein the processor is further configured to apply one or more heuristics to more specifically classify and label one or more values determined to match a pattern associated with a specific one of said types of data value of interest; and
wherein said heuristics include heuristics based on one or more of presence in the arbitrary log data of a characteristic string;
placement within the log data of such a string relative to a given candidate data value;
location of a given candidate data value within the arbitrary log data; and
location within the arbitrary log data of a given candidate data value relative to one or more other candidate data values of the same type.
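The heuristics recited in the claim can be illustrated with a minimal Python sketch. It is not taken from the patent. The `src=`/`dst=` strings, the label names, and the rule that the first unlabeled address is the source are all assumptions made for illustration. The sketch labels IPv4-shaped values using (a) a characteristic string placed directly before the value and (b) the position of the value relative to other values of the same type.

```python
import re

# Simplified IPv4 shape; assumed for illustration, not the patent's pattern.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

# Hypothetical characteristic strings and the labels they imply.
PREFIX_LABELS = {"src=": "source_ip", "dst=": "destination_ip"}
# Hypothetical fallback: label by order among the IPs found in the line.
ORDER_LABELS = ["source_ip", "destination_ip"]


def label_ips(line):
    """Return (value, label) pairs for each IPv4-shaped value in the line."""
    labeled = []
    for index, match in enumerate(IPV4.finditer(line)):
        label = None
        # Heuristic 1: a characteristic string immediately before the value.
        preceding = line[:match.start()]
        for prefix, name in PREFIX_LABELS.items():
            if preceding.endswith(prefix):
                label = name
        # Heuristic 2: position relative to other values of the same type.
        if label is None and index < len(ORDER_LABELS):
            label = ORDER_LABELS[index]
        labeled.append((match.group(0), label or "ip"))
    return labeled
```

Under these assumptions, `label_ips("dst=10.0.0.2 src=10.0.0.1")` labels the first address as the destination because of its prefix, while a line with no prefixes falls back to ordering.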
Abstract
A universal link to extract and classify log data is disclosed. In various embodiments, a set of candidate data values that match a top level pattern that is common to two or more types of data value of interest is identified. The candidate data values are processed through a plurality of successive filtering stages, each stage of which includes determining which, if any, of said candidates match a more specific pattern associated more specifically with a specific data value type. Candidates, if any, which match the more specific pattern are classified as being of a corresponding specific data type and are removed from the set of candidate data values. A structured data record that associates each candidate data value determined to be of a corresponding one of said types of data value of interest with said corresponding one of said types of data value of interest is generated and stored.
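As a rough illustration of the staged filtering described in the abstract, the following Python sketch uses a single top-level pattern (dotted runs of digits) that is common to several value types, then passes the candidates through successively more specific stages, removing each candidate once it is classified. The pattern, the three types (`ipv4`, `version`, `decimal`), and all function names are assumptions chosen for the example, not details from the patent.

```python
import re

# Assumed top-level pattern: digits separated by dots. This shape is common
# to IPv4 addresses, dotted version numbers, and decimal numbers.
TOP_LEVEL = re.compile(r"\b\d+(?:\.\d+)+\b")


def is_ipv4(value):
    parts = value.split(".")
    return len(parts) == 4 and all(int(p) <= 255 for p in parts)


def is_version(value):
    return len(value.split(".")) == 3


def is_decimal(value):
    return len(value.split(".")) == 2


# Successive filtering stages, each tied to one specific type.
STAGES = [("ipv4", is_ipv4), ("version", is_version), ("decimal", is_decimal)]


def extract(log_text):
    """Return (records, unclassified) for the given log text."""
    candidates = [m.group(0) for m in TOP_LEVEL.finditer(log_text)]
    records = []
    for type_name, matches in STAGES:
        remaining = []
        for value in candidates:
            if matches(value):
                # Classify the candidate and drop it from the candidate set.
                records.append({"value": value, "type": type_name})
            else:
                remaining.append(value)
        candidates = remaining
    return records, candidates
```

For the hypothetical line `host=10.0.0.7 agent=2.4.1 ratio=0.75 odd=1.2.3.4.5`, the first three values are classified as `ipv4`, `version`, and `decimal` respectively, and `1.2.3.4.5` is left in the unclassified remainder because no stage claims it.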
20 Claims
1. A system, comprising:
a memory configured to store arbitrary log data; and
a processor coupled to the memory and configured to:
identify in said arbitrary log data a set of candidate data values that match a top level pattern that is common to two or more types of data value of interest;
process said candidate data values through a plurality of successive filtering stages, each stage of which includes determining which, if any, of said candidates match a more specific pattern associated more specifically with a specific one of said types of data value of interest;
classifying said candidates, if any, that match the more specific pattern as being of said corresponding specific one of said types of data value of interest; and
removing from the set of candidate data values any candidate data values so identified and classified; and
generate and store a structured data record that associates each candidate data value determined to be of a corresponding one of said types of data value of interest with said corresponding one of said types of data value of interest;
wherein the processor is further configured to apply one or more heuristics to more specifically classify and label one or more values determined to match a pattern associated with a specific one of said types of data value of interest; and
wherein said heuristics include heuristics based on one or more of presence in the arbitrary log data of a characteristic string;
placement within the log data of such a string relative to a given candidate data value;
location of a given candidate data value within the arbitrary log data; and
location within the arbitrary log data of a given candidate data value relative to one or more other candidate data values of the same type.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A method, comprising:
using a processor to identify in an arbitrary log data a set of candidate data values that match a top level pattern that is common to two or more types of data value of interest;
using the processor to process said candidate data values through a plurality of successive filtering stages, each stage of which includes determining which, if any, of said candidates match a more specific pattern associated more specifically with a specific one of said types of data value of interest;
classifying said candidates, if any, that match the more specific pattern as being of said corresponding specific one of said types of data value of interest; and
removing from the set of candidate data values any candidate data values so identified and classified; and
using the processor to generate and store a structured data record that associates each candidate data value determined to be of a corresponding one of said types of data value of interest with said corresponding one of said types of data value of interest;
wherein the processor is further used to apply one or more heuristics to more specifically classify and label one or more values determined to match a pattern associated with a specific one of said types of data value of interest; and
wherein said heuristics include heuristics based on one or more of presence in the arbitrary log data of a characteristic string;
placement within the log data of such a string relative to a given candidate data value;
location of a given candidate data value within the arbitrary log data; and
location within the arbitrary log data of a given candidate data value relative to one or more other candidate data values of the same type.
Dependent claims: 16, 17
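The step of generating and storing a structured data record could be sketched as follows. The claim does not specify a record format, so the layout here (one JSON object per log entry, keyed by type name) and the helper names `build_record` and `store_record` are assumptions for illustration only.

```python
import io
import json
from collections import defaultdict


def build_record(classified):
    """Group (value, type_name) pairs into a dict keyed by type name."""
    record = defaultdict(list)
    for value, type_name in classified:
        record[type_name].append(value)
    return dict(record)


def store_record(record, stream):
    """Append the record to a text stream as one line of JSON."""
    stream.write(json.dumps(record, sort_keys=True) + "\n")


# Usage with an in-memory stream standing in for a file or database.
sink = io.StringIO()
example = build_record([("10.0.0.7", "ipv4"), ("2.4.1", "version")])
store_record(example, sink)
```

Writing one JSON object per line keeps each record independently parseable, which is one plausible choice when records are appended as log data arrives.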
18. A computer program product embodied in a non-transitory computer readable medium and comprising computer instructions for:
identifying in an arbitrary log data a set of candidate data values that match a top level pattern that is common to two or more types of data value of interest;
processing said candidate data values through a plurality of successive filtering stages, each stage of which includes determining which, if any, of said candidates match a more specific pattern associated more specifically with a specific one of said types of data value of interest;
classifying said candidates, if any, that match the more specific pattern as being of said corresponding specific one of said types of data value of interest; and
removing from the set of candidate data values any candidate data values so identified and classified;
generating and storing a structured data record that associates each candidate data value determined to be of a corresponding one of said types of data value of interest with said corresponding one of said types of data value of interest; and
applying one or more heuristics to more specifically classify and label one or more values determined to match a pattern associated with a specific one of said types of data value of interest;
wherein said heuristics include heuristics based on one or more of presence in the arbitrary log data of a characteristic string;
placement within the log data of such a string relative to a given candidate data value;
location of a given candidate data value within the arbitrary log data; and
location within the arbitrary log data of a given candidate data value relative to one or more other candidate data values of the same type.
Dependent claims: 19, 20
Specification