COMPUTER SYSTEM PROGRAMMED TO IDENTIFY COMMON SUBSEQUENCES IN LOGS
Abstract
A data processing method includes receiving a stream of digital data comprising a plurality of objects and, in response to receiving an object, tokenizing the object to create a tokenized object and storing the tokenized object in a token database. The method further includes comparing the tokenized object to a plurality of other tokenized objects stored in the token database, computing a pattern associated with the tokenized object, storing the pattern in a pattern database, and managing a size of the pattern database by: identifying a subset of patterns that are eligible for deletion from the pattern database based on an age of each pattern; ranking each pattern of the subset based on a quality metric and a popularity metric; identifying, based on the ranking and from the subset, a second pattern; and deleting the second pattern from the pattern database to produce an updated database.
19 Citations
23 Claims
1. A method comprising:
using a computer, receiving a stream of digital data comprising a plurality of objects;
using programmed tokenizer instructions executed using the computer, in response to receiving a first object of the plurality of objects, tokenizing the first object to create a first tokenized object and electronically digitally storing the first tokenized object in a token database that comprises a plurality of other tokenized objects and using an electronic digital storage device;
using the computer, comparing the first tokenized object to the plurality of other tokenized objects stored in the token database, computing a first pattern associated with the first tokenized object, and storing the first pattern in a pattern database that comprises a plurality of patterns;
using the computer, managing a size of the pattern database by:
identifying, from the plurality of patterns, a subset of patterns that are eligible for deletion from the pattern database based on an age of each pattern and storing in computer memory data identifying the subset of patterns;
ranking each pattern of the subset based on a quality metric and a popularity metric, by marking the data identifying the subset of patterns with rank values;
identifying, based on the ranking and from the subset, a second pattern and deleting the second pattern from the pattern database to produce an updated database;
repeating the tokenizing, comparing and storing using the updated database;
wherein the method is executed using one or more computing devices.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
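The tokenizing, comparing and storing steps of claim 1 can be illustrated with a minimal sketch. The claim does not say how objects are tokenized or how a pattern is computed, so everything here is an assumption: objects are treated as log lines split on whitespace, the pattern is the longest common subsequence of two token lists (suggested only by the title), and the names `tokenize`, `common_pattern`, `process` and `WILDCARD` are hypothetical.

```python
from typing import List, Optional

WILDCARD = "*"  # hypothetical placeholder for token runs that differ


def tokenize(obj: str) -> List[str]:
    """Split one object (assumed here to be a log line) on whitespace."""
    return obj.split()


def common_pattern(a: List[str], b: List[str]) -> List[str]:
    """Longest common subsequence of two token lists, with WILDCARD in the gaps."""
    n, m = len(a), len(b)
    # lcs[i][j] = length of the longest common subsequence of a[i:] and b[j:]
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])
    pattern: List[str] = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            pattern.append(a[i])
            i += 1
            j += 1
        else:
            # Collapse any run of differing tokens into a single wildcard.
            if not pattern or pattern[-1] != WILDCARD:
                pattern.append(WILDCARD)
            if lcs[i + 1][j] >= lcs[i][j + 1]:
                i += 1
            else:
                j += 1
    # Leftover tokens on either side also become one trailing wildcard.
    if (i < n or j < m) and (not pattern or pattern[-1] != WILDCARD):
        pattern.append(WILDCARD)
    return pattern


def literal_count(pattern: List[str]) -> int:
    """Number of non-wildcard tokens in a pattern."""
    return sum(1 for t in pattern if t != WILDCARD)


def process(obj: str, token_db: List[List[str]],
            pattern_db: List[List[str]]) -> Optional[List[str]]:
    """Tokenize obj, compare it with stored objects, store the best pattern."""
    tokens = tokenize(obj)
    best: Optional[List[str]] = None
    # Compare before storing so the object is not matched against itself.
    for other in token_db:
        candidate = common_pattern(tokens, other)
        if literal_count(candidate) > (literal_count(best) if best else 0):
            best = candidate
    token_db.append(tokens)
    if best is not None and best not in pattern_db:
        pattern_db.append(best)
    return best


token_db: List[List[str]] = []
pattern_db: List[List[str]] = []
first = process("user alice logged in from 10.0.0.1", token_db, pattern_db)
second = process("user bob logged in from 10.0.0.2", token_db, pattern_db)
```

With these two inputs the first call finds nothing to compare against, and the second yields the token pattern `user * logged in from *`. Plain Python lists stand in for the token database and the pattern database.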

10. A method comprising:
using a computer, managing a size of a pattern database that stores a plurality of patterns by:
identifying, from the plurality of patterns, a subset of patterns that are eligible for deletion from the pattern database based on an age of each pattern and storing in computer memory data identifying the subset of patterns;
ranking each pattern of the subset based on a quality metric and a popularity metric, by marking the data identifying the subset of patterns with rank values;
identifying, based on the ranking and from the subset, a pattern for deletion;
deleting the pattern from the pattern database to produce an updated database;
wherein the method is executed using one or more computing devices.
Dependent claims: 11, 12, 13, 14
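The size-management steps of claim 10 can be sketched as follows. The claim does not define the age rule, the two metrics, or how they are combined, so this sketch assumes that patterns older than a minimum age are eligible, that quality and popularity are plain numbers where higher is better, and that they are combined by multiplication. `Pattern`, `prune_one` and `min_age` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Pattern:
    pattern_id: str
    created_at: float  # creation time in seconds
    quality: float     # assumed quality metric, higher is better
    popularity: float  # assumed popularity metric, e.g. a hit count


def prune_one(db: Dict[str, Pattern], now: float, min_age: float) -> Optional[str]:
    """Delete the lowest-ranked eligible pattern and return its id, if any."""
    # Step 1: identify the subset eligible for deletion, based on age.
    eligible = [p.pattern_id for p in db.values() if now - p.created_at >= min_age]
    if not eligible:
        return None
    # Step 2: mark the ids of the subset with rank values (0 = worst).
    # The combined score is an assumption; ties are broken by id.
    ordered = sorted(
        eligible,
        key=lambda pid: (db[pid].quality * db[pid].popularity, pid),
    )
    ranks = {pid: rank for rank, pid in enumerate(ordered)}
    # Step 3: pick the pattern with the worst rank and delete it.
    victim = min(ranks, key=lambda pid: ranks[pid])
    del db[victim]
    return victim


db = {
    "a": Pattern("a", created_at=0.0, quality=0.9, popularity=100.0),
    "b": Pattern("b", created_at=10.0, quality=0.2, popularity=3.0),
    "c": Pattern("c", created_at=95.0, quality=0.1, popularity=0.0),
}
deleted_first = prune_one(db, now=100.0, min_age=50.0)
deleted_second = prune_one(db, now=100.0, min_age=50.0)
deleted_third = prune_one(db, now=100.0, min_age=50.0)
```

Pattern `c` has the lowest score but survives every call because it is too young to be eligible, which is the effect of filtering on age before ranking.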

15. A method comprising:
using a computer, receiving a stream of digital data comprising a plurality of objects;
using programmed tokenizer instructions executed using the computer, in response to receiving a first object of the plurality of objects, tokenizing the first object to create a first tokenized object and electronically digitally storing the first tokenized object in a token database that comprises a plurality of other tokenized objects and using an electronic digital storage device;
using the computer, comparing the first tokenized object to the plurality of other tokenized objects stored in the token database, computing a first pattern associated with the first tokenized object, and storing the first pattern in a pattern database that comprises a plurality of patterns, wherein the plurality of patterns comprises a set of hierarchical patterns;
using the computer, receiving an indication from an application that the set of hierarchical patterns matched an input, wherein a pattern from the set of hierarchical patterns with a largest hit count is selected as a hit;
increasing, in the pattern database, in response to the indication, a hit count associated with each pattern of the set of hierarchical patterns;
when deleting a pattern of the set of hierarchical patterns from the pattern database:
deleting a more specific pattern of the set of hierarchical patterns when the more specific pattern comprises a first hit count below a first threshold relative to a second hit count of a more general pattern of the set of hierarchical patterns and the second hit count is below a second threshold relative to a total sum of all hit counts in the pattern database;
deleting the more general pattern if the more specific pattern is not deleted;
wherein the method is executed using one or more computing devices.
Dependent claims: 16, 17, 18, 19, 20
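The hit-count bookkeeping in claim 15 can be sketched briefly. This is an illustration only: it assumes hit counts live in a dictionary keyed by pattern text and that the application reports the matching hierarchical set as a list. `record_match` is a hypothetical name.

```python
from typing import Dict, List


def record_match(hit_counts: Dict[str, int], matched: List[str]) -> str:
    """Select the hit among the matched set, then credit every matched pattern.

    `matched` must be non-empty; it holds the hierarchical patterns that the
    application reported as matching one input.
    """
    if not matched:
        raise ValueError("matched must contain at least one pattern")
    # The pattern with the largest current hit count is selected as the hit.
    selected = max(matched, key=lambda p: hit_counts.get(p, 0))
    # Every pattern of the set has its hit count increased, not only the hit.
    for p in matched:
        hit_counts[p] = hit_counts.get(p, 0) + 1
    return selected


hit_counts = {"user * logged in": 5, "user * logged in from *": 2}
selected = record_match(hit_counts, ["user * logged in from *", "user * logged in"])
```

Selecting the hit before incrementing means the selection reflects the counts as they stood when the indication arrived. The claim does not state an order, so that ordering is a choice made for this sketch.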

21. A method comprising:
using a computer, receiving an indication from an application that the set of hierarchical patterns matched an input, wherein a pattern from the set of hierarchical patterns with a largest hit count is selected as a hit;
increasing, in the pattern database, in response to the indication, a hit count associated with each pattern of the set of hierarchical patterns;
when deleting a pattern of the set of hierarchical patterns from the pattern database:
deleting a more specific pattern of the set of hierarchical patterns when the more specific pattern comprises a first hit count below a first threshold relative to a second hit count of a more general pattern of the set of hierarchical patterns and the second hit count is below a second threshold relative to a total sum of all hit counts in the pattern database;
deleting the more general pattern if the more specific pattern is not deleted;
wherein the method is executed using one or more computing devices.
Dependent claims: 22, 23
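The deletion rule for a specific/general pair of hierarchical patterns can be written as a small decision function. The claim says only that each hit count is "below a threshold relative to" another quantity; this sketch assumes that means a ratio test, and the threshold values in the example are invented. `choose_deletion` is a hypothetical name.

```python
def choose_deletion(specific_hits: int, general_hits: int, total_hits: int,
                    first_threshold: float, second_threshold: float) -> str:
    """Return "specific" or "general" to say which pattern of the pair to delete."""
    # First test: the specific pattern's hits are a small fraction of the
    # general pattern's hits. Multiplying avoids dividing by a zero count.
    specific_is_rare = specific_hits < first_threshold * general_hits
    # Second test: the general pattern's hits are a small fraction of all hits.
    general_is_rare = general_hits < second_threshold * total_hits
    # The specific pattern is deleted only when both tests hold;
    # otherwise the more general pattern is deleted instead.
    if specific_is_rare and general_is_rare:
        return "specific"
    return "general"


both_rare = choose_deletion(1, 50, 1000, first_threshold=0.1, second_threshold=0.2)
specific_busy = choose_deletion(40, 50, 1000, first_threshold=0.1, second_threshold=0.2)
general_busy = choose_deletion(1, 500, 1000, first_threshold=0.1, second_threshold=0.2)
```

Under this reading, the first call deletes the specific pattern because both counts are small, while the other two delete the general pattern because one of the two tests fails.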
Specification