Systems and methods for highly scalable system log analysis, deduplication and management
First Claim
1. Non-transitory computer-readable storage media encoded with a computer program including instructions executable by a processor to create an application for system log analysis and deduplication, the application comprising:
(a) a user interface allowing a user to configure one or more parser modules or a plurality of storage modules, wherein the plurality of storage modules are distributed to two or more servers;
(b) the one or more parser modules configured to:
(1) receive raw system log data automatically generated by a system, wherein the raw system log data comprises system events logged in a plain text format;
(2) parse and transform the raw system log data into structured system log data by a hash function, wherein the structured system log data comprises the system events organized in a defined data format; and
(3) transmit the structured system log data to the plurality of storage modules, wherein data entries of the structured system log data associated with a same system event and happening within a first defined time window are transmitted to a same storage module; and
(c) the plurality of storage modules without a database service, the storage modules configured to:
(1) receive the structured system log data from the one or more parser modules;
(2) identify duplicated data entries in the structured system log data, wherein the duplicated data entries are happening within a second defined time window and comprise distinct timestamps and identical system events;
(3) generate a representative system event of the duplicated data entries;
(4) generate and store a serialized data record in a binary format, the serialized data record comprising the distinct timestamps of the duplicated data entries and the representative system event of the duplicated data entries.
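The parser and storage steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The log line layout, the window lengths, the choice of SHA-256, and the names `parse_line`, `pick_storage_module`, and `deduplicate` are all assumptions made for illustration; in particular, the claim does not say how the hash function is applied, and here it is used only to choose a storage module.

```python
import hashlib
import re
from collections import defaultdict

# Hypothetical plain-text log layout: "<epoch seconds> <host> <message>".
LINE_RE = re.compile(r"^(?P<ts>\d+) (?P<host>\S+) (?P<msg>.*)$")

ROUTING_WINDOW_S = 60  # stands in for the "first defined time window" (assumed value)
DEDUP_WINDOW_S = 60    # stands in for the "second defined time window" (assumed value)


def parse_line(line):
    """Turn one raw log line into a structured entry (a dict)."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unparseable log line: {line!r}")
    return {"ts": int(m["ts"]), "host": m["host"], "event": m["msg"]}


def pick_storage_module(entry, num_modules):
    """Hash (event identity, time bucket) so matching entries share a module."""
    bucket = entry["ts"] // ROUTING_WINDOW_S
    key = f"{entry['host']}|{entry['event']}|{bucket}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_modules


def deduplicate(entries):
    """Group identical events per window; keep one event plus its timestamps."""
    groups = defaultdict(list)
    for e in entries:
        bucket = e["ts"] // DEDUP_WINDOW_S
        groups[(e["host"], e["event"], bucket)].append(e["ts"])
    return [
        {"host": host, "event": event, "timestamps": sorted(set(ts_list))}
        for (host, event, _bucket), ts_list in groups.items()
    ]
```

Fixed-size buckets keep the sketch short, but two identical events that straddle a bucket boundary land in different groups; a sliding window would avoid that at the cost of more bookkeeping.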
Abstract
Systems and methods for parsing raw log data into structured log data, removing duplicate entries, storing the deduplicated log data in a binary format, and managing system events. The subject matter can increase the speed of log data analysis and storage, reduce the storage required for log data, and make system events easier to manage.
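As a rough illustration of storing deduplicated log data in a binary format, the following Python sketch packs one representative event together with the timestamps of its duplicates. The record layout and the names `serialize_record` and `deserialize_record` are hypothetical; the source does not specify a serialization scheme.

```python
import struct

# Assumed record layout (not specified in the source), all big-endian:
#   uint16 event length | event bytes (UTF-8) | uint32 count | uint64 timestamps


def serialize_record(event, timestamps):
    """Pack one representative event and its timestamps into bytes."""
    payload = event.encode("utf-8")
    header = struct.pack(">H", len(payload))  # raises struct.error above 65535 bytes
    body = struct.pack(f">I{len(timestamps)}Q", len(timestamps), *timestamps)
    return header + payload + body


def deserialize_record(blob):
    """Inverse of serialize_record: return (event, timestamps)."""
    (event_len,) = struct.unpack_from(">H", blob, 0)
    event = blob[2:2 + event_len].decode("utf-8")
    offset = 2 + event_len
    (count,) = struct.unpack_from(">I", blob, offset)
    timestamps = list(struct.unpack_from(f">{count}Q", blob, offset + 4))
    return event, timestamps
```

In this layout the event text is stored once per record, and each additional duplicate adds only the 8 bytes of its timestamp.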
5 Claims
Specification