REAL-TIME CATEGORIZATION OF LOG EVENTS

  • US 20160196174A1
  • Filed: 03/17/2015
  • Published: 07/07/2016
  • Est. Priority Date: 01/02/2015
  • Status: Active Grant
First Claim

1. A method for categorizing a real-time log event, the method comprising:

    computing a Term Frequency-Inverse Document Frequency (TF-IDF) vector for the real-time log event based on a pre-calculated TF-IDF matrix of a log corpus and a number of new words in the real-time log event, wherein the log corpus comprises one or more pre-existing log events, and wherein the real-time log event is indicative of an error message;

    calculating a distance between the TF-IDF vector and a cluster centroid of each cluster in the log corpus;

    identifying, from amongst the clusters, a cluster having a closest cluster centroid based on the distance between the TF-IDF vector and the cluster centroid of each of the clusters, wherein the closest cluster centroid is a cluster centroid closest to the TF-IDF vector; and

    categorizing the real-time log event into one or more log categories based on a comparison of the distance between the TF-IDF vector and the closest cluster centroid with a pre-determined silhouette threshold corresponding to the cluster with the closest cluster centroid.
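
The four steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not specify the tokenizer, the distance metric, how the number of new words enters the vector, or what happens when the threshold is exceeded. The sketch assumes whitespace tokenization, Euclidean distance, a single extra trailing dimension that pools all words unseen in the corpus, and an "uncategorized" fallback label. The names `tfidf_vector`, `categorize`, `new_word_weight`, and the sample data are hypothetical.

```python
import math
from collections import Counter


def tfidf_vector(event, vocabulary, idf, new_word_weight=1.0):
    """Build a TF-IDF vector for one log event.

    vocabulary: ordered list of corpus terms (columns of the pre-calculated matrix).
    idf: dict mapping each corpus term to its pre-calculated IDF weight.
    Assumption: words not in the corpus are pooled into one extra
    trailing dimension scaled by new_word_weight.
    """
    tokens = event.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on an empty event
    vec = [counts[term] / total * idf[term] for term in vocabulary]
    new_words = sum(c for term, c in counts.items() if term not in idf)
    vec.append(new_words / total * new_word_weight)
    return vec


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def categorize(event, vocabulary, idf, centroids, thresholds):
    """Return the category of the closest cluster, or "uncategorized".

    centroids: dict mapping cluster name to its centroid over vocabulary.
    thresholds: dict mapping cluster name to its pre-determined threshold.
    Assumption: an event farther from the closest centroid than that
    cluster's threshold is reported as "uncategorized".
    """
    vec = tfidf_vector(event, vocabulary, idf)
    # Centroids come from the corpus, so their new-word dimension is 0.
    distances = {
        name: euclidean(vec, centroid + [0.0])
        for name, centroid in centroids.items()
    }
    closest = min(distances, key=distances.get)
    if distances[closest] <= thresholds[closest]:
        return closest
    return "uncategorized"


# Hypothetical corpus statistics, for illustration only.
vocabulary = ["disk", "full", "timeout", "connection"]
idf = {term: 1.0 for term in vocabulary}
centroids = {
    "storage": [0.5, 0.5, 0.0, 0.0],
    "network": [0.0, 0.0, 0.5, 0.5],
}
thresholds = {"storage": 0.3, "network": 0.3}

print(categorize("disk full", vocabulary, idf, centroids, thresholds))
print(categorize("kernel panic detected", vocabulary, idf, centroids, thresholds))
```

Keeping a separate threshold per cluster, as the claim recites, lets tight clusters reject outliers that a looser cluster would still accept. An event made up mostly of new words lands far from every centroid in this sketch and therefore falls through to the fallback label.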
