SCALABLE AND ACCURATE MINING OF CONTROL FLOW FROM EXECUTION LOGS ACROSS DISTRIBUTED SYSTEMS
Abstract
Methods and arrangements for efficiently mining a control flow graph from execution logs of a distributed system. Using at least one text clustering technique, at least two text clusters are generated from a plurality of execution logs. At least one approximate template is generated based on the at least two text clusters. At least one refined template is created by refining the at least one approximate template using multimodal sequencing. The control flow graph is created based on the at least one refined template. At least one anomaly is detected in the control flow graph.
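To make the first two steps of the abstract concrete, the following Python sketch groups log lines into text clusters and derives one approximate template per cluster. It is an illustration under stated assumptions, not the patented technique: the digit-based masking rule, the clustering key (token count plus first token), the `<*>` wildcard, the function names, and the sample log lines are all hypothetical.

```python
import re
from collections import defaultdict

def mask(line):
    """Replace tokens that contain a digit with a wildcard (assumed variable detection)."""
    return tuple("<*>" if re.search(r"\d", tok) else tok for tok in line.split())

def approximate_templates(lines):
    """Cluster lines by (token count, first token), then merge each cluster into one template."""
    clusters = defaultdict(list)
    for line in lines:
        toks = mask(line)
        if toks:
            clusters[(len(toks), toks[0])].append(toks)
    templates = {}
    for key, members in clusters.items():
        # Keep a token only where every member of the cluster agrees on it.
        merged = [col[0] if len(set(col)) == 1 else "<*>" for col in zip(*members)]
        templates[key] = " ".join(merged)
    return templates

# Hypothetical log lines, invented for this example.
logs = [
    "Connected to node 17 in 32 ms",
    "Connected to node 4 in 110 ms",
    "Job alpha started on worker 3",
    "Job beta started on worker 9",
]
templates = approximate_templates(logs)
```

Here the two "Connected" lines collapse into one template and the two "Job" lines into another, with the job name generalized because the cluster members disagree at that position. Refining such templates with temporal information is a separate step that this sketch does not attempt.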
19 Citations
20 Claims
1. A method of efficiently mining a control flow graph from execution logs of a distributed system, said method comprising:
utilizing at least one processor to execute computer code that performs the steps of:
receiving a plurality of execution logs;
generating, using at least one text clustering technique, at least two text clusters, from the plurality of execution logs;
generating at least one approximate template based on the at least two text clusters;
creating at least one refined template via refining the at least one approximate template using multimodal sequencing;
creating the control flow graph, based on the at least one refined template; and
detecting at least one anomaly in the control flow graph.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
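The last two steps of claim 1 (creating the control flow graph and detecting an anomaly) can be sketched as follows. The claim does not define what counts as an anomaly, so this example assumes one possible reading: a transition between templates that the mined graph does not contain. The template IDs `T1` to `T4`, the function names, and the sample sequences are hypothetical.

```python
from collections import defaultdict

def build_cfg(sequences):
    """Map each template ID to the set of template IDs seen immediately after it."""
    cfg = defaultdict(set)
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            cfg[cur].add(nxt)
    return cfg

def find_anomalies(cfg, sequence):
    """Return the transitions in `sequence` that the mined graph does not contain."""
    return [
        (cur, nxt)
        for cur, nxt in zip(sequence, sequence[1:])
        if nxt not in cfg.get(cur, set())
    ]

# Hypothetical sequences of template IDs from healthy runs.
healthy = [["T1", "T2", "T3"], ["T1", "T2", "T4"]]
cfg = build_cfg(healthy)

# A new run that skips T2 yields two transitions missing from the graph.
anomalies = find_anomalies(cfg, ["T1", "T3", "T4"])
```

This sketch assumes the log is already split into per-run sequences. On an interleaved stream from many concurrent components, plain adjacency would add spurious edges, which is the problem the two-stage graph mining of claim 20 addresses.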
10. An apparatus for efficiently mining a control flow graph from execution logs of a distributed system, said apparatus comprising:
at least one processor; and
a computer readable storage medium having computer readable program code embodied therewith and executable by the at least one processor, the computer readable program code comprising:
computer readable program code that receives a plurality of execution logs;
computer readable program code that generates, using at least one text clustering technique, at least two text clusters, from the plurality of execution logs;
computer readable program code that generates at least one approximate template based on the at least two text clusters;
computer readable program code that creates at least one refined template via refining the at least one approximate template using multimodal sequencing;
computer readable program code that creates the control flow graph, based on the at least one refined template; and
computer readable program code that detects at least one anomaly in the control flow graph.
11. A computer program product to efficiently mine a control flow graph from execution logs of a distributed system, said computer program product comprising:
at least one processor; and
a computer readable storage medium having computer readable program code embodied therewith and executable by the at least one processor, the computer readable program code comprising:
computer readable program code that receives a plurality of execution logs;
computer readable program code that generates, using at least one text clustering technique, at least two text clusters, from the plurality of execution logs;
computer readable program code that generates at least one approximate template based on the at least two text clusters;
computer readable program code that creates at least one refined template via refining the at least one approximate template using multimodal sequencing;
computer readable program code that creates the control flow graph, based on the at least one refined template; and
computer readable program code that detects at least one anomaly in the control flow graph.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19.
20. A method of efficiently mining a control flow graph from execution logs of a distributed system, said method comprising:
utilizing at least one processor to execute computer code that performs the steps of:
receiving a plurality of execution logs;
mining at least one template from the plurality of execution logs in the first-phase;
said mining comprising creating at least one template, via employing a two-stage template mining technique;
said first-stage creating approximate-templates via a dictionary based logline transformation in order to attain scalability and said second-stage refining the mined approximate-templates by leveraging the multimodal (text+temporal-vicinity) signature of each approximate-template; and
generating the control-flow graph between the mined templates in the second-phase via a two-stage technique;
said first-stage creating for each template, the set of its temporally co-occurring templates, referred to as its Nearest-Neighbor-Group, by leveraging the time-series of occurrence of each template; and
said second-stage, in a single-pass of the logstream, determining for each template, its immediate predecessors/successors by tracking predecessors/successors on the projected logstream on the Nearest-Neighbor group of the template, and stitching the mined successors of each template to construct the desired control flow graph.
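The second phase of claim 20 can be illustrated with a small Python sketch. It is one possible reading of the claim language, not the patented implementation. The co-occurrence rule (a fixed time `window` and a `min_ratio` threshold), the gap limit on successors, the function names, and the sample event stream are all assumptions made for this example. Stage one builds a Nearest-Neighbor-Group for each template from the timestamps of its occurrences. Stage two walks the time-ordered stream once and, for each template, tracks only the events that belong to its projected stream (the template itself plus its group).

```python
from collections import defaultdict

def nearest_neighbor_groups(events, window, min_ratio):
    """For each template, find the templates seen within `window` of most of its occurrences."""
    times = defaultdict(list)
    for ts, tpl in events:
        times[tpl].append(ts)
    groups = {}
    for tpl, own in times.items():
        groups[tpl] = set()
        for other, theirs in times.items():
            if other == tpl:
                continue
            # Count occurrences of `tpl` that have `other` nearby in time.
            near = sum(any(abs(a - b) <= window for b in theirs) for a in own)
            if near / len(own) >= min_ratio:
                groups[tpl].add(other)
    return groups

def mine_successors(events, groups, window):
    """Single pass over time-ordered events: track the last event of each projected stream."""
    last = {}  # owner template -> (template, timestamp) last seen in its projection
    successors = defaultdict(set)
    for ts, tpl in events:
        for owner, group in groups.items():
            if tpl != owner and tpl not in group:
                continue  # event is not part of owner's projected stream
            prev = last.get(owner)
            # Assumed rule: count a successor only if it follows the owner closely in time.
            if prev and prev[0] == owner and ts - prev[1] <= window:
                successors[owner].add(tpl)
            last[owner] = (tpl, ts)
    return dict(successors)

# Hypothetical stream: flows A->B and X->Y, interleaved at the start.
events = [(0, "A"), (1, "X"), (2, "B"), (3, "Y"), (50, "A"), (52, "B"),
          (100, "X"), (103, "Y"), (150, "A"), (151, "B")]
groups = nearest_neighbor_groups(events, window=5, min_ratio=0.6)
edges = mine_successors(events, groups, window=5)
```

On this stream, plain adjacency would report the spurious edges A to X, X to B, and B to Y from the interleaved prefix. Projecting onto each Nearest-Neighbor-Group leaves only A to B and X to Y. The stage-one loop here is quadratic in the number of templates and is written for clarity rather than scale.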
Specification