S. Ghiasvand and F. Ciorba. International Supercomputing Conference, Frankfurt, Germany (June 2018)
Abstract
System logs are a valuable source of information for analyzing the behavior of computing systems.
The message part of each system log entry includes detailed information about its respective event.
RFC 5424 provides general guidelines for generating system logs.
However, the message part of system logs is unstructured, and every piece of software generates its system log messages independently.
Automatic methods of analysis are required to analyze the large number of system logs generated on modern computing systems.
The unstructured nature of system log messages is a main challenge for automatic analysis.
Automatic text classification is a well-known approach to address this challenge.
However, to use this approach, the target classes must be predefined.
The common method is to apply machine learning to pre-classified text samples to generate a specific classifier for the respective text format.
Our study indicates that, given a high repetition frequency of system logs, automatic classification of system log messages without pre-classified samples is possible.
Preliminary results from analyzing one month of system logs on a production high-performance cluster indicate very high accuracy.
Classification accuracy is directly related to the amount of system logs available.
The proposed classification method is automatic, unsupervised, and general.
%0 Conference Paper
%1 ghiasvand2018automatic
%A Ghiasvand, Siavash
%A Ciorba, Florina M
%B International Supercomputing Conference
%C Frankfurt, Germany
%D 2018
%K myOwn
%T Automatic Classification of System Logs
%X System logs are a valuable source of information for analyzing the behavior of computing systems.
The message part of each system log entry includes detailed information about its respective event.
RFC 5424 provides general guidelines for generating system logs.
However, the message part of system logs is unstructured, and every piece of software generates its system log messages independently.
Automatic methods of analysis are required to analyze the large number of system logs generated on modern computing systems.
The unstructured nature of system log messages is a main challenge for automatic analysis.
Automatic text classification is a well-known approach to address this challenge.
However, to use this approach, the target classes must be predefined.
The common method is to apply machine learning to pre-classified text samples to generate a specific classifier for the respective text format.
Our study indicates that, given a high repetition frequency of system logs, automatic classification of system log messages without pre-classified samples is possible.
Preliminary results from analyzing one month of system logs on a production high-performance cluster indicate very high accuracy.
Classification accuracy is directly related to the amount of system logs available.
The proposed classification method is automatic, unsupervised, and general.
@inproceedings{ghiasvand2018automatic,
abstract = {System logs are a valuable source of information for analyzing the behavior of computing systems.
The message part of each system log entry includes detailed information about its respective event.
RFC 5424 provides general guidelines for generating system logs.
However, the message part of system logs is unstructured, and every piece of software generates its system log messages independently.
Automatic methods of analysis are required to analyze the large number of system logs generated on modern computing systems.
The unstructured nature of system log messages is a main challenge for automatic analysis.
Automatic text classification is a well-known approach to address this challenge.
However, to use this approach, the target classes must be predefined.
The common method is to apply machine learning to pre-classified text samples to generate a specific classifier for the respective text format.
Our study indicates that, given a high repetition frequency of system logs, automatic classification of system log messages without pre-classified samples is possible.
Preliminary results from analyzing one month of system logs on a production high-performance cluster indicate very high accuracy.
Classification accuracy is directly related to the amount of system logs available.
The proposed classification method is automatic, unsupervised, and general.},
added-at = {2024-12-10T16:17:47.000+0100},
address = {Frankfurt, Germany},
author = {Ghiasvand, Siavash and Ciorba, Florina M},
biburl = {https://puma.scadsai.uni-leipzig.de/bibtex/2d4e36d8b96fc9b8ea394845e92ce5ff9/ghiasvan},
booktitle = {International {Supercomputing} {Conference}},
copyright = {All rights reserved},
file = {15-ghiasvand-presentation.pptx:D\:\\Documents\\Zotero\\storage\\Q3KT5DYT\\15-ghiasvand-presentation.pptx:application/vnd.openxmlformats-officedocument.presentationml.presentation;Ghiasvand and Ciorba - Automatic Classification of System Logs.pdf:D\:\\Documents\\Zotero\\storage\\SBBCTRF5\\Ghiasvand and Ciorba - Automatic Classification of System Logs.pdf:application/pdf},
interhash = {4beba6c6982ede94ec84ef802f7d3b75},
intrahash = {d4e36d8b96fc9b8ea394845e92ce5ff9},
keywords = {myOwn},
language = {en},
month = jun,
timestamp = {2024-12-10T16:26:44.000+0100},
title = {Automatic {Classification} of {System} {Logs}},
year = 2018
}