Publication result detail

Towards Evaluating Quality of Datasets for Network Traffic Domain

ČEJKA, T.; HYNEK, K.; SOUKUP, D.; TISOVČÍK, P.

Original title

Towards Evaluating Quality of Datasets for Network Traffic Domain

Type

Paper in proceedings indexed in WoS or Scopus

Original abstract

This paper deals with the quality of network traffic datasets created to train and validate machine learning classification and detection methods. Data quality has a long history of research; however, that research focuses mainly on data consistency, validity, precision, and other metrics that are insufficient for network traffic use cases. The rise of machine learning in network monitoring applications requires a new methodology for evaluating datasets. There is a need to evaluate and compare traffic samples captured under different conditions and to decide on the usability of already captured and annotated data. This paper aims to explain a use case of dataset creation, propose definitions regarding the quality of network traffic datasets, and finally, describe a framework for dataset analysis.

Keywords

Dataset; Data Quality; Network traffic analysis

Authors

ČEJKA, T.; HYNEK, K.; SOUKUP, D.; TISOVČÍK, P.

RIV year

2023

Published

20.12.2021

Publisher

Institute of Electrical and Electronics Engineers

Place

Izmir

ISBN

978-3-903176-36-2

Book

Proceedings of the 17th International Conference on Network and Service Management (CNSM 2021)

Pages from

264

Pages to

268

Number of pages

5

URL

https://ieeexplore.ieee.org/abstract/document/9615601

BibTeX

@inproceedings{BUT182953,
  author="Tomáš {Čejka} and Karel {Hynek} and Dominik {Soukup} and Peter {Tisovčík}",
  title="Towards Evaluating Quality of Datasets for Network Traffic Domain",
  booktitle="Proceedings of the 17th International Conference on Network and Service Management (CNSM 2021)",
  year="2021",
  pages="264--268",
  publisher="Institute of Electrical and Electronics Engineers",
  address="Izmir",
  doi="10.23919/CNSM52442.2021.9615601",
  isbn="978-3-903176-36-2",
  url="https://ieeexplore.ieee.org/abstract/document/9615601"
}