Publication detail

Neural Networks with Dilated Convolutions for Sound Event Recognition

MIKLÁNEK, Š.

Original title

Neural Networks with Dilated Convolutions for Sound Event Recognition

Type

paper in conference proceedings not indexed in WoS or Scopus

Language

English

Original abstract

Convolutional neural networks, most commonly deployed in image classification tasks, typically use square convolutional kernels, which are well suited to feature extraction from two-dimensional data. This study explores the effect of using spectrally aware dilated convolutions specialized for sound event recognition. By extending the base kernels in the time or the frequency dimension, the features extracted from spectral audio representations should, in theory, better capture the temporal and timbral information of different sound events. A baseline neural network model with square kernels was compared against three models that used an increasing dilation factor in successive convolutional layers. The three models were deliberately tuned to focus on frequency and time feature extraction. The results show that the models with dilated convolutions performed noticeably better than the baseline model.
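
The effect of "extending the base kernels in the time or the frequency dimension" can be illustrated with the standard relation for dilated convolutions: a kernel of size k with dilation d spans d * (k - 1) + 1 input positions along that axis. The following is a minimal sketch only. The abstract does not give the kernel sizes, layer counts, or dilation factors used in the paper, so the 3-tap kernels, the three-layer stacks, and the dilation schedule 1, 2, 4 are assumptions chosen for illustration, and the function names are hypothetical.

```python
from typing import Sequence


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by one dilated kernel along one axis: d * (k - 1) + 1."""
    return dilation * (kernel_size - 1) + 1


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field along one axis for a stack of stride-1 convolutions."""
    field = 1
    for dilation in dilations:
        # Each stride-1 layer widens the field by d * (k - 1) positions.
        field += dilation * (kernel_size - 1)
    return field


# Hypothetical configurations (NOT taken from the paper): dilation per layer
# along the frequency axis and along the time axis of a spectrogram input.
configs = {
    "baseline (square kernels)": {"frequency": [1, 1, 1], "time": [1, 1, 1]},
    "frequency-dilated": {"frequency": [1, 2, 4], "time": [1, 1, 1]},
    "time-dilated": {"frequency": [1, 1, 1], "time": [1, 2, 4]},
}

for name, dilations in configs.items():
    freq_bins = receptive_field(3, dilations["frequency"])
    time_frames = receptive_field(3, dilations["time"])
    print(f"{name}: {freq_bins} frequency bins x {time_frames} time frames")
```

Under these assumed settings, dilating along one axis roughly doubles the context seen along that axis (15 positions instead of 7) without adding any kernel weights, which is the intuition behind steering a model towards spectral or temporal features.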

Keywords

sound event recognition, convolutional neural networks, dilated convolution

Authors

MIKLÁNEK, Š.

Published

13 July 2021

Place

Brno

ISBN

978-80-214-5942-7

Book

Proceedings I of the 27th Conference STUDENT EEICT 2021

Edition

1.

Number of pages

5

BibTeX

@inproceedings{BUT171286,
  author="Štěpán {Miklánek}",
  title="Neural Networks with Dilated Convolutions for Sound Event Recognition",
  booktitle="Proceedings I of the 27th Conference STUDENT EEICT 2021",
  year="2021",
  series="1.",
  pages="5",
  address="Brno",
  isbn="978-80-214-5942-7"
}