Publication result detail

Brno University of Technology at MediaEval 2011 Genre Tagging Task

HRADIŠ, M.; ŘEZNÍČEK, I.; BEHÚŇ, K.

Original Title

Brno University of Technology at MediaEval 2011 Genre Tagging Task

Type

Paper in proceedings outside WoS and Scopus

Original Abstract

This paper briefly describes our approach to the video genre tagging task, which was part of MediaEval 2011. We focused mainly on visual and audio information and exploited metadata and automatic speech transcripts only in a very basic way. Our approach relied on classification and on classifier fusion to combine the different sources of information. We did not use any additional training data beyond the very small exemplary set provided by MediaEval (only 246 videos). The best performance was achieved by metadata alone. Combining metadata with the other sources of information did not improve results in the submitted runs; an improvement was achieved only later, by choosing more suitable fusion weights. Excluding metadata, audio and video gave better results than speech transcripts. Using classifiers for 345 semantic classes from the TRECVID 2011 semantic indexing (SIN) task to project the data worked better than classifying directly from video and audio features.
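
The abstract mentions classifier fusion with per-source weights but does not state the fusion rule. As a rough illustration only, the sketch below assumes a weighted average of per-source genre scores (a common form of late fusion). The function names, source names, scores, and weights are all hypothetical and are not taken from the paper.

```python
def fuse_scores(scores_by_source, weights):
    """Weighted late fusion (assumed rule): weighted average of genre scores.

    scores_by_source: {source_name: {genre: score}}, one entry per classifier.
    weights:          {source_name: non-negative weight}.
    """
    total = sum(weights[source] for source in scores_by_source)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Collect every genre that any source produced a score for.
    genres = set()
    for scores in scores_by_source.values():
        genres.update(scores)

    # A source that has no score for a genre contributes 0 for that genre.
    fused = {}
    for genre in genres:
        weighted_sum = sum(
            weights[source] * scores.get(genre, 0.0)
            for source, scores in scores_by_source.items()
        )
        fused[genre] = weighted_sum / total
    return fused


def rank_genres(fused):
    """Return genres ordered from highest to lowest fused score."""
    return sorted(fused, key=lambda genre: (-fused[genre], genre))


# Hypothetical scores for one video from three sources (illustrative values).
example_scores = {
    "metadata": {"sports": 0.9, "politics": 0.2},
    "visual": {"sports": 0.4, "politics": 0.6},
    "audio": {"sports": 0.5, "politics": 0.5},
}
# Hypothetical weights; in practice they would be tuned on held-out data.
example_weights = {"metadata": 0.6, "visual": 0.2, "audio": 0.2}

example_ranking = rank_genres(fuse_scores(example_scores, example_weights))
```

With only 246 training videos, as in the abstract, weights tuned this way can be unstable, which is consistent with the reported finding that fusion helped only after more suitable weights were chosen.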

Keywords

genre recognition, bag of words, SIFT, local features, SVM, classification, classifier fusion

Authors

HRADIŠ, M.; ŘEZNÍČEK, I.; BEHÚŇ, K.

RIV year

2016

Released

01.09.2011

Publisher

CEUR-WS.org

Location

Pisa, Italy

Book

Working Notes Proceedings of the MediaEval 2011 Workshop

ISSN

1613-0073

Periodical

CEUR Workshop Proceedings

Number

9

State

Federal Republic of Germany

Pages from

1

Pages to

2

Pages count

2

URL

http://ceur-ws.org/Vol-807/Hradis_BUT_Genre_me11wn.pdf

BibTeX

@inproceedings{BUT91115,
  author="Michal {Hradiš} and Ivo {Řezníček} and Kamil {Behúň}",
  title="Brno University of Technology at MediaEval 2011 Genre Tagging Task",
  booktitle="Working Notes Proceedings of the MediaEval 2011 Workshop",
  year="2011",
  journal="CEUR Workshop Proceedings",
  number="9",
  pages="1--2",
  publisher="CEUR-WS.org",
  address="Pisa, Italy",
  issn="1613-0073",
  url="http://ceur-ws.org/Vol-807/Hradis_BUT_Genre_me11wn.pdf"
}