Web Page Element Classification Based on Visual Features

BURGET, R.; BURGETOVÁ, I.

Original Title

Web Page Element Classification Based on Visual Features

Type

Paper in proceedings outside WoS and Scopus

Original Abstract

When traditional data mining methods are applied to World Wide Web documents, a typical problem is that an ordinary web page contains many kinds of information in addition to its main content. This additional information, such as navigation, advertisements, or copyright notices, degrades the results of data mining tasks such as content classification. In this paper, we present a method for detecting the interesting areas of a web page. The method is inspired by the way a human reader is assumed to approach this task: first, basic visual blocks are detected in the page; then the purpose of each block is inferred from its visual appearance. We describe the page segmentation method used for visual block detection, propose a way of classifying the blocks based on their visual features, and provide an experimental evaluation of the method on real-world data.
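
The two-step idea in the abstract (detect visual blocks, then infer each block's purpose from its appearance) can be illustrated with a minimal Python sketch. It is not the authors' method: the `VisualBlock` fields, the `classify_block` function, the class labels, and every threshold below are hypothetical choices made only for illustration, and the segmentation step is assumed to have already produced the blocks.

```python
from dataclasses import dataclass


@dataclass
class VisualBlock:
    """A rendered page region with a few hypothetical visual features."""
    x: float          # left edge in pixels
    y: float          # top edge in pixels
    width: float      # block width in pixels
    height: float     # block height in pixels
    font_size: float  # dominant font size in points
    text_chars: int   # number of text characters in the block
    link_chars: int   # how many of those characters are inside links


def link_density(block: VisualBlock) -> float:
    """Fraction of the block's text that is link text (0.0 for empty blocks)."""
    return block.link_chars / block.text_chars if block.text_chars else 0.0


def classify_block(block: VisualBlock, page_width: float, page_height: float) -> str:
    """Guess a block's purpose with hand-written rules (assumed thresholds)."""
    rel_top = block.y / page_height       # 0.0 = top of page, 1.0 = bottom
    rel_width = block.width / page_width  # share of the page width covered
    if link_density(block) > 0.6:
        return "navigation"               # mostly links
    if rel_top > 0.9 and block.font_size < 11:
        return "footer"                   # small print near the page bottom
    if rel_width > 0.5 and block.text_chars > 300:
        return "main content"             # wide block with a lot of text
    return "other"


# Hypothetical blocks, as if produced by an earlier segmentation step.
blocks = {
    "menu": VisualBlock(0, 120, 200, 600, 12, 180, 170),
    "article": VisualBlock(220, 120, 700, 1500, 13, 4200, 150),
    "copyright": VisualBlock(0, 1900, 1000, 40, 9, 80, 10),
}
labels = {
    name: classify_block(b, page_width=1000, page_height=2000)
    for name, b in blocks.items()
}
print(labels)
```

A trained classifier over the same kind of feature vector could replace the hand-written rules; the rules are used here only to keep the sketch short and self-contained.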

Keywords

page segmentation, preprocessing, classification, visual features, visual blocks

Authors

BURGET, R.; BURGETOVÁ, I.

RIV year

2010

Released

01.04.2009

Publisher

IEEE Computer Society

Location

Dong Hoi

ISBN

978-0-7695-3580-7

Book

1st Asian Conference on Intelligent Information and Database Systems ACIIDS 2009

Pages from

67

Pages to

72

Pages count

6

BibTeX

@inproceedings{BUT33776,
  author="Radek {Burget} and Ivana {Burgetová}",
  title="Web Page Element Classification Based on Visual Features",
  booktitle="1st Asian Conference on Intelligent Information and Database Systems ACIIDS 2009",
  year="2009",
  pages="67--72",
  publisher="IEEE Computer Society",
  address="Dong Hoi",
  isbn="978-0-7695-3580-7"
}