Publication result detail
BURGET, R.
Original Title
HTML Document Analysis for Information Extraction
Type
Paper in proceedings outside WoS and Scopus
Original Abstract
Today's World Wide Web contains a vast amount of information stored in HTML documents. However, the HTML language primarily describes the appearance of documents and provides no facilities for describing the structure of the data they contain. In this paper we propose a model of a Web site that describes the logical structure of the contained data. Furthermore, we propose methods for creating such a model by analyzing the appearance and the structure of HTML documents.
Keywords
HTML Analysis, Information Extraction
RIV year
2011
Released
25.04.2002
Publisher
Faculty of Information Technology BUT
Location
Brno
ISBN
80-214-2116-9
Book
Proceedings of 8th EEICT conference
Pages from
426
Pages to
430
Pages count
5
BibTex
@inproceedings{BUT10014,
  author="Radek {Burget}",
  title="HTML Document Analysis for Information Extraction",
  booktitle="Proceedings of 8th EEICT conference",
  year="2002",
  pages="426--430",
  publisher="Faculty of Information Technology BUT",
  address="Brno",
  isbn="80-214-2116-9"
}