Applied result detail
BURGET, R.; MILIČKA, M.
Original title
Information Extraction Tools from CEUR Workshop Pages
English title
Type
Software
Abstract
This project implements applications and tools for automatic information extraction from the CEUR-WS.org workshop proceedings pages. The tools take the CEUR HTML pages as input and produce a structured linked dataset in RDF format. The implementation is based on the existing FITLayout document analysis framework, with many extensions specific to the given task. The resulting data may be used for evaluating the quality of the individual CEUR workshops. The tools were created as a proposed solution to Task 1 of the Semantic Publishing Challenge 2015, co-located with the Extended Semantic Web Conference 2015. They received the awards for Best performing tool and Most innovative approach. They also serve as a case study that demonstrates the developed document analysis methods.
Abstract in English
Keywords
information extraction, web mining, document analysis, text classification
Keywords in English
Location
https://github.com/FitLayout/ToolsEswc
License fee
Use of the result by another entity always requires obtaining a license
www