Applied result detail

WTF-LOD Extractor

OTRUSINA, L.; SMRŽ, P.

Original Title

WTF-LOD Extractor

English Title

WTF-LOD Extractor

Type

Software

Abstract

This software creates the Web TextFull linkage to Linked Open Data (WTF-LOD) dataset, intended for large-scale evaluation of named entity recognition (NER) systems, from the largest publicly available textual corpora, including Wikipedia dumps, monthly runs of CommonCrawl, and ClueWeb09/12. The software de-duplicates the data and applies advanced cleaning procedures.

Abstract in English

This software creates the Web TextFull linkage to Linked Open Data (WTF-LOD) dataset, intended for large-scale evaluation of named entity recognition (NER) systems, from the largest publicly available textual corpora, including Wikipedia dumps, monthly runs of CommonCrawl, and ClueWeb09/12. The software de-duplicates the data and applies advanced cleaning procedures.

Keywords

named entity evaluation, linked open data, CommonCrawl, ClueWeb, Wikipedia

Key words in English

named entity evaluation, linked open data, CommonCrawl, ClueWeb, Wikipedia

Location

http://www.fit.vutbr.cz/research/prod/index.php?id=480

Licence fee

Any other entity that wishes to use the result must always acquire a licence.
