Applied result detail

proof_platform: Platform for automated analysis and archiving of data from the web

KOCMAN, T.; POLČÁK, L.

Original Title

proof_platform: Platform for automated analysis and archiving of data from the web

English Title

proof_platform: Platform for automated analysis and archiving of data from the web

Type

Software

Abstract

This platform scrapes web page content and stores it in an offline persistent database. The web crawl is driven by user-supplied regular expressions that may represent, for example, Torrent file names, Bitcoin wallets or keywords. Collected data may be used by law enforcement and other entities, for example to search for information about a specific product. Archived data are stored in a database and remain available for later use, unaffected by subsequent updates on the web server.
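
The matching-and-archiving idea described in the abstract can be illustrated with a minimal sketch. It is not taken from proof_platform itself: the pattern names, the simplified regular expressions, the table layout and the functions find_matches and archive_page are all assumptions made for illustration, and SQLite stands in for whatever database the platform actually uses.

```python
import re
import sqlite3
from datetime import datetime, timezone

# Hypothetical user-supplied patterns (illustrative only, not from the platform).
# The Bitcoin pattern is a simplified legacy-address shape without checksum validation.
PATTERNS = {
    "bitcoin_address": re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b"),
    "torrent_file": re.compile(r"\b[\w.-]+\.torrent\b"),
    "keyword": re.compile(r"\bransomware\b", re.IGNORECASE),
}


def find_matches(text, patterns):
    """Return (pattern_name, matched_text) pairs found in the page text."""
    return [
        (name, match.group(0))
        for name, regex in patterns.items()
        for match in regex.finditer(text)
    ]


def archive_page(conn, url, text, patterns):
    """Store a snapshot of the page and its matches; skip pages with no match.

    Returns the number of matches that were archived.
    """
    matches = find_matches(text, patterns)
    if not matches:
        return 0
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages "
        "(id INTEGER PRIMARY KEY, url TEXT, fetched_at TEXT, content TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS matches "
        "(page_id INTEGER, pattern TEXT, value TEXT)"
    )
    # The stored copy is independent of the live page, so later changes
    # on the web server do not alter what was archived.
    cursor = conn.execute(
        "INSERT INTO pages (url, fetched_at, content) VALUES (?, ?, ?)",
        (url, datetime.now(timezone.utc).isoformat(), text),
    )
    conn.executemany(
        "INSERT INTO matches (page_id, pattern, value) VALUES (?, ?, ?)",
        [(cursor.lastrowid, name, value) for name, value in matches],
    )
    conn.commit()
    return len(matches)
```

In this sketch a page is archived only when at least one pattern matches, and the full page text is stored alongside the individual matches together with a UTC fetch timestamp. Fetching the page over HTTP and following links are deliberately left out.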

Abstract in English

This platform scrapes web page content and stores it in an offline persistent database. The web crawl is driven by user-supplied regular expressions that may represent, for example, Torrent file names, Bitcoin wallets or keywords. Collected data may be used by law enforcement and other entities, for example to search for information about a specific product. Archived data are stored in a database and remain available for later use, unaffected by subsequent updates on the web server.

Keywords

Web crawling, web scraping.

Key words in English

Web crawling, web scraping.

Location

https://gitlab.com/tomaskocman/proof_platform

Licence fee

Any other entity that wants to use the result must always acquire a license.