This work focuses on data mining, and especially text mining, from Web pages. It gives an overview of programs for downloading text and of the methods used to extract it. It also contains an overview of the most frequently used programs for extracting data from the Internet. The output of this thesis is a Java program that can download text from a selection of servers and save it into an XML file.
Identifier | oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:219347 |
Date | January 2011 |
Creators | Mazal, Zdeněk |
Contributors | Morský, Ondřej; Fojtová, Lucie |
Publisher | Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií |
Source Sets | Czech ETDs |
Language | Czech |
Detected Language | English |
Type | info:eu-repo/semantics/masterThesis |
Rights | info:eu-repo/semantics/restrictedAccess |