Global ETD Search

Rámec pro extrakci informace z WWW / Framework for Information Extraction from WWW

The Web has developed into the largest source of electronic documents, so processing this information automatically would be very useful. This is not a trivial problem, however. Most documents are written in HTML (Hypertext Markup Language), which does not support semantic description of the content. The goal of this work is to create a modular system for extracting information from HTML documents and processing it further. Further processing here means storing the extracted information in an XML document or a relational database. The system's modularity makes it possible to use various methods of information extraction and storage, so the system can be applied to a variety of tasks.
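
The modular design described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the names extract_links, store_as_xml, store_in_sqlite and run_pipeline are hypothetical, link extraction stands in for whatever extraction methods the thesis uses, and SQLite stands in for a generic relational database. The sketch only shows how an extraction module and a storage module could be swapped independently.

```python
from html.parser import HTMLParser
import sqlite3
import xml.etree.ElementTree as ET


class LinkExtractor(HTMLParser):
    """Hypothetical extraction module: collects href/text pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            self.records.append({"href": self._href, "text": text})
            self._href = None


def extract_links(html):
    """Extraction step: HTML string in, list of record dicts out."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.records


def store_as_xml(records):
    """Storage module (assumed layout): serialize records to an XML string."""
    root = ET.Element("records")
    for rec in records:
        item = ET.SubElement(root, "record", href=rec["href"])
        item.text = rec["text"]
    return ET.tostring(root, encoding="unicode")


def store_in_sqlite(records, conn):
    """Storage module (assumed schema): insert records into a relational table."""
    conn.execute("CREATE TABLE IF NOT EXISTS links (href TEXT, text TEXT)")
    conn.executemany("INSERT INTO links VALUES (:href, :text)", records)
    conn.commit()


def run_pipeline(html, extractor, storer):
    """Glue code: any extractor can be combined with any storer."""
    return storer(extractor(html))
```

Because run_pipeline only depends on the two callables, changing the storage target means passing store_as_xml or a wrapper around store_in_sqlite, with no change to the extraction code.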

http://www.nusl.cz/ntk/nusl-236703

Identifier	oai:union.ndltd.org:nusl.cz/oai:invenio.nusl.cz:236703
Date	January 2009
Creators	Brychta, Filip
Contributors	Bartík, Vladimír, Burget, Radek
Publisher	Vysoké učení technické v Brně. Fakulta informačních technologií
Source Sets	Czech ETDs
Language	Czech
Detected Language	English
Type	info:eu-repo/semantics/masterThesis
Rights	info:eu-repo/semantics/restrictedAccess
