This thesis focuses on entity and fact extraction from the web. Different knowledge representations and techniques for information extraction are discussed before the design of a knowledge extraction system, called WebKnox, is introduced. The main contributions of this thesis are the trust ranking of extracted facts with a self-supervised learning loop and the extraction system itself, which combines known and refined extraction algorithms. The techniques used improve precision and recall in most cases for entity and fact extraction compared to the chosen baseline approaches.
Identifier | oai:union.ndltd.org:DRESDEN/oai:qucosa.de:bsz:14-qucosa-23766 |
Date | 21 August 2009 |
Creators | Urbansky, David |
Contributors | Technische Universität Dresden, Fakultät Informatik, Dipl. Inf. Marius Feldmann, Prof. Dr. rer. nat. habil. Dr. h.c. Alexander Schill |
Publisher | Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden |
Source Sets | Hochschulschriftenserver (HSSS) der SLUB Dresden |
Language | deu |
Detected Language | English |
Type | doc-type:masterThesis |
Format | application/pdf |