Structured data extraction: separating content from noise on news websites

In this thesis, we address the problem of separating content from noise on news websites. We approach this problem using TiMBL, a memory-based learning software package. We study the relevance of similarity in the training data and the effect of data size on extraction performance.

http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-9898
Local ntnudaim:4769

ntnudaim

SIF2 datateknikk

Intelligente systemer

Identifier	oai:union.ndltd.org:UPSALLA/oai:DiVA.org:ntnu-9898
Date	January 2009
Creators	Arizaleta, Mikel
Publisher	Norwegian University of Science and Technology, Department of Computer and Information Science, Institutt for datateknikk og informasjonsvitenskap
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, text
