Global ETD Search

Exploration of relationships from texts using self-organizing maps

<p>This thesis explores and visualizes the relationships among documents using self-organizing maps (SOM), a type of artificial neural network for visualizing high-dimensional data in low-dimensional views. The source data are the full Extensible Markup Language (XML) texts of A Standard Corpus of Present Day Edited American English. The first step transforms these XML files into a term-document matrix through stop word removal, stemming, tf-idf (term frequency–inverse document frequency) weighting, and global filtering; the rows of this matrix represent documents as n-dimensional vectors. Second, these vectors are clustered and visualized by a SOM made up of neurons, where each neuron corresponds to a set of documents that share a certain number of terms. A network is then constructed from the SOM: its vertex set consists of the neurons and documents, and its line set consists of the links between neurons and documents. Finally, this network is exported to Pajek for analysis and final visualization.</p>
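The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, which this record does not include. All function names (`tfidf_matrix`, `train_som`, `best_matching_unit`, `to_pajek`), the grid size, the learning-rate and radius schedules, and the toy documents are assumptions made for illustration. Stop word removal, stemming, and global filtering are omitted, and each document is linked only to its single best-matching neuron.

```python
import math
import random
from collections import Counter

def tfidf_matrix(docs):
    """Return (vocab, rows); each row is one document as a tf-idf vector."""
    tokenised = [d.lower().split() for d in docs]  # assumes non-empty documents
    vocab = sorted({t for toks in tokenised for t in toks})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenised for t in set(toks))
    rows = []
    for toks in tokenised:
        tf = Counter(toks)
        rows.append([tf[t] / len(toks) * math.log(n / df[t]) for t in vocab])
    return vocab, rows

def best_matching_unit(neurons, row):
    """Grid position of the neuron whose weights are closest to the row."""
    return min(
        neurons,
        key=lambda pos: sum((w - x) ** 2 for w, x in zip(neurons[pos], row)),
    )

def train_som(rows, grid_w, grid_h, epochs=50, seed=0):
    """Train a small rectangular SOM; returns {(x, y): weight_vector}."""
    rng = random.Random(seed)
    dim = len(rows[0])
    neurons = {
        (x, y): [rng.random() * 0.1 for _ in range(dim)]
        for x in range(grid_w)
        for y in range(grid_h)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1 - frac)                               # decaying learning rate
        radius = max(grid_w, grid_h) / 2 * (1 - frac) + 0.5  # shrinking neighbourhood
        for row in rows:
            bmu = best_matching_unit(neurons, row)
            for pos, weights in neurons.items():
                d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2 * radius ** 2))        # Gaussian neighbourhood
                for i in range(dim):
                    weights[i] += lr * h * (row[i] - weights[i])
    return neurons

def to_pajek(neurons, rows, labels):
    """Build Pajek .net text: neuron and document vertices, one edge per document."""
    positions = sorted(neurons)
    lines = [f"*Vertices {len(positions) + len(rows)}"]
    for i, pos in enumerate(positions, 1):
        lines.append(f'{i} "neuron_{pos[0]}_{pos[1]}"')
    for j, label in enumerate(labels, len(positions) + 1):
        lines.append(f'{j} "{label}"')
    lines.append("*Edges")
    for j, row in enumerate(rows, len(positions) + 1):
        bmu = best_matching_unit(neurons, row)
        lines.append(f"{positions.index(bmu) + 1} {j}")
    return "\n".join(lines)

if __name__ == "__main__":
    docs = ["cat dog cat", "dog bird", "fish water fish"]  # toy data, not the corpus
    vocab, rows = tfidf_matrix(docs)
    som = train_som(rows, grid_w=2, grid_h=2, epochs=20)
    print(to_pajek(som, rows, ["doc1", "doc2", "doc3"]))
```

The printed text could be saved with a `.net` extension and opened in Pajek. A real corpus would call for a larger grid and a vectorized implementation.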

http://urn.kb.se/resolve?urn=urn:nbn:se:hig:diva-129

TECHNOLOGY

TEKNIKVETENSKAP

Identifier	oai:union.ndltd.org:UPSALLA/oai:DiVA.org:hig-129
Date	January 2007
Creators	Lu, Weiping
Publisher	University of Gävle, Department of Technology and Built Environment
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, text
