
What did they cover? : a cluster analysis of news stories published in the Botswana Daily News, January – December 2004

Thesis (MPhil (Information Science))--University of Stellenbosch, 2005.

ENGLISH ABSTRACT: In this study, a cluster analysis of news stories published in the Botswana Daily
News during the period January - December 2004 was undertaken. The study
was exploratory in nature and sought to find out what topics were predominant
during the study period. The approach we adopted can be divided into three
phases, namely data collection, document pre-processing, and cluster analysis.
The data used in the study was downloaded from the Botswana Daily News
website using a simple program developed specifically for that purpose. Document
pre-processing was concerned with transforming the raw documents
into a format that could be directly operated upon by the various clustering
algorithms. The documents themselves were represented using the vector
space model, with the tf.idf term weighting scheme. We experimented with
three clustering approaches, namely, direct k-way clustering, k-way clustering
through repeated bisections, and agglomerative clustering. Agglomerative
clustering performed poorly, and we thus discarded its results. Direct k-way
clustering and k-way clustering through repeated bisections produced similar
results, though the former performed better in terms of external isolation and
internal cohesion of the clusters produced. Consequently, we only retained the
results from direct k-way clustering, and subsequently performed a quarterly
analysis of our corpus using only the direct k-way clustering algorithm. Analysis
of the complete corpus identified a number of topics that were prevalent
over the study period. Interestingly, a quarterly analysis of the corpus revealed
other topics whose prevalence appears to have been limited to certain parts of
the year.
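
The vector space representation and direct k-way clustering mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not name the software or parameters used, so the function names, the toy token lists, the seed choice, and the cosine-based assignment rule below are all assumptions made for illustration only.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenised documents into unit-length tf.idf vectors (sparse dicts)."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: tf[t] * math.log(n / df[t]) for t in tf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        # Normalise so that a dot product equals cosine similarity.
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def cosine(a, b):
    """Dot product of two sparse vectors (cosine when both are unit length)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def centroid(vectors):
    """Unit-length mean direction of a list of sparse vectors."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values()))
    return {t: w / norm for t, w in total.items()} if norm else {}


def kway_cluster(vectors, seeds, iterations=10):
    """Assign each document to its most similar centroid, recompute, repeat."""
    centroids = [vectors[i] for i in seeds]
    labels = []
    for _ in range(iterations):
        labels = [
            max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = centroid(members)
    return labels


# Hypothetical, already tokenised "news stories" (not data from the thesis).
docs = [
    ["election", "parliament", "vote"],
    ["parliament", "vote", "minister"],
    ["football", "match", "goal"],
    ["match", "goal", "league"],
]
vectors = tfidf_vectors(docs)
labels = kway_cluster(vectors, seeds=[0, 2])
print(labels)  # → [0, 0, 1, 1]
```

With documents 0 and 2 as seeds, the two political stories and the two sports stories end up in separate clusters. A repeated-bisection variant would instead call the same routine with two seeds on one cluster at a time until k clusters exist.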

Identifier: oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:sun/oai:scholar.sun.ac.za:10019.1/1058
Date: 12 1900
Creators: Mogotsi, Isaac Carter
Contributors: Van der Walt, M. S., University of Stellenbosch. Faculty of Arts and Social Sciences. Dept. of Information Science. Information and Knowledge Management.
Publisher: Stellenbosch : University of Stellenbosch
Source Sets: South African National ETD Portal
Language: English
Detected Language: English
Type: Thesis
Format: 1032798 bytes, 119842 bytes, application/pdf, application/pdf
Rights: University of Stellenbosch
