
What did they cover? : a cluster analysis of news stories published in the Botswana Daily News, January – December 2004

Mogotsi, Isaac Carter 12 1900
Thesis (MPhil (Information Science))--University of Stellenbosch, 2005.

ENGLISH ABSTRACT: In this study, a cluster analysis of news stories published in the Botswana Daily News during the period January - December 2004 was undertaken. The study was exploratory in nature and sought to find out which topics were predominant during the study period. The approach we adopted can be divided into three phases: data collection, document pre-processing, and cluster analysis. The data used in the study was downloaded from the Botswana Daily News website using a simple program developed specifically for that purpose. Document pre-processing transformed the raw documents into a format that the various clustering algorithms could operate upon directly. The documents themselves were represented using the vector space model, with the tf.idf term weighting scheme. We experimented with three clustering approaches: direct k-way clustering, k-way clustering through repeated bisections, and agglomerative clustering. Agglomerative clustering performed poorly, and we thus discarded its results. Direct k-way clustering and k-way clustering through repeated bisections produced similar results, though the former performed better in terms of the external isolation and internal cohesion of the clusters produced. Consequently, we retained only the results from direct k-way clustering, and subsequently performed a quarterly analysis of our corpus using only that algorithm. Analysis of the complete corpus identified a number of topics that were prevalent over the study period. Interestingly, the quarterly analysis of the corpus revealed other topics whose prevalence appears to have been limited to certain parts of the year.
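
The vector space representation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's own code: the abstract does not name the tooling used, so the function names (`tfidf_vectors`, `cosine`), the toy headlines, and the specific weighting variant (raw term frequency times log(N / df)) are all assumptions made for illustration. Cosine similarity between such vectors is one common way to judge whether documents belong in the same cluster.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: raw term frequency * log(N / document frequency).
    Tokenisation is a plain lowercase whitespace split (an assumption).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical headlines, invented for this example only.
docs = [
    "parliament debates national budget",
    "minister presents budget to parliament",
    "football team wins league match",
    "league match ends in draw for football team",
]
vecs = tfidf_vectors(docs)

# The two politics headlines share weighted terms; politics and sport do not.
same_topic = cosine(vecs[0], vecs[1])
cross_topic = cosine(vecs[0], vecs[2])
```

A clustering algorithm such as the k-way methods named in the abstract would then group documents so that within-cluster similarity is high (internal cohesion) and between-cluster similarity is low (external isolation).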
