
Clustering User Behavior in Scientific Collections

Blixhavn, Øystein Hoel, January 2014
This master's thesis examines how clustering techniques can be applied to a collection of scientific documents. Approximately one year of server logs from the CERN Document Server (CDS) is analyzed and preprocessed. Based on the findings of this analysis and a review of the current state of the art, three clustering methods are selected for further work: simple k-Means, Hierarchical Agglomerative Clustering (HAC), and graph partitioning. In addition, a custom agglomerative clustering algorithm is developed in an attempt to tackle some of the problems encountered during the experiments with k-Means and HAC. The results from k-Means and HAC are poor, but the graph partitioning method yields some promising results.

The main conclusion of this thesis is that the inherent clusters within the user-record relationship of a scientific collection are nebulous, but they do exist. Furthermore, the most common clustering algorithms are not suitable for this type of clustering.
