• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Word Space Models for Web User Clustering and Page Prefetching

Sundin, Albin January 2012 (has links)
This study evaluates methods for clustering web users via vector space models, for the purpose of web page prefetching for possible applications of server optimization. An experiment using Latent Semantic Analysis (LSA) is deployed to investigate whether LSA can reproduce the encouraging results obtained from previous research with Random Indexing (RI) and a chaos based optimization algorithm (CAS-C). This is not only motivated by LSA being yet another vector space model, but also by a study indicating LSA to outperform RI in a task similar to the web user clustering and prefetching task. The prefetching task was used to verify the applicability of LSA, where both RI and CAS-C have shown promising results. The original data set from the RI web user clustering and prefetching task was modeled using weighted (tf-idf) LSA. Clusters were defined using a common clustering algorithm (k-means). The least scattered cluster configuration for the model was identified by combining an internal validity measure (SSE) and a relative criterion validity measure (SD index). The assumed optimal cluster configuration was used for the web page prefetching task.   Precision and recall of the LSA based method is found to be on par with RI and CAS-C, in as much that it solves the web user clustering and web task with similar characteristics as unweighted RI. The hypothesized inherent gains to precision and recall by using LSA was neither confirmed nor conclusively disproved. The effects of different weighting functions for RI are discussed and a number of methodological factors are identified for further research concerning LSA based clustering and prefetching.

Page generated in 0.0263 seconds