Analysing ranking algorithms and publication trends on scholarly citation networks

Thesis (MSc)--Stellenbosch University, 2014.

ENGLISH ABSTRACT: Citation analysis is an important tool in the academic community. It can aid universities,
funding bodies, and individual researchers to evaluate scientific work and direct resources
appropriately. With the rapid growth of the scientific enterprise and the increase of online
libraries that include citation analysis tools, the need for a systematic evaluation of these
tools becomes more important.
The research presented in this study deals with scientific research output, i.e., articles
and citations, and how they can be used in bibliometrics to measure academic success.
More specifically, this research analyses algorithms that rank academic entities such as
articles, authors and journals to address the question of how well these algorithms can
identify important and high-impact entities.
A consistent mathematical formulation is developed on the basis of a categorisation
of bibliometric measures such as the h-index, the Impact Factor for journals, and ranking
algorithms based on Google’s PageRank. Furthermore, the theoretical properties of each
algorithm are laid out.
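
Two of the measures named above can be illustrated with a small sketch. The code below is not taken from the thesis: it assumes the standard definition of the h-index and a textbook power-iteration PageRank on a toy citation graph. The function names, the damping factor of 0.85, the iteration count, and the four-paper graph are all illustrative assumptions.

```python
def h_index(citation_counts):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def pagerank(cites, damping=0.85, iterations=100):
    """Textbook power-iteration PageRank (assumed variant, not the thesis's).

    `cites` maps each paper to the list of papers it cites.
    """
    papers = set(cites)
    for targets in cites.values():
        papers.update(targets)
    papers = sorted(papers)
    n = len(papers)
    score = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        # Score held by papers that cite nothing is spread evenly over all papers.
        dangling = sum(score[p] for p in papers if not cites.get(p))
        new = {p: (1 - damping) / n + damping * dangling / n for p in papers}
        for p in papers:
            targets = cites.get(p, [])
            for t in targets:
                # Each paper passes an equal share of its score to every paper it cites.
                new[t] += damping * score[p] / len(targets)
        score = new
    return score


# Hypothetical toy data: paper A is cited by B, C and D; B is cited by C.
toy_graph = {"A": [], "B": ["A"], "C": ["A", "B"], "D": ["A"]}
scores = pagerank(toy_graph)
top_paper = max(scores, key=scores.get)

# An author whose five papers have these citation counts has an h-index of 4.
author_h = h_index([10, 8, 5, 4, 3])
```

In this sketch a plain citation count and PageRank agree that paper A ranks first; on real citation networks the two can diverge, because PageRank weights a citation by the score of the citing paper.
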
The ranking algorithms and bibliometric methods are computed on the Microsoft
Academic Search citation database, which contains 40 million papers and over 260 million
citations spanning multiple academic disciplines.
We evaluate the ranking algorithms by using a large test data set of papers and authors
that won renowned prizes at numerous Computer Science conferences. The results show
that using citation counts is, in general, the best ranking metric. However, for certain
tasks, such as ranking important papers or identifying high-impact authors, algorithms
based on PageRank perform better. As a secondary outcome of this research, publication
trends across academic disciplines are analysed to show changes in publication behaviour
over time and differences in publication patterns between disciplines.

AFRIKAANS ABSTRACT (English translation): Citation analysis is an important instrument in the academic environment. It can help universities,
funding bodies and individual researchers to evaluate scientific work and allocate
resources appropriately. With the rapid growth of scientific output and the increase in
online libraries that include citation analysis, the need for a systematic evaluation of
these tools is becoming ever more important.
The research in this study deals with the outputs of scientific research, that is,
articles and citations, and how they can be used in bibliometric studies to measure
academic success. More specifically, this research analyses algorithms that rank academic
entities such as articles, authors and journals. It shows how effectively these algorithms
can identify important and high-impact entities.
A comprehensive mathematical formulation is developed from a collection of bibliometric
methods such as the h-index, the Impact Factor for journals, and the ranking algorithms
based on Google’s PageRank. Furthermore, the theoretical properties of each algorithm
are laid out.
The ranking algorithms and bibliometric methods use the Microsoft Academic Search
citation database for their computations. It contains 40 million articles and more than
260 million citations, spanning various academic disciplines.
We use a large set of test data of papers and authors that won well-known prizes at
numerous computer science conferences to evaluate the ranking algorithms. The results
show that using citation counts is, in general, the best ranking method. For certain
tasks, however, such as ranking important articles or identifying high-impact authors,
algorithms based on PageRank perform better. A secondary result of this research is the
analysis of publication trends in various academic disciplines, in order to show changes
in publication behaviour over time and to point out the differences in publication
patterns across disciplines.

Identifier: oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:sun/oai:scholar.sun.ac.za:10019.1/96106
Date: 12 1900
Creators: Dunaiski, Marcel Paul
Contributors: Visser, Willem; Geldenhuys, Jaco; Stellenbosch University. Faculty of Science. Department of Mathematical Sciences.
Publisher: Stellenbosch : Stellenbosch University
Source Sets: South African National ETD Portal
Language: en_ZA
Detected Language: Unknown
Type: Thesis
Format: xii, 128 p. : ill.
Rights: Stellenbosch University