
Variants of compound models and their application to citation analysis

This thesis develops two variant statistical models for count data, based on compound models, for contexts where the counts may be viewed as derived from two generations, which may or may not be independent. Unlike standard compound models, the variants model the sum of both generations. We consider cases where both generations are negative binomial, or where one is Poisson and the other is negative binomial. The first variant, denoted SVA, follows a zero restriction: a zero in the first generation is automatically followed by a zero in the second generation. The second variant, denoted SVB, is a convolution model that does not possess this zero restriction. The main properties of the SVA and SVB models are outlined and compared with standard compound models. The results show that the SVA distributions are similar to standard compound distributions for some fixed parameters. Comparisons of SVA, Poisson hurdle and negative binomial hurdle models and their zero-inflated counterparts, using simulated SVA data, indicate that different models can give similar results, as the generating model is not always selected as the best fitting. This thesis focuses on the use of the variant models to model citation counts. We show that the SVA models are more suitable for modelling citation data than previously used models such as the negative binomial model. Moreover, the SVA and SVB models may be used to describe the citation process. This thesis also explores model selection techniques based on log-likelihood methods, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). The suitability of the models is also assessed using two diagrammatic methods: randomised quantile residual plots and Christmas tree plots. The Christmas tree plots clearly illustrate whether the observed data are within fluctuation bounds under the fitted model, whereas the randomised quantile residual plots use the cumulative distribution and hence are insensitive to individual data values. Both plots show the presence of citation counts in the data sets that are larger than expected under the fitted model.
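
The difference between the zero restriction and the convolution can be illustrated with a small simulation. The sketch below is not the thesis's construction, which this record does not give. It assumes a Poisson first generation and a negative binomial second generation, and it assumes that in the SVA-style case each first-generation event spawns its own second-generation count. The function names and the parameters `lam`, `r` and `p` are hypothetical.

```python
import numpy as np


def simulate_sva(n, lam, r, p, rng):
    """SVA-style draw (assumed form, not taken from the thesis).

    First generation ~ Poisson(lam). Each first-generation event spawns an
    independent NegBin(r, p) second-generation count, so a zero in the first
    generation forces a zero in the second. Returns both generations and
    their sum.
    """
    first = rng.poisson(lam, size=n)
    second = np.zeros(n, dtype=np.int64)
    pos = first > 0
    # The sum of k i.i.d. NegBin(r, p) variables is NegBin(k * r, p).
    second[pos] = rng.negative_binomial(first[pos] * r, p)
    return first, second, first + second


def simulate_svb(n, lam, r, p, rng):
    """SVB-style draw (assumed form): the two generations are independent,
    so the total is a plain convolution with no zero restriction."""
    first = rng.poisson(lam, size=n)
    second = rng.negative_binomial(r, p, size=n)
    return first, second, first + second


rng = np.random.default_rng(0)
a_first, a_second, a_total = simulate_sva(10_000, 1.0, 1.0, 0.5, rng)
b_first, b_second, b_total = simulate_svb(10_000, 1.0, 1.0, 0.5, rng)

# Under the zero restriction, no second-generation counts follow a zero.
print("SVA: second-generation counts after a first-generation zero:",
      int(a_second[a_first == 0].sum()))
# Under the convolution, a first-generation zero can still be followed by counts.
print("SVB: records with first == 0 and second > 0:",
      int(np.sum((b_first == 0) & (b_second > 0))))
```

Under these assumptions, a zero total in the SVA-style model can only arise from a first-generation zero, while in the SVB-style model both generations must independently be zero.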

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:715029
Date: January 2017
Creators: Low, Wan Jing
Publisher: University of Wolverhampton
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://hdl.handle.net/2436/620467
