
Modeling and predicting time series of social activities with fat-tailed distributions

Fat-tailed distributions, characterized by the relation P(x) ∝ x^{−α−1}, are an emergent statistical signature of many complex systems, and in particular of social activities. These fat-tailed distributions are the outcome of dynamical processes that, contrary to the shape of the distributions, are in most cases unknown. Knowledge of these processes’ properties sheds light on how the events in these fat tails, i.e. extreme events, appear and whether it is possible to anticipate them. In this Thesis, we study how to model the dynamics that lead to fat-tailed distributions and the possibility of an accurate prediction in this context. To approach these problems, we focus on the study of attention to items (such as videos, forum posts or papers) on the Internet, since human interactions through online media leave digital traces that can be analysed quantitatively. We collect four sets of time series of online activity that show fat tails and we characterize them.

Of the many features that items in the datasets have, we need to know which ones are the most relevant to describe the dynamics, in order to include them in a model; we select the features that show high predictability, i.e. the capacity to make an accurate prediction based on that information. To quantify predictability we propose to measure the quality of the optimal forecasting method for extreme events, and we construct this measure. Applying these methods to data, we find that more extreme events (i.e. higher values of activity) are systematically more predictable, indicating that the possibility of discriminating successful items is enhanced. The simplest model that describes the dynamics of activity relates the increment of activity linearly to the last recorded value of activity. This starting point is known as the proportional effect, a celebrated and widely used class of growth models in complex systems, which leads to a distribution of activity that is fat-tailed. On the one hand, we show that this process can be described and generalized in the framework of Stochastic Differential Equations (SDE) with Normal noise; moreover, we formalize the methods to estimate the parameters of such SDEs. On the other hand, we show that the fluctuations of activity resulting from these models are not compatible with the data. We propose a model with proportional effect and Lévy-distributed noise that proves to be superior in describing the fluctuations around the average of the data and in predicting the possibility of an item becoming an extreme event.
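
As an illustration only, the proportional-effect starting point with Normal noise can be sketched as a simulation in which each increment of activity is proportional to the current activity. The function name, the parameters `mu`, `sigma` and `dt`, and the choice of the Itô equation dx = mu·x·dt + sigma·x·dW with an exact log-space update are assumptions made for this sketch; they are not taken from the Thesis, and the Lévy-noise model is not reproduced here.

```python
import math
import random


def simulate_proportional_effect(x0=1.0, steps=100, mu=0.01, sigma=0.1,
                                 dt=1.0, seed=None):
    """Simulate dx = mu*x*dt + sigma*x*dW (Ito form) and return the path.

    All parameter names and defaults are hypothetical. The update is done
    in log space, which keeps the simulated activity strictly positive.
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(steps):
        xi = rng.gauss(0.0, 1.0)  # standard Normal noise term
        # Growth factor multiplies the current value: proportional effect.
        x *= math.exp((mu - 0.5 * sigma ** 2) * dt
                      + sigma * math.sqrt(dt) * xi)
        path.append(x)
    return path


if __name__ == "__main__":
    # Final activity of many independent items, one seed per item.
    finals = sorted(simulate_proportional_effect(seed=s)[-1]
                    for s in range(2000))
    median = finals[len(finals) // 2]
    mean = sum(finals) / len(finals)
    print(f"median={median:.2f} mean={mean:.2f} max={finals[-1]:.2f}")
```

Under these assumptions the final values are log-normally distributed, so the sample mean typically lies above the median and the largest items are far larger than typical ones. This shows only the skewed outcome of multiplicative growth; it says nothing about whether such fluctuations match real data.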

However, it is possible to model the dynamics using more than just the last value of activity; we generalize the growth models used previously, and perform an analysis that indicates that the most relevant variable for a model is the last increment in activity. We propose a new model using only this variable and the fat-tailed noise, and we find that, in our data, this model is superior to the previous models, including the one we proposed. These results indicate that, even if present, the relevance of proportional effect as a generative mechanism for fat-tailed distributions is greatly reduced, since the dynamical equations of our models contain this feature in the noise. The implications of this new interpretation of growth models for the quantification of predictability are discussed, along with applications to other complex systems.

Identifier: oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:29878
Date: 17 August 2016
Creators: Miotto, José Maria
Contributors: Altmann, Eduardo G., Kantz, Holger, Ketzmerick, Roland, Peinke, Joachim, Technische Universität Dresden
Source Sets: Hochschulschriftenserver (HSSS) der SLUB Dresden
Language: English
Detected Language: English
Type: doc-type:doctoralThesis, info:eu-repo/semantics/doctoralThesis, doc-type:Text
Rights: info:eu-repo/semantics/openAccess
