Global ETD Search

1	Central Limit Theorem of Some Statistics Associated with Self-Normalized Subordinators Agbewu, Bright Mawusi Komla January 2019 (has links) Consider a population of m-type individuals labelled by {1,2,...,m}. Let x=(x_1,x_2,...,x_m) denote the relative frequencies of all types with x_i denoting the relative frequency of type i for 1<i<m. For a random sample of size 2 from the population, the probability that the individuals of the sample are of the same type is given by H= sum of the squares of x_i's up to m. In this thesis, we focus on the case where x = (x_1,x_2,...x_m) is a random vector. The quantity H appears in various fields of study. For instance, it is associated with the Shannon entropy in communication, the Herfindahl-Hirschman index in economics and known as the homozygosity in population genetics. In Feng (2010), fluctuation theorems for the infinite dimensional case of H are considered. In this thesis we present, under a moment assumption, a Central Limit Theorem (CLT) associated with H and present as examples the Gamma subordinator case, which is a well known result by Griffiths (1979), and the generalized Gamma subordinator case. / Thesis / Master of Science (MSc) subordinators, Stein's method
2	Barabási-Albert random graphs, scale-free distributions and bounds for approximation through Stein's method Ford, Elizabeth January 2009 (has links) Barabási-Albert random graph models are a class of evolving random graphs that are frequently used to model social networks with scale-free degree distributions. It has been shown that Barabási-Albert random graph models have asymptotic scale-free degree distributions as the size of the graph tends to infinity. Real world networks, however, have finite size so it is important to know how close the degree distribution of a Barabási-Albert random graph of a given size is to its asymptotic distribution. Stein’s method is chosen as one main method for obtaining explicit bounds for the distance between distributions. We derive a new version of Stein’s method for a class of scale-free distributions and apply the method to a Barabási-Albert random graph. We compare the evolution of a sequence of Barabási-Albert random graphs with continuous time stochastic processes motivated by Yule’s model for evolution. Through a coupling of the models we bound the total variation distance between their degree distributions. Using these bounds, we extend degree distribution bounds that we find for specific models within the scheme to find bounds for every member of the scheme. We apply the Azuma-Hoeffding inequality and Chernoff bounds to find bounds between the degree sequences of the random graph models and the given scale-free distribution. These bounds prove that the degree sequences converge completely (and therefore also converge almost surely) to our scale-free distribution. We discuss the relationship between the random graph processes and the Chinese restaurant process. Aided by the construction of an inhomogeneous Markov chain, we apply our results for the degree distribution in a Barabási-Albert random graph to a particular statistic of the Chinese restaurant process. Finally, we explore how our methods can be adapted and extended to other evolving random graph processes. We study a Bernoulli evolving random graph process, for which we bound the distance between its degree distribution and a geometric distribution and we bound the distance between the number of triangles in the graph and a normal distribution. 519
3	Statistical Learning and Model Criticism for Networks and Point Processes Jiasen Yang (7027331) 16 August 2019 (has links) <div>Networks and point processes provide flexible tools for representing and modeling complex dependencies in data arising from various social and physical domains. Graphs, or networks, encode relational dependencies between entities, while point processes characterize temporal or spatial interactions among events.</div><div><br></div><div>In the first part of this dissertation, we consider dynamic network data (such as communication networks) in which links connecting pairs of nodes appear continuously over time. We propose latent space point process models to capture two different aspects of the data: (i) communication occurs at a higher rate between individuals with similar latent attributes (i.e., homophily); and (ii) individuals tend to reciprocate communications from others, but in a varied manner. Our framework marries ideas from point process models, including Poisson and Hawkes processes, with ideas from latent space models of static networks. We evaluate our models on several real-world datasets and show that a dual latent space model, which accounts for heterogeneity in both homophily and reciprocity, significantly improves performance in various link prediction and network embedding tasks.</div><div><br></div><div>In the second part of this dissertation, we develop nonparametric goodness-of-fit tests for discrete distributions and point processes that contain intractable normalization constants, providing the first generally applicable and computationally feasible approaches under those circumstances. Specifically, we propose and characterize Stein operators for discrete distributions, and construct a general Stein operator for point processes using the Papangelou conditional intensity function. Based on the proposed Stein operators, we establish kernelized Stein discrepancy measures for discrete distributions and point processes, which enable us to develop nonparametric goodness-of-fit tests for un-normalized density/intensity functions. We apply the kernelized Stein discrepancy tests to discrete distributions (including network models) as well as temporal and spatial point processes. Our experiments demonstrate that the proposed tests typically outperform two-sample tests based on the maximum mean discrepancy, which, unlike our goodness-of-fit tests, assume the availability of exact samples from the null model.</div><div><br></div> Statistics Network data Point process Model criticism Stein's method Kernel methods Goodness-of-fit testing
4	Stochastic Analysis of Networked Systems January 2020 (has links) abstract: This dissertation presents a novel algorithm for recovering missing values of co-evolving time series with partial embedded network information. The idea is to connect two sources of data through a shared low dimensional latent space. The proposed algorithm, named NetDyna, is an Expectation-Maximization algorithm, and uses the Kalman filter and matrix factorization approaches to infer the missing values both in the time series and embedded network. The experimental results on real datasets, including a Motes dataset and a Motion Capture dataset, show that (1) NetDyna outperforms other state-of-the-art algorithms, especially with partially observed network information; (2) its computational complexity scales linearly with the time duration of time series; and (3) the algorithm recovers the embedded network in addition to missing time series values. This dissertation also studies a load balancing algorithm, the so called power-of-two-choices(Po2), for many-server systems (with N servers) and focuses on the convergence of stationary distribution of Po2 in the both light and heavy traffic regimes to the solution of mean-field system. The framework of Stein’s method and state space collapse (SSC) are used to analyze both regimes. In both regimes, the thesis first uses the argument of state space collapse to show that the probability of the state being far from the mean-field solution is small enough. By a simple Markov inequality, it is able to show that the probability is indeed very small with a proper choice of parameters. Then, for the state space close to the solution of mean-field model, the thesis uses Stein’s method to show that the stochastic system is close to a linear mean-field model. By characterizing the generator difference, it is able to characterize the dominant terms in both regimes. Note that for heavy traffic case, the lower and upper bound analysis of a tridiagonal matrix, which arises from the linear mean-field model, is needed. From the dominant term, it allows to calculate the coefficient of the convergence rate. In the end, comparisons between the theoretical predictions and numerical simulations are presented. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2020 Electrical engineering co-evolving time series light and heavy traffic missing value recovery power-of-two-choices state space collapse Stein's method
5	Steady State Analysis of Load Balancing Algorithms in the Heavy Traffic Regime January 2019 (has links) abstract: This dissertation studies load balancing algorithms for many-server systems (with N servers) and focuses on the steady-state performance of load balancing algorithms in the heavy traffic regime. The framework of Stein’s method and (iterative) state-space collapse (SSC) are used to analyze three load balancing systems: 1) load balancing in the Sub-Halfin-Whitt regime with exponential service time; 2) load balancing in the Beyond-Halfin-Whitt regime with exponential service time; 3) load balancing in the Sub-Halfin-Whitt regime with Coxian-2 service time. When in the Sub-Halfin-Whitt regime, the sufficient conditions are established such that any load balancing algorithm that satisfies the conditions have both asymptotic zero waiting time and zero waiting probability. Furthermore, the number of servers with more than one jobs is o(1), in other words, the system collapses to a one-dimensional space. The result is proven using Stein’s method and state space collapse (SSC), which are powerful mathematical tools for steady-state analysis of load balancing algorithms. The second system is in even “heavier” traffic regime, and an iterative refined procedure is proposed to obtain the steady-state metrics. Again, asymptotic zero delay and waiting are established for a set of load balancing algorithms. Different from the first system, the system collapses to a two-dimensional state-space instead of one-dimensional state-space. The third system is more challenging because of “non-monotonicity” with Coxian-2 service time, and an iterative state space collapse is proposed to tackle the “non-monotonicity” challenge. For these three systems, a set of load balancing algorithms is established, respectively, under which the probability that an incoming job is routed to an idle server is one asymptotically at steady-state. The set of load balancing algorithms includes join-the-shortest-queue (JSQ), idle-one-first(I1F), join-the-idle-queue (JIQ), and power-of-d-choices (Pod) with a carefully-chosen d. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2019 Electrical engineering Operations research asymptotic zero delay data centers heavy traffic regime Load balancing state space collapse Stein's method
6	Malliavin-Stein Method in Stochastic Geometry Schulte, Matthias 19 March 2013 (has links) In this thesis, abstract bounds for the normal approximation of Poisson functionals are computed by the Malliavin-Stein method and used to derive central limit theorems for problems from stochastic geometry. As a Poisson functional we denote a random variable depending on a Poisson point process. It is known from stochastic analysis that every square integrable Poisson functional has a representation as a (possibly infinite) sum of multiple Wiener-Ito integrals. This decomposition is called Wiener-Itô chaos expansion, and the integrands are denoted as kernels of the Wiener-Itô chaos expansion. An explicit formula for these kernels is known due to Last and Penrose. Via their Wiener-Itô chaos expansions the so-called Malliavin operators are defined. By combining Malliavin calculus and Stein's method, a well-known technique to derive limit theorems in probability theory, bounds for the normal approximation of Poisson functionals in the Wasserstein distance and vectors of Poisson functionals in a similar distance were obtained by Peccati, Sole, Taqqu, and Utzet and Peccati and Zheng, respectively. An analogous bound for the univariate normal approximation in Kolmogorov distance is derived. In order to evaluate these bounds, one has to compute the expectation of products of multiple Wiener-Itô integrals, which are complicated sums of deterministic integrals. Therefore, the bounds for the normal approximation of Poisson functionals reduce to sums of integrals depending on the kernels of the Wiener-Itô chaos expansion. The strategy to derive central limit theorems for Poisson functionals is to compute the kernels of their Wiener-Itô chaos expansions, to put the kernels in the bounds for the normal approximation, and to show that the bounds vanish asymptotically. By this approach, central limit theorems for some problems from stochastic geometry are derived. Univariate and multivariate central limit theorems for some functionals of the intersection process of Poisson k-flats and the number of vertices and the total edge length of a Gilbert graph are shown. These Poisson functionals are so-called Poisson U-statistics which have an easier structure since their Wiener-Itô chaos expansions are finite, i.e. their Wiener-Itô chaos expansions consist of finitely many multiple Wiener-Itô integrals. As examples for Poisson functionals with infinite Wiener-Itô chaos expansions, central limit theorems for the volume of the Poisson-Voronoi approximation of a convex set and the intrinsic volumes of Boolean models are proven. Malliavin calculus Malliavin-Stein method Poisson point process Stein's method Stochastic geometry Wiener-Itô chaos expansion ddc:510
7	Limit theorems in preferential attachment random graphs Betken, Carina 17 May 2019 (has links) We consider a general preferential attachment model, where the probability that a newly arriving vertex connects to an older vertex is proportional to a (sub-)linear function of the indegree of the older vertex at that time. We provide a limit theorem with rates of convergence for the distribution of a vertex, chosen uniformly at random, as the number of vertices tends to infinity. To do so, we develop Stein's method for a new class of limting distributions including power-laws. Similar, but slightly weaker results are shown to be deducible using coupling techniques. Concentrating on a specific preferential attachment model we also show that the outdegree distribution asymptotically follows a Poisson law. In addition, we deduce a central limit theorem for the number of isolated vertices. We thereto construct a size-bias coupling which in combination with Stein’s method also yields bounds on the distributional distance. preferential attachment random graphs Stein's method limiting distribution rates of convergence coupling power-law distribution 31.70 - Wahrscheinlichkeitsrechnung ddc:510
8	Rates of convergence of variance-gamma approximations via Stein's method Gaunt, Robert E. January 2013 (has links) Stein's method is a powerful technique that can be used to obtain bounds for approximation errors in a weak convergence setting. The method has been used to obtain approximation results for a number of distributions, such as the normal, Poisson and Gamma distributions. A major strength of the method is that it is often relatively straightforward to apply it to problems involving dependent random variables. In this thesis, we consider the adaptation of Stein's method to the class of Variance-Gamma distributions. We obtain a Stein equation for the Variance-Gamma distributions. Uniform bounds for the solution of the Symmetric Variance-Gamma Stein equation and its first four derivatives are given in terms of the supremum norms of derivatives of the test function. New formulas and inequalities for modified Bessel functions are obtained, which allow us to obtain these bounds. We then use local approach couplings to obtain bounds on the error in approximating two asymptotically Variance-Gamma distributed statistics by their limiting distribution. In both cases, we obtain a convergence rate of order n<sup>-1</sup> for suitably smooth test functions. The product of two normal random variables has a Variance-Gamma distribution and this leads us to consider the development of Stein's method to the product of r independent mean-zero normal random variables. An elegant Stein equation is obtained, which motivates a generalisation of the zero bias transformation. This new transformation has a number of interesting properties, which we exploit to prove some limit theorems for statistics that are asymptotically distributed as the product of two central normal distributions. The Variance-Gamma and Product Normal distributions arise as functions of the multivariate normal distribution. We end this thesis by demonstrating how the multivariate normal Stein equation can be used to prove limit theorems for statistics that are asymptotically distributed as a function of the multivariate normal distribution. We establish some sufficient conditions for convergence rates to be of order n<sup>-1</sup> for smooth test functions, and thus faster than the O(n<sup>-1/2</sup>) rate that would arise from the Berry-Esseen Theorem. We apply the multivariate normal Stein equation approach to prove Variance-Gamma and Product Normal limit theorems, and we also consider an application to Friedman's X<sup>2</sup> statistic. 519.2
9	Algorithmes d'apprentissage statistique pour l'analyse géométrique et topologique de données / Statistical learning algorithms for geometric and topological data analysis Bonis, Thomas 01 December 2016 (has links) Dans cette thèse, on s'intéresse à des algorithmes d'analyse de données utilisant des marches aléatoires sur des graphes de voisinage, ou graphes géométriques aléatoires, construits à partir des données. On sait que les marches aléatoires sur ces graphes sont des approximations d'objets continus appelés processus de diffusion. Dans un premier temps, nous utilisons ce résultat pour proposer un nouvel algorithme de partitionnement de données flou de type recherche de modes. Dans cet algorithme, on définit les paquets en utilisant les propriétés d'un certain processus de diffusion que l'on approche par une marche aléatoire sur un graphe de voisinage. Après avoir prouvé la convergence de notre algorithme, nous étudions ses performances empiriques sur plusieurs jeux de données. Nous nous intéressons ensuite à la convergence des mesures stationnaires des marches aléatoires sur des graphes géométriques aléatoires vers la mesure stationnaire du processus de diffusion limite. En utilisant une approche basée sur la méthode de Stein, nous arrivons à quantifier cette convergence. Notre résultat s'applique en fait dans un cadre plus général que les marches aléatoires sur les graphes de voisinage et nous l'utilisons pour prouver d'autres résultats : par exemple, nous arrivons à obtenir des vitesses de convergence pour le théorème central limite. Dans la dernière partie de cette thèse, nous utilisons un concept de topologie algébrique appelé homologie persistante afin d'améliorer l'étape de "pooling" dans l'approche "sac-de-mots" pour la reconnaissance de formes 3D. / In this thesis, we study data analysis algorithms using random walks on neighborhood graphs, or random geometric graphs. It is known random walks on such graphs approximate continuous objects called diffusion processes. In the first part of this thesis, we use this approximation result to propose a new soft clustering algorithm based on the mode seeking framework. For our algorithm, we want to define clusters using the properties of a diffusion process. Since we do not have access to this continuous process, our algorithm uses a random walk on a random geometric graph instead. After proving the consistency of our algorithm, we evaluate its efficiency on both real and synthetic data. We then deal tackle the issue of the convergence of invariant measures of random walks on random geometric graphs. As these random walks converge to a diffusion process, we can expect their invariant measures to converge to the invariant measure of this diffusion process. Using an approach based on Stein's method, we manage to obtain quantitfy this convergence. Moreover, the method we use is more general and can be used to obtain other results such as convergence rates for the Central Limit Theorem. In the last part of this thesis, we use the concept of persistent homology, a concept of algebraic topology, to improve the pooling step of the bag-of-words approach for 3D shapes. Graphes géométriques aléatoires Marches aléatoires Partitionnement de données flou Méthode de Stein Homologie persistante Sac-de-mots Random geometric graphs Random walks Soft clustering Stein's method Persistent homology Bag-of-words

Search results