  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Statistical methods for certain large, complex data challenges

Li, Jun 15 November 2018
Big data concerns large-volume, complex, growing data sets, and it offers opportunities as well as challenges. This thesis focuses on statistical methods for several specific large, complex data challenges, each involving representation of data with a complex format, utilization of complicated information, and/or intensive computational cost. The first problem we work on is hypothesis testing for multilayer network data, motivated by an example in computational biology. We show how to represent the complex structure of a multilayer network as a single data point within the space of supra-Laplacians and then develop a central limit theorem and hypothesis testing theory for multilayer networks in that space. We develop both global and local testing strategies for mean comparison and investigate sample size requirements. The methods are applied to the motivating computational biology example and compared with the classic Gene Set Enrichment Analysis (GSEA); the comparison yields additional biological insights. The second problem is source detection in epidemiology, which is one of the most important issues in the control of epidemics. Ideally, we want to locate the sources based on the complete historical data. However, this is often infeasible, because the historical data are complex, high-dimensional, and cannot be fully observed. Epidemiologists have recognized the crucial role of human mobility as a proxy for a complete history, but little in the literature to date uses this information for source detection. We recast the source detection problem as identifying a relevant mixture component in a multivariate Gaussian mixture model. Human mobility within a stochastic PDE model is used to calibrate the parameters. The capability of our method is demonstrated in the context of the 2000-2002 cholera outbreak in the KwaZulu-Natal province. The third problem is multivariate time series imputation, a classic problem in statistics. To address the common problem of a low signal-to-noise ratio in high-dimensional multivariate time series, we propose models based on state-space models, which provide more precise inference of missing values by clustering multivariate time series components in a nonparametric way. The models are suitable for large-scale time series due to their efficient parameter estimation.
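The abstract does not spell out how a multilayer network becomes a single supra-Laplacian. A minimal sketch of one common construction follows, assuming a multiplex network with the same N nodes in every layer, symmetric adjacency matrices, and a uniform coupling weight between copies of the same node in different layers. The function name `supra_laplacian` and the parameter `omega` are illustrative assumptions and may differ from the definition used in the thesis.

```python
import numpy as np

def supra_laplacian(layers, omega=1.0):
    """Build a supra-Laplacian from a list of symmetric (N x N) layer adjacency matrices.

    Assumption: every node is coupled to its own copies in all other layers
    with the same weight `omega` (a hypothetical parameter for illustration).
    """
    layers = [np.asarray(a, dtype=float) for a in layers]
    n_layers = len(layers)
    n = layers[0].shape[0]

    # Intra-layer edges: place each layer's adjacency matrix on the block diagonal.
    supra_adj = np.zeros((n_layers * n, n_layers * n))
    for k, a in enumerate(layers):
        supra_adj[k * n:(k + 1) * n, k * n:(k + 1) * n] = a

    # Inter-layer edges: connect each node to its copy in every other layer.
    coupling = np.ones((n_layers, n_layers)) - np.eye(n_layers)
    supra_adj += omega * np.kron(coupling, np.eye(n))

    # Laplacian = degree matrix minus adjacency matrix.
    degree = np.diag(supra_adj.sum(axis=1))
    return degree - supra_adj

# Two layers on three nodes: a path in layer 1, a triangle in layer 2.
layer1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
layer2 = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
L = supra_laplacian([layer1, layer2], omega=0.5)
```

Under these assumptions each multilayer network maps to one symmetric, positive semidefinite matrix with zero row sums, which is the kind of object that can be treated as a single data point when comparing means across samples of networks.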
2

Symbolic regulation : human rights provisions in preferential trade agreements

Peacock, Claire January 2018
While the multilateral trading system views human and labour rights issues as outside of its remit, states increasingly incorporate regulation in these areas into their bilateral reciprocal preferential trade agreements, "HR-PTAs." This dissertation investigates the emergence of HR-PTAs, testing alternative explanations derived from conventional "public interest" and "private interest" theories of regulation against a new theory of "symbolic regulation." According to the public interest theory of regulation, regulation is motivated by benevolent legislators' commitment to correcting market or social problems. The private interest theory of regulation instead views regulation as the result of private interest groups capturing the regulatory apparatus in order to regulate in their own self-interest. Unlike its counterparts, the symbolic theory of regulation suggests that regulation may also be created for the primary purpose of reassuring regulatory advocates that their demands have been heard, rather than to regulate a given issue area. This dissertation argues that for the states behind them, HR-PTAs are primarily a symbolic form of regulation. Legislators create HR-PTAs to appease domestic human and labour rights organizations, while defending their trade interests through the non-enforcement of their provisions. Using longitudinal network analysis to analyse original data from 415 preferential trade agreements in force from 1989 to 2009, paired with case study evidence from the EU, US, and Canada, this dissertation finds support for the symbolic regulation explanation of HR-PTAs. It shows that a state's commitment to HR-PTAs depends less on the public interest or the desires of private interest groups than on its need to accommodate human and labour rights advocates. Symbolic regulation, however, should not be dismissed. It sets precedents, creates policy space, facilitates softer forms of cooperation, and can fuel accountability politics. When this occurs, states may use HR-PTAs or other forms of symbolic regulation to achieve their seeming purpose.
3

Hypothesis testing and community detection on networks with missingness and block structure

Guilherme Maia Rodrigues Gomes (8086652) 06 December 2019
Statistical analysis of networks has grown rapidly over the last few years, with an increasing number of applications. Graph-valued data carry additional information about dependencies, which opens the possibility of modeling highly complex objects in a vast number of fields such as biology (e.g. brain networks, fungi networks, gene co-expression), chemistry (e.g. molecular fingerprints), psychology (e.g. social networks) and many others (e.g. citation networks, word co-occurrences, financial systems, anomaly detection). While the inclusion of graph structure in the analysis can further help inference, even simple statistical tasks on a network are very complex. For instance, the assumption of exchangeability of the nodes or the edges is quite strong, and it brings issues such as sparsity, size bias and poor characterization of the generative process of the data. Solutions to these issues include adding specific constraints and assumptions on the data generation process. In this work, we approach this problem by assuming graphs are globally sparse but locally dense, which allows the exchangeability assumption to hold in local regions of the graph. We consider problems with two types of locality structure: block structure (also framed as multiple graphs or a population of networks) and unstructured sparsity, which can be seen as missing data. For the former, we develop a hypothesis testing framework for weighted aligned graphs and a spectral clustering method for community detection on populations of non-aligned networks. For the latter, we derive an efficient spectral clustering approach to learn the parameters of the zero-inflated stochastic blockmodel. Overall, we find that incorporating multiple locally dense structures leads to more precise and powerful local and global inference. This result indicates that this general modeling scheme allows the exchangeability assumption on the edges to hold while generating more realistic graphs. We give theoretical conditions for our proposed algorithms and evaluate them on synthetic and real-world datasets, showing that our models outperform the baselines in a number of settings.
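To make the idea of spectral clustering on a stochastic blockmodel concrete, here is a minimal sketch for the simplest case: a plain two-block model with no zero inflation or missingness. It is not the algorithm proposed in the thesis. The helper names `sample_sbm` and `two_block_spectral`, the edge probabilities, and the graph size are all assumptions chosen for illustration.

```python
import numpy as np

def sample_sbm(labels, p_in, p_out, rng):
    """Sample a symmetric, loop-free adjacency matrix from a stochastic blockmodel."""
    labels = np.asarray(labels)
    same_block = labels[:, None] == labels[None, :]
    probs = np.where(same_block, p_in, p_out)
    # Draw each undirected edge once (upper triangle), then mirror it.
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    return (upper | upper.T).astype(float)

def two_block_spectral(adj):
    """Split nodes into two groups by the sign of the second leading eigenvector."""
    eigvals, eigvecs = np.linalg.eigh(adj)  # eigenvalues in ascending order
    second = eigvecs[:, -2]                 # eigenvector of the second largest eigenvalue
    return (second > 0).astype(int)

rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 100)              # two communities of 100 nodes each
adj = sample_sbm(truth, p_in=0.5, p_out=0.05, rng=rng)
pred = two_block_spectral(adj)

# Community labels are only identified up to a swap, so score both labelings.
agreement = max(np.mean(pred == truth), np.mean(pred != truth))
print(f"fraction of nodes correctly grouped: {agreement:.2f}")
```

With more than two blocks, the usual extension is to take several leading eigenvectors and cluster their rows (for example with k-means). Handling zero inflation or non-aligned networks, as the abstract describes, needs additional modeling that this sketch does not attempt.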
