631 |
Constrained Statistical Inference in Regression
Peiris, Thelge Buddika 01 August 2014
Regression analysis constitutes a large portion of the statistical repertoire in applications. In cases where such analysis is used for exploratory purposes, with no previous knowledge of the structure, one would not wish to impose any constraints on the problem. But in many applications we are interested in a simple parametric model to describe the structure of a system with some prior knowledge of the structure. An important example of this occurs when the experimenter has the strong belief that the regression function changes monotonically in some or all of the predictor variables in a region of interest. The analyses needed for statistical inference under such constraints are nonstandard. The specific aim of this study is to introduce a technique that can be used for statistical inference in multivariate simple regression under some nonstandard constraints.
|
632 |
Statistical methods for biodiversity assessment
Kumphakarm, Ratchaneewan January 2016
This thesis focuses on statistical methods for estimating the number of species, which is a natural index for measuring biodiversity. Both parametric and nonparametric approaches are investigated for this problem. Species abundance models, including homogeneous and heterogeneous models, are explored for species richness estimation. Two new improvements to the Chao estimator are developed using the Good-Turing coverage formula. Although the homogeneous abundance model is the simplest model, in practice species are collected with different probabilities. This leads to overdispersed data, zero inflation and a heavy tail. The Poisson-Tweedie distribution, a mixed-Poisson distribution that includes many special cases such as the negative-binomial, Poisson, Poisson inverse Gaussian and Pólya-Aeppli distributions, is explored for estimating the number of species. The weighted linear regression estimator based on the ratio of successive frequencies is applied to data generated from the Poisson-Tweedie distribution. Sparse data can produce zero frequencies for species seen $i$ times, in which case the weighted linear regression fails. A smoothing technique is therefore considered for improving the performance of the weighted linear regression estimator. Both simulated data and some real data sets are used to study the performance of parametric and nonparametric estimators in this thesis. Finally, the distribution of the number of distinct species found in a sample is hard to compute. Many approximations, including the Poisson, normal, COM-Poisson binomial, Altham's multiplicative-binomial and additive-binomial, and Pólya distributions, are used for approximating the distribution of distinct species. Under various abundance models, Altham's multiplicative-binomial approximation performs well.
Building on other recent work, the maximum likelihood and the maximum pseudo-likelihood estimators are applied with Altham's multiplicative-binomial approximation and compared with other estimators.
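As a point of reference for the estimators named in this abstract, the following is a minimal sketch of the classical Chao1 lower-bound estimator and the Good-Turing sample coverage. It does not reproduce the two new improvements developed in the thesis, which the abstract does not specify. The function names and the example abundance vector are illustrative assumptions.

```python
from collections import Counter


def frequency_counts(abundances):
    """Return a Counter mapping i -> f_i, the number of species seen exactly i times."""
    return Counter(a for a in abundances if a > 0)


def good_turing_coverage(abundances):
    """Good-Turing sample coverage estimate: C = 1 - f_1 / n."""
    n = sum(abundances)
    f1 = frequency_counts(abundances)[1]
    return 1.0 - f1 / n


def chao1(abundances):
    """Classical Chao1 estimate of species richness.

    Uses S_obs + f_1^2 / (2 f_2) when doubletons are present, and the
    bias-corrected form S_obs + f_1 (f_1 - 1) / 2 when f_2 = 0.
    """
    f = frequency_counts(abundances)
    s_obs = sum(f.values())
    f1, f2 = f[1], f[2]
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    return s_obs + f1 * (f1 - 1) / 2.0


# Hypothetical sample: each entry is the count of individuals for one species.
example_abundances = [10, 5, 3, 2, 2, 1, 1, 1, 1]
estimated_richness = chao1(example_abundances)
estimated_coverage = good_turing_coverage(example_abundances)
```

Both quantities depend only on the frequency counts $f_1$ and $f_2$ and the sample size, which is why sparse or heavy-tailed data of the kind described above can make them unstable.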
|
633 |
A statistical study of ship domains
Goodwin, Elisabeth M. January 1975
The thesis is an attempt to establish the water area required by any one ship for safe and efficient navigation. The concept of a ship domain has been considered, which may be defined as the effective area around a ship which a navigator would like to keep free with respect to other ships and stationary objects. This area will not be the same for all ships but will depend on a variety of factors such as speed, size of ship and density of traffic, among others. The first part of the project was concerned with the collection of data from two separate sources: the first being the performance of ships' officers in collision avoidance exercises on a marine radar simulator and the second being marine traffic surveys conducted in the Sunk area of the North Sea. The collection of data on ship movements and their processing for analysis by computer comprised the early work of the thesis. The next section of work was concerned with the development of a technique for evaluating the size of the domain and in particular the range of the domain boundary from the ship, referred to as the domange. Very little work appears to have been done on this topic previously, so several possibilities were considered before a decision was made as to the most suitable technique. Once this had been established, results were obtained for a variety of conditions such as different sea areas, length of ship and experience of the navigator, as well as those previously mentioned and others. The final part of the thesis considers possible applications of the results in a variety of situations which are of current and future interest in marine traffic studies.
|
634 |
Statistical Signal Processing for Graphs
January 2015
Analysis of social networks has the potential to provide insights into a wide range of applications. As datasets continue to grow, a key challenge is the lack of a widely applicable algorithmic framework for detection of statistically anomalous networks and network properties. Unlike traditional signal processing, where models of truth or empirical verification and background data exist and are often well defined, these features are commonly lacking in social and other networks. Here, a novel algorithmic framework for statistical signal processing for graphs is presented. The framework is based on the analysis of spectral properties of the residuals matrix. The framework is applied to the detection of innovation patterns in publication networks, leveraging well-studied empirical knowledge from the history of science. Both the framework itself and the application constitute novel contributions, while advancing algorithmic and mathematical techniques for graph-based data and understanding of the patterns of emergence of novel scientific research. Results indicate the efficacy of the approach and highlight a number of fruitful future directions. / Dissertation/Thesis / Doctoral Dissertation Applied Mathematics for the Life and Social Sciences 2015
|
635 |
Visualization of Statistical Contents
MEHMOOD, RAJA MAJID, IQBAL, GULRAIZ January 2009
Our project presents research on the visualization of statistical contents. Here we introduce the concepts of visualization, software quality metrics and our proposed visualization technique (line chart). Our aim is to study the existing techniques for visualization of software metrics and then to propose a visualization approach that is more time efficient and easier for the viewer to perceive. In this project, we focus on the practical aspects of visualizing multiple projects with respect to their versions and metrics. This project also gives an implementation of the proposed visualization techniques for software metrics. In this research-based work, we compare the proposed visualization approaches in practice. We discuss the software development life cycle of our proposed visualization system, and we also describe the complete implementation of the software.
|
636 |
Improved Statistics Handling
Karlslätt, David January 2009
Ericsson is a global provider of telecommunications systems equipment and related services for mobile and fixed network operators. 3Gsim is a tool used by Ericsson in tests of the 3G RNC node. In order to validate the tests, statistics are constantly gathered within 3Gsim, and users can use telnet to access the statistics using some system-specific 3Gsim commands. The statistics can be retrieved but are unstructured for the human eye and need parsing and arranging to be readable. The statistics handler implemented during this thesis gives users of 3Gsim the possibility to present the information that matches their personal interest. The implementation can produce one prototype output document which contains the most common statistics needed by the 3Gsim user. A main focus of this final thesis has been to simplify content and format control for the user as much as possible. Presenting and structuring information now comes down to simple text editing and rids the user of the time-consuming work of updating and recompiling the entire application. Earlier, scripts written in Perl, an iterative oriented language, were used for presenting the statistics. These scripts were often difficult to comprehend since there were many different authors with inadequate experience and knowledge. The new statistics handler has been written in Java, a high-level object-oriented language which should better suit the users and developers of 3Gsim.
|
637 |
Set representation by statistical properties
Marchant, Alexander January 2011
This thesis has investigated the apparent ability of the visual system to represent a set of similar objects with a summary description instead of information about the individual items themselves (Ariely, 2001; Chong and Treisman, 2005a). Summary descriptions can be based on set sizes that are beyond the capacity of focussed attention, leading to the proposal that a distributed attention mechanism, statistical processing, underlies this process (Chong and Treisman, 2003, 2005a, 2005b; Chong et al., 2008; Treisman, 2006). However, the conclusion that summary descriptions are formed by a mechanism involving distributed attention has been questioned on the basis of parsimony, and a proposal for the role of focussed attention strategies in producing these summary descriptions has been made (Myczek & Simons, 2008; Simons & Myczek, 2008; see also De Fockert & Marchant, 2008). The aim of this thesis was to further elucidate the process of set representation by statistical properties, exploring the evidence that the summary description is given preferential representational status over individual items (Chapter 2), that summary descriptions can be produced within the known capacity limits of focussed attention (Chapter 3), that the results found in these experiments are not affected by the development of a prototypical average across the experimental session (Chapter 4), and that similar summary descriptions may also be rapidly extracted from more complex stimuli (Chapter 5). These findings are discussed in the context of current average size perception theory, and the proposal of a dual process view of set representation by statistical properties is briefly outlined. The dual process view combines both focussed attention when stimulus complexity is low and/or cognitive resources are available and distributed attention when stimulus complexity is high and/or cognitive resources are restricted.
Finally, a selection of further studies and research areas that follow from the current research and the dual process view are briefly detailed.
|
638 |
Statistical thermodynamics of crystal lattices
Hooton, David John January 1953
No description available.
|
639 |
Statistical modelling in test cricket
Akhtar, Sohail January 2011
In this thesis, we focus on decision problems in test cricket. Initially, we address declaration and follow-on decision problems. We then investigate session by session batting and bowling strategy. Later, we extend our analysis to the rating of test cricket players. We also study how the nature and strength of the covariate effects in our match outcome models vary as a match progresses. We model the match outcome given the end of first, second and third innings positions and then use this for decision making. Our declaration models provide a decision support tool to a batting team captain and management to consider the best timing of declarations in the first three innings. Match outcome probabilities (win, draw, loss) are calculated using nominal multinomial logistic regression models. We also propose quantitative decision support for batting strategy in the third innings. We approach the statistical problem by supposing that the third innings run-rate and the target that the side batting third aims to set its opponent are decision variables. The follow-on decision problem is also briefly considered: should a captain enforce the follow-on or not? Surprisingly, we find that the decision to enforce the follow-on or otherwise has no effect on the match outcome. We forecast match outcomes in test cricket in play, session by session. Match outcome probabilities are modelled using multinomial regression, with a win, draw, or loss response, and explanatory variables or covariates relating to match state at the start of each session. These probabilities can help a team captain or management decide on an aggressive or defensive batting strategy for the coming session. The covariates include the lead, wicket resources used, run-rate, a home advantage factor, and surrogates for the state of the pitch (ground effect) and the pre-match strengths of teams. We attempt to compare our results with bookmakers' odds by means of examples.
This thesis also investigates how the covariate effects vary from innings to innings and session to session. The nature of the covariates that influence the match outcome changes as the match progresses. Early in the match, pre-match team strengths have a large effect. This reduces as the match progresses. Home advantage and ground effect appear small and exist only early on. We also extend our analysis to the rating of test cricket players. The rating system is based on player contributions session by session in a test match. This rating system evaluates the performance of the players taking into account the stage of match in which runs and wickets are earned and conceded and the influence of the runs and wickets earned on the match outcome.
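The match outcome probabilities described in this abstract come from a nominal multinomial logistic regression. The sketch below shows how such a model turns covariates into win, draw and loss probabilities, with "draw" taken as the baseline category. The coefficient values, the choice of baseline and the two covariates (lead in runs, wickets lost) are hypothetical placeholders chosen for illustration; they are not estimates from the thesis.

```python
import math


def outcome_probabilities(covariates, coefficients, baseline="draw"):
    """Multinomial-logit probabilities for each match outcome.

    coefficients maps each non-baseline outcome to (intercept, slopes).
    The baseline outcome has linear predictor 0 by convention.
    """
    linear_predictors = {baseline: 0.0}
    for outcome, (intercept, slopes) in coefficients.items():
        linear_predictors[outcome] = intercept + sum(
            b * x for b, x in zip(slopes, covariates)
        )
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(linear_predictors.values())
    weights = {k: math.exp(v - top) for k, v in linear_predictors.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# HYPOTHETICAL coefficients, not fitted values: (intercept, [lead, wickets_lost]).
HYPOTHETICAL_COEFFICIENTS = {
    "win": (-0.2, [0.01, -0.05]),
    "loss": (-0.2, [-0.01, 0.05]),
}

# Hypothetical match state: a lead of 150 runs with 3 wickets lost.
probabilities = outcome_probabilities([150, 3], HYPOTHETICAL_COEFFICIENTS)
```

Refitting the coefficients at the start of each session, as the abstract describes, would let the same function report how the influence of each covariate changes as the match progresses.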
|
640 |
Statistical Representations Of Microbial Systems
Vazquez Baeza, Yoshiki 06 January 2018
Technological developments in the past thirty years have transformed sequencing-based microbiology into a data-intensive field. Here, computing and efficient representations are catalyzers of insight into omnipresent and complex microbial interactions. Notably, classical ecologists have set the foundations for the way we analyze these systems, with some techniques dating back to the beginning of the twentieth century. In this thesis, we expand and where possible reuse these techniques to unravel the hidden patterns comprising the human gut microbiome.

To set an appropriate motivation and context for the rest of this work, Chapter 1 reviews recent discoveries on the human microbiome and how the communities within can influence the effectiveness of therapeutic agents. Next, in Chapter 2, we introduce EMPeror, an interactive analysis and visualization tool that is crucial to the findings presented in later chapters.

The following three chapters study concrete examples where the microbiome has been implicated as a driver or marker for dysbiosis. Chapter 3 describes how the microbial signature associated with Crohn's disease (CD) in humans, described in our previous work, is overlapping but distinct from that of dogs affected with inflammatory bowel disease (IBD). Surprisingly, unlike with humans, dog fecal samples alone are strong indicators of the disease. In Chapter 4, we study IBD from a longitudinal perspective, revealing increased volatility in the gut microbiomes of subjects with IBD, a property that does not appear to be present in unaffected controls. Furthermore, we use this as a predicting feature of the disease, and improve on the classification accuracy possible through a single fecal sample. In Chapter 5, we study the effect of fecal microbiota transplants (FMTs) to treat Clostridium difficile infection (CDI) and, using the techniques described in Chapter 2, we show the first animated visualization of this process, a dramatic microbial transformation as the subjects recover from all CDI symptoms. In addition, for CDI patients who also suffer from a subtype of IBD, treatment with an FMT results in an increased number of relapses and decreased microbial diversity.

The closing chapter discusses these results and their possible applications, as well as future directions for computationally-centric microbiome research.
|