  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Generalized linear models for large dependent data sets

Bate, Steven Mark January 2004
Generalized linear models (GLMs) were originally used to build regression models for independent responses. In recent years, however, effort has focused on extending the original GLM theory to enable it to be applied to data which exhibit dependence in the responses. This thesis focuses on some specific extensions of the GLM theory for dependent responses. A new hypothesis testing technique is proposed for the application of GLMs to cluster dependent data. The test is based on an adjustment to the 'independence' likelihood ratio test, which allows for the within cluster dependence. The performance of the new test, in comparison to established techniques, is explored. The application of the generalized estimating equations (GEE) methodology to model space-time data is also investigated. The approach allows for the temporal dependence via the covariates and models the spatial dependence using techniques from geostatistics. The application area of climatology has been used to motivate much of the work undertaken. A key attribute of climate data sets, in addition to exhibiting dependence both spatially and temporally, is that they are typically large in size, often running into millions of observations. Therefore, throughout the thesis, particular attention has focused on computational issues, to enable analysis to be undertaken in a feasible time frame. For example, we investigate the use of the GEE one-step estimator in situations where the application of the full algorithm is impractical. The final chapter of this thesis presents a climate case study. This involves wind speeds over northwestern Europe, which we analyse using the techniques developed.
2

Multivariate statistical process monitoring using classical multidimensional scaling

Mohd Yunus, Mohd Yusri January 2012
A new Multivariate Statistical Process Monitoring (MSPM) system, which comprises three main frameworks, is proposed where the system utilizes Classical Multidimensional Scaling (CMDS) as the main multivariate data compression technique instead of using the linear-based Principal Component Analysis (PCA). The conventional method, which usually applies a variance-covariance or correlation measure in developing the multivariate scores, is found to be inappropriate, especially in modelling nonlinear processes, where a high number of principal components will typically be required. Alternatively, the proposed method utilizes the inter-dissimilarity scales in describing the relationships among the monitored variables instead of a variance-covariance measure for the multivariate scores development. However, the scores are plotted in terms of variable structure, thus providing a different formulation of statistics for monitoring. Nonetheless, the proposed statistics still correspond to the conceptual objective of Hotelling’s T² and Squared Prediction Errors (SPE). The first framework corresponds to the original CMDS framework, whereas the second utilizes Procrustes Analysis (PA) functions, which are analogous to the concept of loading factors in PCA for score projection. Lastly, the final framework employs a dynamic mechanism of PA functions as an alternative for enhancing the procedures of the second approach. A simulated system of Continuous Stirred Tank Reactor with Recycle (CSTRwR) has been chosen for the demonstration, and the fault detection results were compared with the outcomes of PCA on the grounds of false alarm rates, total number of detected cases and also total number of fastest detection cases. The last two performance factors are obtained through fault detection time.
The overall outcomes show that the three CMDS-based systems give almost comparable performances to the linear PCA-based monitoring system when dealing with abrupt fault events, whereas the new systems have demonstrated significant improvement over the conventional method in detecting incipient fault cases. More importantly, this monitoring accomplishment can be efficiently executed based on a lower compressed dimensional space compared to the PCA technique, thus providing a much simpler solution. All of this evidence verifies that the proposed approaches are successfully developed conceptually as well as practically for monitoring while complying fundamentally with the principles and technical steps of the conventional MSPM system.
3

Some investigations in discriminant analysis with mixed variables

Mahat, Nor Idayu January 2006
The location model is a potential basis for discriminating between groups of objects with mixed types of variables. The model specifies a parametric form for the conditional distribution of the continuous variables given each pattern of values of the categorical variables, thus leading to a theoretical discriminant function between the groups. To conduct a practical discriminant analysis, the objects must first be sorted into the cells of a multinomial table generated from the categorical values, and the model parameters must then be estimated from the data. However, in many practical situations some of the cells are empty, which prevents simple implementation of maximum likelihood estimation and restricts the feasibility of linear model estimators to cases with relatively few categorical variables. This deficiency was overcome by non-parametric smoothing estimation proposed by Asparoukhov and Krzanowski (2000). Its usual implementation uses exponential and piece-wise smoothing functions for the continuous variables, and adaptive weighted nearest neighbour for the categorical variables. Despite increasing the range of applicability, the smoothing parameters that are chosen by maximising the leave-one-out pseudo-likelihood depend on distributional assumptions, while the smoothing method for the categorical variables produces erratic values if the number of variables is large. This thesis rectifies these shortcomings, and extends location model methodology to situations where there are large numbers of mixed categorical and continuous variables. Chapter 2 uses the simplest form of the exponential smoothing function for the continuous variables and describes how the smoothing parameters can instead be chosen by minimising either the leave-one-out error rate or the leave-one-out Brier score, neither of which makes distributional assumptions.
Alternative smoothing methods, namely a kernel and a weighted form of the maximum likelihood, are also investigated for the categorical variables. Numerical evidence in Chapter 3 shows that there is little to choose among the strategies for estimating smoothing parameters and among the smoothing methods for the categorical variables. However, some of the proposed smoothing methods are more feasible when the number of parameters to be estimated is reduced. Chapter 4 reviews previous work on problems of high dimensional feature variables, and focuses on selecting variables on the basis of the distance between groups. In particular, the Kullback-Leibler divergence is considered for the location model, but existing theory based on maximum likelihood estimators is not applicable for general cases. Chapter 5 therefore describes the implementation of this distance for smoothed estimators, and investigates its asymptotic distribution. The estimated distance and its asymptotic distribution provide a stopping rule in a sequence of searching processes, either by forward, backward or stepwise selections, following the test for no additional information. Simulation results in Chapter 6 exhibit the feasibility of the proposed variable selection strategies for large numbers of variables, but limitations in several circumstances are identified. Applications to real data sets in Chapter 7 show how the proposed methods are competitive with, and sometimes better than other existing classification methods. Possible future work is outlined in Chapter 8.
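The abstract above mentions choosing smoothing parameters by minimising a leave-one-out Brier score. As a rough illustration only (not the thesis's implementation), the sketch below computes the multi-class Brier score and picks the candidate parameter with the lowest score. The names `brier_score`, `choose_smoothing_parameter`, and the callback `loo_probabilities` are hypothetical; the callback is assumed to return, for a given smoothing parameter, one vector of leave-one-out class probabilities per object.

```python
def brier_score(probabilities, labels):
    """Mean multi-class Brier score.

    probabilities: list of per-object class-probability lists.
    labels: list of true class indices (0-based), one per object.
    """
    if len(probabilities) != len(labels) or not labels:
        raise ValueError("need one probability vector per label")
    total = 0.0
    for probs, label in zip(probabilities, labels):
        # Squared distance between the predicted probabilities and the
        # one-hot indicator vector of the true class.
        total += sum(
            (p - (1.0 if k == label else 0.0)) ** 2
            for k, p in enumerate(probs)
        )
    return total / len(labels)


def choose_smoothing_parameter(candidates, loo_probabilities, labels):
    """Return the candidate with the smallest Brier score.

    loo_probabilities is a hypothetical callback: given a candidate
    smoothing parameter, it returns leave-one-out class probabilities.
    """
    return min(
        candidates,
        key=lambda lam: brier_score(loo_probabilities(lam), labels),
    )
```

Because the score compares predicted probabilities directly with observed group membership, it needs no assumption about the distribution of the variables, which is the property the abstract highlights.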
4

Parametric and Bayesian non-parametric estimation of copulas

Nicoloutsopoulos, Dimitrios January 2005
This thesis studies parametric and non-parametric methods of copula estimation with special focus on the Archimedean class of copulas. The first part proposes an estimation procedure which is independent of the marginal distributions and performs well for one-parameter or two-parameter families of copulas, where traditional methods give questionable results especially for small sample sizes. In the second part we follow a Bayesian methodology and represent the copula density as a random piecewise constant function. Under the presence of some data, we set up a probability distribution over the copula density and utilize Markov Chain Monte Carlo techniques to explore that distribution. The methodology is extended to perform shape preserving estimation of a univariate convex and monotone function that characterizes the copula. The estimated first and second derivatives of the function of interest must satisfy the restrictions that the theory imposes. All methods are illustrated with examples from simulated samples and a real-life dataset of the daily observations of the Dow-Jones and FTSE financial indices.
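To make the idea of margin-free estimation for an Archimedean copula concrete, here is a minimal sketch using a familiar textbook approach, which is not necessarily the procedure proposed in the thesis: for the one-parameter Clayton family, Kendall's tau satisfies tau = theta / (theta + 2), so a rank-based tau can be inverted to estimate theta without modelling the marginals. The function names and the simulation settings are illustrative assumptions.

```python
import random


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) pairs / all pairs."""
    n = len(xs)
    balance = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if s > 0:
                balance += 1
            elif s < 0:
                balance -= 1
    return balance / (n * (n - 1) / 2)


def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2) for the Clayton family."""
    if not 0.0 < tau < 1.0:
        raise ValueError("this inversion assumes 0 < tau < 1")
    return 2.0 * tau / (1.0 - tau)


def sample_clayton(n, theta, seed=None):
    """Draw n pairs (u, v) from a Clayton copula by conditional inversion."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        # 1 - random() lies in (0, 1], which avoids u == 0.
        u = 1.0 - rng.random()
        w = 1.0 - rng.random()
        # Solve dC/du (u, v) = w for v, with
        # C(u, v) = (u**-theta + v**-theta - 1) ** (-1 / theta).
        v = (u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (
            -1.0 / theta
        )
        pairs.append((u, v))
    return pairs
```

A typical use is to simulate pairs with a known `theta`, compute `kendall_tau` on the two coordinates, and check that `clayton_theta_from_tau` recovers a value close to the one used for simulation. Since tau depends only on ranks, applying any increasing transformation to either coordinate leaves the estimate unchanged.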
5

Methods for the analysis of multivariate lifetime data with frailty

Fernandes Gomes da Silva, Alexandre Miguel January 2004
No description available.
6

Elicitation of multivariate prior distributions

Moala, Fernando Antonio January 2006
No description available.
7

Some novel nonlinear data fitting algorithms based on linear forms

Jenkinson, Damian Paul January 2006
No description available.
8

The Health economic applications of Copulas: methods in applied econometrics research

Quinn, Casey January 2007
This thesis presents copulas as a statistical methodology appropriate to applied health economic research. Like all applied economic and econometric analysis, health economics applies econometric methods under certain assumptions. I propose here that copulas be used in place of common assumptions made when analysing multivariate data, specifically the distributional and dependence assumptions commonly made about jointly dependent outcomes in health and health care.
9

Curved axes and trajectories for multidimensional scaling, with applications to sensory and consumer data

Bennett, Stephen John January 2008
The analysis of sensory and consumer-derived data involves the use of many different statistical techniques. The vast majority of these are multivariate in nature, for example, multidimensional scaling (MDS) and biplots. However, univariate techniques such as repeated measures analysis of variance and the Bradley-Terry model for paired comparison data are also common. This thesis introduces enhancements to MDS based on the use of curved axes and trajectories.
10

Some investigation on the analysis of distances

Yatim, Bidin bin January 2005
No description available.
