  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
111

Methods for analysis of missing data using simulated longitudinal data with a binary outcome

Sloan, Lauren Elizabeth. January 2005 (has links) (PDF)
Thesis--University of Oklahoma. / Bibliography: leaves 62-63.
112

RJMCMC algorithm for multivariate Gaussian mixtures with applications in linear mixed-effects models

Ho, Kwok Wah. January 2005 (has links)
Thesis (Ph.D.)--Hong Kong University of Science and Technology, 2005. / Includes bibliographical references (leaves 77-82). Also available in electronic version.
113

A comparison of estimators in hierarchical linear modeling: restricted maximum likelihood versus bootstrap via minimum norm quadratic unbiased estimators

Delpish, Ayesha Nneka. Niu, Xu-Feng. January 2006 (has links)
Thesis (Ph. D.)--Florida State University, 2006. / Advisor: Xu-Feng Niu, Florida State University, College of Arts and Sciences, Dept. of Statistics. Title and description from dissertation home page (viewed Sept. 18, 2006). Document formatted into pages; contains ix, 116 pages. Includes bibliographical references.
114

Nonlinear time series modeling with application to finance and other fields

Jin, Shusong. January 2005 (has links)
Thesis (Ph. D.)--University of Hong Kong, 2005. / Title proper from title frame. Also available in printed format.
115

Investigating change in intraindividual factor structure over time

Rausch, Joseph R. January 2006 (has links)
Thesis (Ph. D.)--University of Notre Dame, 2006. / Thesis directed by Scott E. Maxwell and Steven M. Boker for the Department of Psychology. "July 2006." Includes bibliographical references (leaves 105-119).
116

An approach to estimating the variance components to unbalanced cluster sampled survey data and simulated data

Ramroop, Shaun 30 November 2002 (has links)
Statistics / M. Sc. (Statistics)
117

Supervised Learning of Piecewise Linear Models

Manwani, Naresh January 2012 (has links) (PDF)
Supervised learning of piecewise linear models is a well-studied problem in the machine learning community. The key idea in piecewise linear modeling is to partition the input space appropriately and learn a linear model for every partition. Decision trees and regression trees are classic examples of piecewise linear models for classification and regression problems. Existing approaches for learning decision/regression trees can be broadly classified into two classes: fixed-structure approaches and greedy approaches. In fixed-structure approaches, the tree structure is fixed beforehand by fixing the number of non-leaf nodes, the height of the tree, and the paths from the root node to every leaf node. Mixture of experts and hierarchical mixture of experts are examples of fixed-structure approaches for learning piecewise linear models. Parameters of such models are found using, for example, maximum likelihood estimation, for which the expectation-maximization (EM) algorithm can be used. Fixed-structure piecewise linear models can also be learnt by risk minimization under an appropriate loss function. Learning an optimal decision tree with a fixed-structure approach is a hard problem; constructing an optimal binary decision tree is known to be NP-complete. Greedy approaches, on the other hand, do not assume any parametric form or fixed structure for the decision tree classifier. Most greedy approaches learn tree-structured piecewise linear models in a top-down fashion, built by binary or multi-way recursive partitioning of the input space. The main issue in top-down decision tree induction is choosing an appropriate objective function to rate the split rules; the objective function should also be easy to optimize. Top-down decision trees are easy to implement and understand, but they carry no optimality guarantees due to their greedy nature. Regression trees are built in a similar way to decision trees.
In regression trees, every leaf node is associated with a linear regression function. All piecewise linear modeling techniques deal with two main tasks: partitioning the input space and learning a linear model for every partition. These two tasks are not independent, however; simultaneously finding optimal partitions and learning a linear model for each partition is a combinatorial problem and hence computationally hard. Nevertheless, piecewise linear models provide better insight into the classification or regression problem by giving an explicit representation of the structure in the data. The information they capture can be summarized in terms of simple rules that can be used to analyze the properties of the domain from which the data originate. These properties make piecewise linear models, such as decision trees and regression trees, extremely useful in many data mining applications and place them among the top data mining algorithms. In this thesis, we address the problem of supervised learning of piecewise linear models for classification and regression, proposing novel algorithms for learning piecewise linear classifiers and regression functions. We also address the problem of noise-tolerant learning of classifiers in the presence of label noise. We propose a novel algorithm for learning polyhedral classifiers, which are the simplest form of piecewise linear classifiers. Polyhedral classifiers are useful when the points of the positive class fall inside a convex region and all the negative-class points are distributed outside it; the positive-class region can then be well approximated by a simple polyhedral set. The key challenge in optimally learning a fixed-structure polyhedral classifier is to identify the sub-problems, where each sub-problem is a linear classification problem.
This is a hard problem, and deciding polyhedral separability is known to be NP-complete. The goal of any polyhedral learning algorithm is to handle the underlying combinatorial problem efficiently while achieving good classification accuracy. Existing methods for learning a fixed-structure polyhedral classifier are based on solving non-convex constrained optimization problems; these approaches do not handle the combinatorial aspect of the problem efficiently and are computationally expensive. We propose a method of model-based estimation of the posterior class probability to learn polyhedral classifiers. We solve an unconstrained optimization problem using a simple two-step algorithm (similar to the EM algorithm) to find the model parameters. To the best of our knowledge, this is the first attempt to formulate an unconstrained optimization problem for learning polyhedral classifiers. We then modify our algorithm to also determine the required number of hyperplanes automatically. We show experimentally that our approach outperforms existing polyhedral learning algorithms in terms of training time, performance, and model complexity. Often, class conditional densities are multimodal. In such cases, each class region may be represented as a union of polyhedral regions, and a single polyhedral classifier is not sufficient; a generic decision tree is required. Learning an optimal fixed-structure decision tree is computationally hard, while top-down decision trees have no optimality guarantees due to their greedy nature. Nevertheless, top-down decision tree approaches are widely used because they are versatile and easy to implement. Most existing top-down decision tree algorithms (CART, OC1, C4.5, etc.) use impurity measures to assess the goodness of hyperplanes at each node of the tree. These measures do not properly capture the geometric structure in the data.
We propose a novel decision tree algorithm that, at each node, selects hyperplanes based on an objective function that takes the geometric structure of the class regions into consideration. The resulting optimization problem turns out to be a generalized eigenvalue problem and can therefore be solved efficiently. We show through empirical studies that our approach leads to smaller trees and better performance than other top-down decision tree approaches, and we provide some theoretical justification for the proposed method. Piecewise linear regression is similar to the corresponding classification problem: in regression trees, for example, each leaf node is associated with a linear regression model, so the problem is once again the (simultaneous) estimation of optimal partitions and a linear model for each partition. Regression trees, the hinging hyperplanes method, and mixture of experts are some of the approaches for learning continuous piecewise linear regression models, but many of these algorithms are computationally intensive. We present a method of learning piecewise linear regression models that is computationally simple and capable of learning discontinuous functions as well. The method is based on K-plane regression, which can identify a set of linear models given the training data. K-plane regression is a simple algorithm motivated by the philosophy of k-means clustering. However, this simple algorithm has several problems: it does not give a model function with which to predict the target value for a given input, and it is very sensitive to noise. We propose a modified K-plane regression algorithm that can learn continuous as well as discontinuous functions. The proposed algorithm retains the spirit of the k-means algorithm and improves the objective function after every iteration, and it learns a proper piecewise linear model that can be used for prediction.
The algorithm is also more robust to additive noise than K-plane regression. When learning classifiers, one normally assumes that the class labels in the training data are noise-free. However, in many applications, such as spam filtering and text classification, the training data can be mislabeled due to subjective errors. In such cases, standard learning algorithms (SVM, AdaBoost, decision trees, etc.) overfit the noisy points, leading to poor test accuracy. Analyzing the vulnerability of classifiers to label noise has therefore attracted growing interest in the machine learning community. Existing noise-tolerant learning approaches first try to identify the noisy points and then learn a classifier on the remaining points. In this thesis, we instead address the development of learning algorithms that are inherently noise-tolerant: an algorithm is inherently noise-tolerant if the classifier it learns from noisy samples has the same performance on test data as one learnt from noise-free samples. Algorithms with such robustness (under suitable assumptions on the noise) are attractive for learning with noisy samples. Here, we consider non-uniform label noise, a generic noise model in which the probability that an example's class label is incorrect is a function of the example's feature vector. (We assume this probability is less than 0.5 for all feature vectors.) This model accounts for most cases of noisy data sets. There is no provably optimal algorithm for learning noise-tolerant classifiers in the presence of non-uniform label noise. We propose a novel characterization of the noise tolerance of an algorithm and analyze the noise tolerance properties of the risk minimization framework, since risk minimization is a common strategy for classifier learning. We show that risk minimization under the 0-1 loss has the best noise tolerance properties.
None of the standard convex loss functions has such noise tolerance properties. Empirical risk minimization under the 0-1 loss is a hard problem because the 0-1 loss function is not differentiable. We propose a gradient-free stochastic optimization technique to minimize the risk under the 0-1 loss for noise-tolerant learning of linear classifiers, and we show (under some conditions) that the algorithm converges asymptotically to the global minimum of the risk under the 0-1 loss. We demonstrate the noise tolerance of the algorithm through simulation experiments.
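For intuition, the basic K-plane regression alternation that the thesis builds on can be sketched as follows. This is a minimal illustration of the plain (unmodified) algorithm only; the function name, initialization, and stopping rule are chosen here for exposition and are not taken from the thesis.

```python
import numpy as np

def k_plane_regression(X, y, k=2, n_iter=30, seed=0):
    """Plain K-plane regression: alternately assign each point to the
    hyperplane that predicts it best, then refit each hyperplane by
    least squares on its assigned points (a k-means-style alternation)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])        # append a bias column
    W = rng.normal(size=(k, d + 1))             # k random affine models
    for _ in range(n_iter):
        # assignment step: squared residual of every point under every plane
        resid = (Xb @ W.T - y[:, None]) ** 2    # shape (n, k)
        labels = resid.argmin(axis=1)
        # refit step: ordinary least squares per plane
        for j in range(k):
            mask = labels == j
            if mask.sum() > d:                  # enough points to refit
                W[j] = np.linalg.lstsq(Xb[mask], y[mask], rcond=None)[0]
    return W, labels
```

As with k-means, the result depends on the initialization, so in practice one runs several random restarts and keeps the fit with the lowest total residual; the sensitivity of this basic scheme to noise and its lack of a single prediction function are exactly the problems the thesis's modified algorithm addresses.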
118

Distribuição espacial dos homicídios e a sua relação com os fatores socioeconômicos no Município de Fortaleza no triênio 2004-2006 / Spatial distribution of homicides and their relationship with socioeconomic factors in Fortaleza, 2004-2006

Geziel dos Santos de Sousa 03 June 2009 (has links)
Violence has become a public health problem in Brazil, where it is now the third leading cause of death; within this group, homicides are the main cause. The main objective of this study was to analyze the spatial distribution of homicides in Fortaleza over the 2004-2006 triennium in relation to socioeconomic factors. A record-linkage technique was used to recover the place of occurrence from the IML (forensic institute) records into the database of the Mortality Information System (SIM). A multivariate linear regression model was built to identify a linear statistical relationship between homicides and socioeconomic indicators. A total of 35,266 deaths of Fortaleza residents were recorded, of which 1,815 were homicide victims. The linkage procedure improved the information in the SIM, recovering the place of occurrence of the violent event for 93.6% of the records. For the spatial analysis, only the 1,699 deaths with an identified neighborhood of occurrence were considered. The main risk group for death by homicide was young males between 15 and 29 years of age, whose risk was 15.5 times that of females, of mixed race (parda), unmarried, and with low schooling. Spatial statistical analysis was performed by smoothing the rate variations with an empirical Bayesian method and by assessing spatial autocorrelation with the local Moran's I. The spatial distribution of homicides showed marked contrasts between areas with the worst and the best living conditions: 9.65% of the neighborhoods had both a low Human Development Index (HDI) and high homicide rates. Because the spatial distribution followed an irregular pattern, the homicide rate was smoothed, after which it behaved in a less fragmented way. No global spatial autocorrelation was detected, as assessed by the global Moran's I (I = 0.0425). The proposed regression model with five variables proved appropriate for the aims of this work, yielding a significant coefficient of determination (R² = 0.4567).
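The global Moran's I statistic used above to assess spatial autocorrelation is straightforward to compute from a vector of area rates and a spatial weight matrix. The sketch below is illustrative only: the toy contiguity matrix and values are invented for exposition, not the study's Fortaleza neighborhood data.

```python
import numpy as np

def morans_i(x, W):
    """Global Moran's I:
    I = (n / S0) * sum_ij W_ij * z_i * z_j / sum_i z_i^2,
    where z = x - mean(x) and S0 is the sum of all weights."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()            # deviations from the mean rate
    s0 = W.sum()                # total weight
    return (x.size / s0) * (z @ W @ z) / (z ** 2).sum()

# Toy example: four areas in a row, binary rook-contiguity weights.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(morans_i(np.array([1.0, 2.0, 3.0, 4.0]), W))    # smooth trend -> 1/3
print(morans_i(np.array([1.0, -1.0, 1.0, -1.0]), W))  # checkerboard -> -1.0
```

A value near zero, like the study's I = 0.0425, indicates little global spatial clustering, while values toward +1 or -1 indicate strong positive or negative spatial autocorrelation.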
119

Aspects of generalized additive models and their application in actuarial science

Amod, Farhaad 16 September 2015 (has links)
M.Sc. / Please refer to full text to view abstract
120

Missing Data Treatments at the Second Level of Hierarchical Linear Models

St. Clair, Suzanne W. 08 1900 (has links)
The current study evaluated the performance of traditional versus modern missing data treatments (MDTs) in the estimation of fixed effects and variance components for data missing at the second level of a hierarchical linear model (HLM) across 24 different study conditions. Variables manipulated in the analysis included (a) the number of Level-2 variables with missing data, (b) the percentage of missing data, and (c) the Level-2 sample size. Listwise deletion outperformed all other methods across all study conditions in the estimation of both fixed effects and variance components. The model-based procedures evaluated, EM and MI, outperformed the other traditional MDTs, mean and group mean substitution, in the estimation of the variance components, and also outperformed mean substitution in the estimation of the fixed effects. Group mean substitution performed well in the estimation of the fixed effects but poorly in the estimation of the variance components. Data in the current study were modeled as missing completely at random (MCAR). Further research is suggested to compare the performance of model-based versus traditional MDTs, specifically listwise deletion, when data are missing at random (MAR), a condition that is more likely to occur in practical research settings.
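The two simplest treatments compared in the study, listwise deletion and mean substitution, can be sketched generically as follows; this is an illustration in Python with hypothetical function names and toy data, not the study's code.

```python
import numpy as np

def listwise_deletion(X):
    """Drop every row (case) containing at least one missing value (NaN)."""
    return X[~np.isnan(X).any(axis=1)]

def mean_substitution(X):
    """Replace each missing value with its column mean, computed from the
    observed values in that column."""
    X = X.copy()
    col_means = np.nanmean(X, axis=0)       # per-variable observed means
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X

# Hypothetical Level-2 data: 4 units, 2 variables, 2 missing values.
X = np.array([[1.0, 2.0],
              [np.nan, 4.0],
              [3.0, np.nan],
              [5.0, 6.0]])
print(listwise_deletion(X))    # keeps only the 2 complete cases
print(mean_substitution(X))    # fills NaNs with column means 3.0 and 4.0
```

Under MCAR, listwise deletion discards information but does not systematically bias estimates, which is consistent with its strong performance reported above; mean substitution, by contrast, shrinks variability, which helps explain its weaker estimation of variance components.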
