Global ETD Search

1	Robust methods in logistic regression Nargis, Suraiya January 2005 (has links) My Master's research aims to deepen our understanding of the behaviour of robust methods in logistic regression. Logistic regression is a special case of Generalized Linear Modelling (GLM), which is a powerful and popular technique for modelling a large variety of data. Robust methods are useful in reducing the effect of outlying values in the response variable on parameter estimates. A literature survey shows that we are still at the beginning of being able to detect extreme observations in logistic regression analyses, to apply robust methods in logistic regression and to present the results of logistic regression analyses informatively. In Chapter 1 I give a basic introduction to logistic regression, with an example, and to robust methods in general. In Chapters 2 through 4 of the thesis I describe traditional methods and some relatively new methods for presenting results of logistic regression using powerful visualization techniques, as well as the concepts of outliers in binomial data. I use different published data sets for illustration, such as the Prostate Cancer data set, the Damaged Carrots data set and the Recumbent Cow data set. In Chapter 4 I summarize and report on the modern concepts of graphical methods, such as central dimension reduction, and the use of graphics as pioneered by Cook and Weisberg (1999). In Section 4.6 I then extend the work of Cook and Weisberg to robust logistic regression. In Chapter 5 I describe simulation studies to investigate the effects of outlying observations on logistic regression (robust and non-robust). In Section 5.2 I conclude that, for multiple logistic regression on data that contain no strong outliers, robust methods do not necessarily provide more reasonable parameter estimates than classical methods. 
In Section 5.4 I have looked into the cases where outliers are present and have come to the conclusion that either the breakdown method or a sensitivity analysis provides reasonable parameter estimates in that situation. Finally, I have identified areas for further study. logistic regression statistics robust methods Generalized Linear Modelling GLM
2	Robust Diagnostics for the Logistic Regression Model With Incomplete Data 范少華 Unknown Date (has links) Atkinson and Riani (2001) apply the forward search algorithm to the detection of multiple outliers in binomial data. In this thesis, we extend the same idea to identify multiple outliers in generalized linear models when part of the data is missing. The algorithm first uses an imputation method to fill in the missing observations and then applies the forward search algorithm to identify the outliers. The proposed method can overcome the masking effect, which commonly occurs when multiple outliers exist in the data. Real data are used to illustrate the procedure, and satisfactory results are obtained. EM algorithm Incomplete data generalized linear model high breakdown point robust methods
3	Detekce významných křivek na 3D povrchových modelech / Robust feature curve detection in 3D surface models Hmíra, Peter January 2015 (has links) Most current algorithms lack robustness to noise or do not handle T-shaped curve junctions properly. Detecting features in the noisy 3D data obtained from digital scanners is only part of the challenge. Moreover, most algorithms, even those that are robust to noise, lose the feature information near T-shaped junctions, because the triplet of lines "confuses" the algorithm so that it treats the junction as a plane.
4	Alguns métodos robustos para detectar outliers multivariados / Some robust methods to detect multivariate outliers Giroldo, Fabíola Rocha de Santana 07 March 2008 (has links) Unusual observations, or outliers, are almost always present in a data set, whether it is large or small. Outliers may arise from errors in recording the data or because some observations really are different from the others. The presence of these observations may distort the results of models and estimates. Their detection is therefore very important and should be performed before any detailed analysis, so that a decision can then be taken about the atypical observations. One possibility is to correct them if an error occurred in transcribing the data. If the observations are valid, they should be treated differently from the rest, either by weighting or by a special analysis. In univariate and bivariate data sets, outliers can be detected by examining the scatter plot: observations distant from the cloud formed by the data are considered unusual. In multivariate data sets, graphical detection is more difficult because the variables have to be analysed two at a time, which makes the process long and unreliable: an observation can be unusual with respect to some variables and not with respect to others, which masks the result. In this work, some robust methods for the detection of multivariate outliers are presented. Each method is applied to an example. Moreover, the methods are compared through their results on that example and by simulation. multivariate data multivariate outliers robust methods
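The difficulty of judging variables separately can be illustrated with a small sketch. It is not taken from the thesis: it uses the classical (non-robust) squared Mahalanobis distance on made-up bivariate data, and the function name `mahalanobis_sq` and the data values are assumptions chosen for illustration. The last point lies inside the range of each variable taken alone, yet it is the farthest point from the data cloud once both variables are considered together. A robust method would replace the sample mean and covariance used here with robust estimates.

```python
from statistics import mean

def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean,
    using the classical sample covariance matrix (illustrative, non-robust)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    n = len(points)
    mx, my = mean(xs), mean(ys)
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d' S^{-1} d written out with the closed-form inverse of a 2x2 matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances

# Hypothetical data: seven points close to the line y = x, plus one suspect.
data = [(-3.0, -2.8), (-2.0, -2.1), (-1.0, -0.9), (0.0, 0.2),
        (1.0, 0.9), (2.0, 2.1), (3.0, 2.8), (1.5, -1.5)]
d2 = mahalanobis_sq(data)
most_unusual = max(range(len(data)), key=lambda i: d2[i])
print("most unusual point:", data[most_unusual])
```

Because the suspect's coordinates (1.5 and -1.5) are unremarkable individually, a check on each variable in turn would not flag it; only the joint distance does.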
5	On Learning from Collective Data Xiong, Liang 01 December 2013 (has links) In many machine learning problems and application domains, the data are naturally organized in groups. For example, a video sequence is a group of images, an image is a group of patches, a document is a group of paragraphs or words, and a community is a group of people. We call such data collective data. In this thesis, we study how and what we can learn from collective data. Usually, machine learning focuses on individual objects, each of which is described by a feature vector and studied as a point in some metric space. When approaching collective data, researchers often reduce the groups to vectors to which traditional methods can be applied. We, on the other hand, try to develop machine learning methods that respect the collective nature of the data and learn from them directly. Several different approaches are taken to address this learning problem. When the groups consist of unordered discrete data points, each group can naturally be characterized by its sufficient statistic, the histogram. For this case we develop efficient methods based on matrix and tensor factorization to address the outliers and temporal effects in the data. To learn from groups that contain multi-dimensional real-valued vectors, we develop both generative methods based on hierarchical probabilistic models and discriminative methods using group kernels based on new divergence estimators. With these tools, we can accomplish various tasks such as classification, regression, clustering, anomaly detection, and dimensionality reduction on collective data. We further consider the practical side of the divergence-based algorithms. To reduce their time and space requirements, we evaluate and find methods that can effectively reduce the size of the groups with little impact on accuracy. 
We also propose the conditional divergence, along with an efficient estimator, in order to correct the sampling biases that might be present in the data. Finally, we develop methods to learn in cases where some divergences are missing because of either insufficient computational resources or extreme sampling biases. In addition to designing new learning methods, we use them to help the scientific discovery process. In our collaboration with astronomers and physicists, we see that the new techniques can indeed help scientists make the best of their data.
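As a rough illustration of comparing groups directly through their histograms, the sketch below represents each group of discrete items as a normalized histogram and measures how far apart two groups are with the Jensen-Shannon divergence. This is not the estimator developed in the thesis: the choice of divergence, the function names and the toy groups are all assumptions made for the example.

```python
from collections import Counter
from math import log2

def histogram(group, vocabulary):
    """Normalized histogram of a group of discrete items over a fixed vocabulary."""
    counts = Counter(group)
    total = len(group)
    return [counts[item] / total for item in vocabulary]

def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded by 1 when using log base 2."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)

# Hypothetical groups: each document is treated as an unordered bag of words.
vocabulary = ["star", "galaxy", "stock", "bond"]
doc_a = ["star", "galaxy", "star", "galaxy", "star"]
doc_b = ["galaxy", "star", "galaxy", "star"]
doc_c = ["stock", "bond", "stock", "bond"]

h_a, h_b, h_c = (histogram(d, vocabulary) for d in (doc_a, doc_b, doc_c))
print("a vs b:", js_divergence(h_a, h_b))
print("a vs c:", js_divergence(h_a, h_c))
```

A pairwise divergence of this kind could then feed a kernel or a nearest-neighbour rule, so that classification or anomaly detection operates on whole groups rather than on individual points.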
7	Robustní metody v teorii portfolia / Robust methods in portfolio theory Petrušová, Lucia January 2016 (has links) This thesis is concerned with robust methods in portfolio theory. Different risk measures used in portfolio management are introduced and the corresponding robust portfolio optimization problems are formulated. The analytical solutions of the robust portfolio optimization problem with the lower partial moments (LPM), value-at-risk (VaR) or conditional value-at-risk (CVaR) as the risk measure are presented. The application of the worst-case conditional value-at-risk (WCVaR) to robust portfolio management is proposed. This thesis considers WCVaR in the situation where only partial information on the underlying probability distribution is available. The minimization of WCVaR under mixture distribution uncertainty, box uncertainty, and ellipsoidal uncertainty is investigated. Several numerical examples based on real market data are presented to illustrate the proposed approaches and the advantage of the robust formulation over the corresponding nominal approach.
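The risk measures named in this abstract can be sketched on finite samples of losses. The code below is an illustration under stated assumptions, not the formulation used in the thesis: CVaR is computed by minimizing the Rockafellar-Uryasev function over a threshold `a`, and the worst case is taken over all mixtures of a finite set of candidate loss samples, for which a minimax argument gives the minimum over `a` of the largest Rockafellar-Uryasev value among the candidates. The function names, the sample losses and the use of a ternary search are all choices made for this example.

```python
def ru_objective(a, losses, alpha):
    """Rockafellar-Uryasev function: a + E[(L - a)+] / (1 - alpha),
    with the expectation taken over equally weighted sample losses."""
    excess = sum(max(loss - a, 0.0) for loss in losses) / len(losses)
    return a + excess / (1.0 - alpha)

def cvar(losses, alpha):
    """Empirical CVaR: the objective is convex and piecewise linear in a,
    with kinks only at sample values, so a minimizer is one of the samples."""
    return min(ru_objective(a, losses, alpha) for a in losses)

def worst_case_cvar(scenarios, alpha, iterations=200):
    """Worst-case CVaR over mixtures of the given loss samples (assumed setting):
    minimize over a the maximum of the per-scenario objectives.
    The maximum of convex functions is convex, so ternary search applies."""
    lo = min(min(s) for s in scenarios)
    hi = max(max(s) for s in scenarios)

    def worst(a):
        return max(ru_objective(a, s, alpha) for s in scenarios)

    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if worst(m1) <= worst(m2):
            hi = m2
        else:
            lo = m1
    return worst((lo + hi) / 2)

# Hypothetical portfolio losses under two candidate distributions.
calm = [float(v) for v in range(1, 11)]
stressed = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 12.0, 15.0]
alpha = 0.8
print("CVaR (calm):", cvar(calm, alpha))
print("CVaR (stressed):", cvar(stressed, alpha))
print("worst-case CVaR:", worst_case_cvar([calm, stressed], alpha))
```

In a portfolio problem the losses would depend on the portfolio weights, and the same objective would be minimized jointly over the weights and the threshold; the sketch fixes the losses to keep the risk measure itself in view.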
8	Robustní regrese - identifikace odlehlých pozorování / Robust regression - outlier detection Hradilová, Lenka January 2017 (has links) This master's thesis focuses on methods of outlier detection. The aim of this work is to assess the suitability of using robust methods on real data from EKO-KOM, a.s. The first part of the thesis provides an overview and a theoretical treatment of classical and robust methods of outlier detection. In the practical part of the thesis, these methods are applied to the data file obtained from EKO-KOM, a.s. The thesis concludes with recommendations on the suitability of the methods, based on a comparison of the classical and robust methods.
9	Student Ratings of Instruction: Examining the Role of Academic Field, Course Level, and Class Size Laughlin, Anne Margaret 11 April 2014 (has links) This dissertation investigated the relationship between course characteristics and student ratings of instruction at a large research intensive university. Specifically, it examined the extent to which academic field, course level, and class size were associated with variation in mean class ratings. Past research consistently identifies differences between student ratings in different academic fields, but offers no unifying conceptual framework for the definition or categorization of academic fields. Therefore, two different approaches to categorizing classes into academic fields were compared - one based on the institution's own academic college system and one based on Holland's (1997) theory of academic environments. Because the data violated assumptions of normality and homogeneity of variance, traditional ANOVA procedures were followed by post-hoc analyses using bootstrapping to more accurately estimate standard errors and confidence intervals. Bootstrapping was also used to determine the statistical significance of a difference between the effect sizes of academic college and Holland environment, a situation for which traditional statistical tests have not been developed. Findings replicate the general pattern of academic field differences found in prior research on student ratings and offer several unique contributions. They confirm the value of institution-specific approaches to defining academic fields and also indicate that Holland's theory of academic environments may be a useful conceptual framework for making sense of academic field differences in student ratings. Building on past studies that reported differences in mean ratings across academic fields, this study describes differences in the variance of ratings across academic fields. 
Finally, this study shows that class size and course level may impact student ratings differently - in terms of interaction effects and magnitude of effects - depending on the academic field of the course. / Ph. D. Student Evaluation of Teaching Student Ratings of Instruction Bootstrap Robust Methods
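The role of bootstrapping described in this abstract can be sketched with a percentile bootstrap for the difference between two mean class ratings. This is a minimal sketch under assumptions, not the dissertation's procedure: the ratings, the function name `bootstrap_ci`, the number of resamples and the fixed seed are all hypothetical. Resampling with replacement avoids relying on normality or equal variances when estimating the interval.

```python
import random
from statistics import mean

def bootstrap_ci(sample_a, sample_b, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for mean(sample_a) - mean(sample_b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping the original group sizes.
        resample_a = rng.choices(sample_a, k=len(sample_a))
        resample_b = rng.choices(sample_b, k=len(sample_b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    tail = (1.0 - level) / 2.0
    lower = diffs[int(tail * n_boot)]
    upper = diffs[int((1.0 - tail) * n_boot) - 1]
    return lower, upper

# Hypothetical mean class ratings in two academic fields.
field_a = [4.5, 4.7, 4.2, 4.8, 4.6, 4.4, 4.9, 4.3]
field_b = [3.1, 3.4, 2.9, 3.6, 3.2, 3.0, 3.5, 3.3]
low, high = bootstrap_ci(field_a, field_b)
print("95% bootstrap interval for the difference in means:", (low, high))
```

An interval that excludes zero would suggest a difference between the two fields; the same resampling loop can be reused for other statistics, such as a difference between effect sizes, by changing the quantity computed inside the loop.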
10	Modelos mistos lineares elípticos com erros de medição / Elliptical linear mixed models with measurement errors Borssoi, Joelmir André 20 February 2014 (has links) The main aim of this thesis is to study elliptical linear mixed models in which one of the explanatory variables (covariates) is measured with error, under the structural approach. The work is presented in longitudinal notation; however, the covariate measured with error may be observed over time or as repeated measures. An appropriate hierarchical structure with a joint elliptical distribution is assumed for the errors involved, but inference is developed under a marginal approach that considers the marginal distribution of the response and of the variable measured with error. Local influence procedures are developed, with the perturbation scheme chosen appropriately. A motivating example is presented and analysed using the procedures developed in this work. The main derivations needed to develop the proposed model are detailed in the appendices. 
diagnostic methods Elliptical models measurement error models mixed models robust methods
