Global ETD Search

241	Fisher and logistic discriminant function estimation in the presence of collinearity O'Donnell, Robert P. (Robert Paul) 27 September 1990 (has links) The relative merits of the Fisher linear discriminant function (Efron, 1975) and logistic regression procedure (Press and Wilson, 1978; McLachlan and Byth, 1979), applied to the two group discrimination problem under conditions of multivariate normality and common covariance, have been debated. In related research, DiPillo (1976, 1977, 1979) has argued that a biased Fisher linear discriminant function is preferable when one or more collinearities exist among the classifying variables. This paper proposes a generalized ridge logistic regression (GRL) estimator as a logistic analog to DiPillo's biased alternative estimator. Ridge and Principal Component logistic estimators proposed by Schaefer et al. (1984) for conventional logistic regression are shown to be special cases of this generalized ridge logistic estimator. Two Fisher estimators (Linear Discriminant Function (LDF) and Biased Linear Discriminant Function (BLDF)) and three logistic estimators (Linear Logistic Regression (LLR), Ridge Logistic Regression (RLR) and Principal Component Logistic Regression (PCLR)) are compared in a Monte Carlo simulation under varying conditions of distance between populations, training set size and degree of collinearity. A new approach to the selection of the ridge parameter in the BLDF method is proposed and evaluated. The results of the simulation indicate that two of the biased estimators (BLDF, RLR) produce smaller MSE values and are more stable estimators (smaller standard deviations) than their unbiased counterparts. But the improved performance for MSE does not translate into equivalent improvement in error rates. The expected actual error rates are only marginally smaller for the biased estimators.
The results suggest that small training set size, rather than strong collinearity, may produce the greatest classification advantage for the biased estimators. The unbiased estimators (LDF, LLR) produce smaller average apparent error rates. The relative advantage of the Fisher estimators over the logistic estimators is maintained. But, given that the comparison is made under conditions most favorable to the Fisher estimators, the absolute advantage of the Fisher estimators is small. The new ridge parameter selection method for the BLDF estimator performs as well as, but no better than, the method used by DiPillo. The PCLR estimator shows performance comparable to the other estimators when there is a high level of collinearity. However, the estimator gives up a significant degree of performance in conditions where collinearity is not a problem. / Graduation date: 1991 Estimation theory Ridge regression (Statistics)
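The shrinkage effect that entry 241 attributes to biased estimators can be illustrated with a minimal sketch. It fits a logistic model by gradient descent with an ordinary ridge (L2) penalty on two nearly collinear predictors. This is not the GRL estimator proposed in the thesis, and the data, the function name `fit_logistic`, the penalty value and the step size are all assumptions made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, ridge_k=0.0, step=0.1, iterations=5000):
    """Gradient descent on the mean logistic loss plus (ridge_k / 2) * ||beta||^2.

    Hypothetical helper: no intercept, fixed step size, fixed iteration count.
    """
    n = len(rows)
    beta = [0.0] * len(rows[0])
    for _ in range(iterations):
        # Gradient of the ridge penalty term.
        grad = [ridge_k * b for b in beta]
        # Gradient of the mean logistic loss.
        for x, y in zip(rows, labels):
            p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            for j, xi in enumerate(x):
                grad[j] += (p - y) * xi / n
        beta = [b - step * g for b, g in zip(beta, grad)]
    return beta

def norm(beta):
    return math.sqrt(sum(b * b for b in beta))

# Made-up training set: the second predictor is almost a copy of the first.
rows = [(-2.0, -1.9), (-1.0, -1.1), (-0.5, -0.4), (0.0, 0.1),
        (0.5, 0.4), (1.0, 1.1), (1.5, 1.6), (2.0, 1.9)]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

beta_plain = fit_logistic(rows, labels, ridge_k=0.0)
beta_ridge = fit_logistic(rows, labels, ridge_k=1.0)

print("unpenalized coefficients:", beta_plain, "norm:", norm(beta_plain))
print("ridge coefficients:      ", beta_ridge, "norm:", norm(beta_ridge))
```

The ridge fit should give a shorter coefficient vector than the unpenalized fit, which is the bias-for-stability trade the abstract describes. Whether that helps classification error is a separate question, as the abstract itself notes.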
242	Diagnostic tools for overdispersion in generalized linear models Ganio-Gibbons, Lisa M. 18 August 1989 (has links) Data in the form of counts or proportions often exhibit more variability than that predicted by a Poisson or binomial distribution. Many different models have been proposed to account for extra-Poisson or extra-binomial variation. A simple model includes a single heterogeneity factor (dispersion parameter) in the variance. Other models that allow the dispersion parameter to vary between groups or according to a continuous covariate also exist but require a more complicated analysis. This thesis is concerned with (1) understanding the consequences of using an oversimplified model for overdispersion, (2) presenting diagnostic tools for detecting the dependence of overdispersion on covariates in regression settings for counts and proportions and (3) presenting diagnostic tools for distinguishing between some commonly used models for overdispersed data. The double exponential family of distributions is used as a foundation for this work. A double binomial or double Poisson density is constructed from a binomial or Poisson density and an additional dispersion parameter. This provides a completely parametric framework for modeling overdispersed counts and proportions. The first issue above is addressed by exploring the properties of maximum likelihood estimates obtained from incorrectly specified likelihoods. The diagnostic tools are based on a score test in the double exponential family. An attractive feature of this test is that it can be computed from the components of the deviance in the standard generalized linear model fit. A graphical display is suggested by the score test. For the normal linear model, which is a special case of the double exponential family, the diagnostics reduce to those for heteroscedasticity presented by Cook and Weisberg (1983). / Graduation date: 1990 Linear models (Statistics) Regression analysis
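The single dispersion parameter mentioned in entry 242 can be illustrated with a small sketch. It uses the standard Pearson chi-square estimate of dispersion for an intercept-only Poisson model, not the score test or double exponential family developed in the thesis. The counts and the function name `pearson_dispersion` are assumptions made up for illustration.

```python
def pearson_dispersion(counts, n_params=1):
    """Pearson chi-square statistic divided by residual degrees of freedom.

    For an intercept-only Poisson model the fitted mean is the sample mean,
    and the Poisson variance equals that mean. Values well above 1 suggest
    extra-Poisson variation.
    """
    n = len(counts)
    mean = sum(counts) / n
    pearson_chi2 = sum((y - mean) ** 2 / mean for y in counts)
    return pearson_chi2 / (n - n_params)

# Made-up counts with more spread than a Poisson with mean 4 would predict.
counts = [0, 1, 1, 2, 3, 5, 8, 12]
phi_hat = pearson_dispersion(counts)
print(round(phi_hat, 2))  # 4.29
```

An estimate near 1 would be consistent with Poisson variation. Here the estimate is about 4.3, so the variance is roughly four times what the Poisson model predicts.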
243	Design of a Software Application for Visualization of GPS and Vehicle Data Arslan, Recep Sinan Jr January 2009 (has links) I present an application for visualizing GPS data together with linear correlations and models. A collection of data for each vehicle is used to compute correlations. Deviating correlations can be indicative of a faulty vehicle. The correlation values for each vehicle are computed with linear regression algorithms using up to four signals in the data (with a varied time window), and the model parameters are displayed in a window next to the GPS map. Multiple measurements (multiple drive routes and multiple model parameters) are displayed at the same time, allowing tracking over time and comparison of different vehicles. The whole technique is demonstrated on three data sets, which the user selects in the first frame. The results are displayed with a Java application and Google Maps. Deviation Detection Linear Regression GPS
244	Vaccinering mot H1N1 : En studie av vad som påverkade svenska individers vaccinationsbeslut 2009 Altersved, Sofia, Mäkelä, Elin January 2012 (has links) The swine flu (H1N1) erupted in 2009, quickly spread over the world and developed into a pandemic that posed a great threat to people's health. It was soon discovered that the H1N1 virus had a different character than the seasonal flu, since it especially affected younger individuals and the consequences of the disease were expected to be more severe. In Sweden it was decided to provide a free-of-charge vaccination against the H1N1 virus, and the Swedish vaccination rate became relatively high compared to other countries. This thesis studies which factors affected the Swedish population's decision to take the flu shot against the H1N1 virus in 2009. This is done by a statistical study with a logistic regression analysis, which is conducted on secondary data. The results show that the probability of vaccination against H1N1 increases if the individual is over 60 years old, and increases with growing income. The results also show that women have a higher vaccination propensity than men. In contrast, there is no association between vaccination against H1N1 and the level of health or education. As the results were not entirely consistent with theories and previous studies, it can be concluded that it is difficult to determine how different factors actually affected individuals' vaccination decisions against H1N1. Possibly, this is due to the specific and extreme circumstances surrounding H1N1. Therefore, it may be difficult to predict how individuals will behave in the case of future pandemics. / The swine flu (H1N1) broke out in 2009 and spread rapidly across several countries of the world, developing into a pandemic that posed a great threat to people's health.
It was soon established that H1N1 was of a different character than the seasonal flu, since it primarily affected younger individuals and the consequences of the disease were expected to be more severe. In Sweden it was decided that the population would be offered a free vaccination, and the Swedish vaccination coverage rate became relatively high in comparison with many other countries. This thesis examines which factors influenced the Swedish population's decision on vaccination against the swine flu in 2009. This is done through a statistical study in the form of a logistic regression analysis carried out on secondary data. The results show that the probability of vaccination against H1N1 increases if the individual is over 60 years old, and increases with rising income. The results also show that women are more inclined than men to get vaccinated. In contrast, there is no association between health level or education and vaccination against H1N1. As the results were not entirely consistent with theories and previous studies, it can be concluded that it is difficult to determine how different factors influenced individuals' vaccination decisions against H1N1. Possibly this is due to the particular and extreme circumstances surrounding H1N1. Consequently, it may be difficult to predict how individuals will reason and act ahead of possible future pandemics. vaccination H1N1 logistic regression Sweden
245	Factors Affect the Employment of Youth in China Li, Xiaoxue January 2009 (has links) Today's young people are better educated than ever but face a poor employment situation. At the beginning of this paper, I describe the situation both in the world and in China, revealing the poor employment situation of youth. Then I introduce the systems related to youth employment in China and the measures the government has taken to help graduates find jobs. The purpose of this paper is to analyze the employment of young people in China, especially among the medium and highly educated, and to identify which factors contribute to it and how. Using logistic regression in Stata, I find that the main factors are gender, age, living area, political status, major and educational level. The results reveal that discrimination and the gap between rural and urban areas are severe issues in China. Last but not least, I give some suggestions to both society and the individual for improving youth employment. employment logistic regression Economics Nationalekonomi
246	Accounting for the effects of rehabilitation actions on the reliability of flexible pavements: performance modeling and optimization Deshpande, Vighnesh Prakash 15 May 2009 (has links) A performance model and a reliability-based optimization model for flexible pavements that account for the effects of rehabilitation actions are developed. The developed performance model can be effectively implemented in all the applications that require the reliability (performance) of pavements, before and after the rehabilitation actions. The response surface methodology in conjunction with Monte Carlo simulation is used to evaluate pavement fragilities. To provide more flexibility, a parametric regression model that expresses fragilities in terms of decision variables is developed. Developed fragilities are used as performance measures in a reliability-based optimization model. Three decision policies for rehabilitation actions are formulated and evaluated using a genetic algorithm. The multi-objective genetic algorithm is used for obtaining the optimal trade-off between performance and cost. To illustrate the developed model, a numerical study is presented. The developed performance model describes well the behavior of flexible pavement before as well as after rehabilitation actions. The sensitivity measures suggest that the reliability of flexible pavements before and after rehabilitation actions can effectively be improved by providing an asphalt layer as thick as possible in the initial design and improving the subgrade stiffness. The importance measures suggest that the asphalt layer modulus at the time of rehabilitation actions represents the principal uncertainty for the performance after rehabilitation actions. Statistical validation of the developed response model shows that the response surface methodology can be efficiently used to describe pavement responses.
The results for the parametric regression model indicate that the developed regression models are able to express the fragilities in terms of decision variables. Numerical illustration for optimization shows that the cost minimization and reliability maximization formulations can be efficiently used in determining optimal rehabilitation policies. Pareto optimal solutions obtained from the multi-objective genetic algorithm can be used to obtain a trade-off between cost and performance and avoid possible conflict between two decision policies. Reliability Pavements Rehabilitation Optimization Regression
247	Modeling and characterization of potato quality by active thermography Sun, Chih-Chen 15 May 2009 (has links) This research focuses on characterizing a potato with extra sugar content and identifying the location and depth of the extra sugar content using the active thermography imaging technique. The extra sugar content of the potato is an important problem for potato growers and potato chip manufacturers. Extra sugar content could result in diseases or wounds in the potato tuber. In general, potato tubers with low sugar content are considered to be of higher quality. The inspection system and general methodologies for characterizing extra sugar content are presented in this study. The average heating rate obtained from the thermal image analysis is the major factor in the characterization procedures. Using information on the average heating rate, the probability that a potato has extra sugar content may be predicted with a logistic regression model. In addition, neural networks are used to identify potatoes with extra sugar content. The correct rate for identifying a potato with extra sugar content can reach 85%. The location of extra sugar content can also be found using the logistic regression model. Results show that the overall correct rate for predicting the extra sugar content location with a resolution of 20 by 20 pixels is 91%. In predicting the extra sugar content depth, depths exceeding 2/3 inch are not detectable by analyzing thermal images. The depth of extra sugar content can be discriminated in 0.3 inch increments with a high rate of accuracy (87.5%). thermography logistic regression model potato
248	Statistical Relationships of the Tropical Rainfall Measurement Mission (TRMM) Precipitation and Large-scale Flow Borg, Kyle 2010 May 1900 (has links) The relationship between precipitation and large-scale flow is important to understand and characterize in the climate system. We examine statistical relationships between the Tropical Rainfall Measurement Mission (TRMM) 3B42 gridded precipitation and large-scale flow variables in the Tropics for 2000–2007. These variables include NCEP/NCAR Re-analysis sea surface temperatures (SSTs), vertical temperature profiles, omega, and moist static energy, as well as Atmospheric Infrared Sounder (AIRS) vertical temperatures and QuikSCAT surface divergence. We perform correlation analysis, empirical orthogonal function analysis, and logistic regression analysis on monthly, pentad, daily and near-instantaneous time scales. Logistic regression analysis is able to incorporate the non-linear nature of precipitation in the relationship. Flow variables are interpolated to the 0.25-degree TRMM 3B42 grid and examined separately for each month to offset the effects of the seasonal cycle. January correlations of NCEP/NCAR Re-analysis SSTs and TRMM 3B42 precipitation have a coherent area of positive correlations in the Western and Central Tropical Pacific on all time scales. These areas correspond with the South Pacific Convergence Zone (SPCZ) and the Intertropical Convergence Zone (ITCZ). 500 mb omega is negatively correlated with TRMM 3B42 precipitation across the Tropics on all time scales. QuikSCAT divergence correlations with precipitation have a band of weak and noisy correlations along the ITCZ on monthly time scales in January. Moist static energy, calculated from NCEP/NCAR Re-analysis, has a large area of negative correlations with precipitation in the Central Tropical Pacific on all four time scales.
The first few Empirical Orthogonal Functions (EOFs) of vertical temperature profiles in the Tropical Pacific have similar structure on monthly, pentad, and daily timescales. Logistic regression fit coefficients are large for SST and precipitation in four regions located across the Tropical Pacific. These areas show clear thresholded behavior. Logistic regression results for other variables and precipitation are less clear. The results from SST and precipitation logistic regression analysis indicate the potential usefulness of logistic regression as a non-linear statistic relating precipitation and certain flow variables. climate logistic regression TRMM precipitation
249	The Research in Key Factors of Credit Risk for Mortgage Hsu, Chao-Yi 06 July 2004 (has links) Credit risk has a great influence on the value of Mortgage-Backed Securities (MBS), but there isn't any appraiser in Taiwan to supervise and evaluate the institutions that issue these securities. To earn the public's trust, these institutions must be able to keep credit risk at an acceptable level, so that people will be willing to invest in these MBS. This research uses data totaling 20,576 cases (17,425 normal cases and 3,151 default cases) from a certain domestic bank, Bank P, and constructs a logistic regression model to conduct the empirical research. With correct prediction rates of 96.7% for normal cases, 85.4% for default cases, and 95% overall, we find that the borrower's age, occupation, the object of collateral, the use of collateral, the loan purpose, the year of loan, the line of credit, the category of interest, the interest rate, the source of case and the branch office can serve as key factors that banks can refer to in credit risk appraisal. In this study, we determine whether interest rate is the key factor for default, followed by occupation. The other two factors, the category of interest and the source of case, which are not widely discussed in related studies, are confirmed as significant factors influencing credit risk. The other important discovery is that the loan conditions and the characteristics of the collateral have a greater impact on credit risk than the personal characteristics of the borrower. This research provides some reference for financial institutions on credit evaluation, and builds a good model for credit control. For previously issued MBS, this research also provides some academic basis for future adaptation. Logistic Regression Credit Risk Mortgage
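The overall prediction rate quoted in entry 249 can be checked against the per-class rates and case counts given in the same abstract. The short sketch below does that arithmetic, treating the overall rate as the case-weighted average of the two class rates. The function name `overall_accuracy` is an assumption for illustration, and the rounded percentages are taken as given.

```python
# Case counts and per-class correct-prediction rates quoted in the abstract.
normal_cases = 17_425
default_cases = 3_151
normal_rate = 0.967
default_rate = 0.854

def overall_accuracy(n_normal, n_default, rate_normal, rate_default):
    """Case-weighted average of the two per-class correct-prediction rates."""
    correct = n_normal * rate_normal + n_default * rate_default
    return correct / (n_normal + n_default)

total_cases = normal_cases + default_cases
accuracy = overall_accuracy(normal_cases, default_cases, normal_rate, default_rate)

print(total_cases)        # 20576
print(f"{accuracy:.1%}")  # 95.0%
```

The two class counts sum to the stated 20,576 cases, and the weighted average comes out at about 95%, which agrees with the overall figure in the abstract.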
250	none Wu, Shin-Hwa 11 July 2005 (has links) none Cointegration Vector Auto-Regression Model

Search results