301

An Economic Study of the Influential Factors Impacting the College Readiness of Secondary Students

Stewart, Morgan 01 December 2015 (has links)
For many young people in their junior year of high school, the pressure of getting into the post-secondary institution of their choice is a nerve-racking experience. For months they study for standardized tests and compile their greatest achievements to prove they are worthy of acceptance into these prestigious universities. However, preparation for college starts well before the application season, which leads one to wonder what influential factors could affect a student's odds of succeeding in college once accepted. This study examines the influential factors that affect a student's college readiness. The factors tested will be parent income, total enrollment of the high school, total number of school days in a year, average class size in the high school, and the teacher quality of that high school. A multiple regression will be used to test these independent variables against each high school's percentage of graduates ready for college. The slope parameters of the model will be tested through t-tests, p-values, and F-tests. The sample will consist of Illinois high schools that have completed an Illinois High School Report Card required by the No Child Left Behind law. In addition, a ten-question survey will be distributed to a population of fifty college students at SIU, focusing on factors they believe have been influential in their college success. This study aims to improve the understanding of the factors that go into equipping high school students for a milestone that can ultimately affect their economic outcomes in life.
302

Threshold Regression Estimation via Lasso, Elastic-Net, and LAD-Lasso: A Simulation Study with Applications to Urban Traffic Data

January 2015 (has links)
Threshold regression is used to model regime-switching dynamics in which the effects of the explanatory variables on the response depend on whether a certain threshold has been crossed. When regime-switching dynamics are present, new estimation problems arise related to estimating the value of the threshold. Conventional methods use an iterative search procedure that minimizes the sum-of-squares criterion. However, when unnecessary variables are included in the model, or certain variables drop out of the model depending on the regime, this method may have high variability. This paper proposes Lasso-type methods as an alternative to ordinary least squares. By incorporating an L1 penalty term, Lasso methods perform variable selection, potentially reducing some of the variance in estimating the threshold parameter. The paper discusses the results of a study in which two different underlying model structures were simulated: a regression model with correlated predictors, and a self-exciting threshold autoregressive model. Finally, the proposed Lasso-type methods are compared with conventional methods in an application to urban traffic data. / Dissertation/Thesis / Masters Thesis Industrial Engineering 2015
303

Wavelet thresholding for unequally time-spaced data

Kovac, Arne January 1999 (has links)
No description available.
304

Probability Modelling of Alpine Permafrost Distribution in Tarfala Valley, Sweden

Alm, Micael January 2017 (has links)
Field data were collected in Tarfala valley over five days at the turn of March to April 2017. The collection resulted in 36 BTS measurements (Bottom Temperature of Snow cover), which were combined with data from earlier surveys to create a model of the occurrence of permafrost around Tarfala. To identify meaningful parameters on which permafrost depends, independent variables were tested against BTS in a stepwise regression. The independent variables elevation, aspect, solar radiation, slope angle, and curvature were produced for each investigated BTS point in a geographic information system. The stepwise regression selected elevation as the only significant variable, and elevation was then applied in a logistic regression to model permafrost occurrence. The final model showed that the probability of permafrost increases with elevation. To distinguish between continuous, discontinuous, and sporadic permafrost, the model was divided into three zones with different probability intervals. The continuous permafrost zone is the highest located and therefore has the highest likelihood of permafrost; it borders the discontinuous permafrost at 1,523 m a.s.l. The discontinuous permafrost has probabilities between 50% and 80%, and its lower limit at 1,108 m a.s.l. separates the discontinuous zone from the sporadic permafrost.
305

Can Leaf Spectroscopy Predict Leaf and Forest Traits Along a Peruvian Tropical Forest Elevation Gradient?

Doughty, Christopher E., Santos-Andrade, P. E., Goldsmith, G. R., Blonder, B., Shenkin, A., Bentley, L. P., Chavana-Bryant, C., Huaraca-Huasco, W., Díaz, S., Salinas, N., Enquist, B. J., Martin, R., Asner, G. P., Malhi, Y. (has links)
High-resolution spectroscopy can be used to measure leaf chemical and structural traits. Such leaf traits are often highly correlated with other traits, such as photosynthesis, through the leaf economics spectrum. We measured VNIR (visible-near infrared) leaf reflectance (400–1,075 nm) of sunlit and shaded leaves in ~150 dominant species across ten 1-ha plots along a 3,300 m elevation gradient in Peru (4,284 individual leaves in total). We used partial least squares (PLS) regression to relate leaf reflectance to chemical traits, such as nitrogen and phosphorus; structural traits, including leaf mass per area (LMA), branch wood density, and leaf venation; and higher-level traits such as leaf photosynthetic capacity, leaf water repellency, and woody growth rates. Empirical models using leaf reflectance predicted leaf N and LMA (r² > 30% and %RMSE < 30%), weakly predicted leaf venation, photosynthesis, and branch density (r² between 10% and 35% and %RMSE between 10% and 65%), and did not predict leaf water repellency or woody growth rates (r² < 5%). The ability to predict higher-level traits such as photosynthesis and branch density is likely due to their correlations with LMA, a trait readily predicted with leaf spectroscopy.
306

Model checking for regressions when variables are measured with errors

Xie, Chuanlong 28 August 2017 (has links)
In this thesis, we investigate model checking problems for parametric single-index regression models when the variables are measured with different types of errors. The large-sample behaviour of the test statistics can be used to develop properly centered and scaled model checking procedures. In addition, a dimension-reduction model-adaptive strategy is employed, with the special requirements of models with measurement errors, to improve the proposed testing procedures. This makes the test statistics converge to their weak limits under the null hypothesis at convergence rates that do not depend on the dimension of the predictor vector. Furthermore, the proposed tests behave like a classical local smoothing test with a one-dimensional predictor, so the proposed methods have potential for alleviating the difficulties associated with high dimensionality in hypothesis testing.

Chapter 2 provides tests for a parametric single-index regression model when predictors are measured with errors in an additive manner and a validation dataset is available. The two proposed tests have consistency rates that do not depend on the dimension of the predictor vector. One test has a bias term that may become arbitrarily large with increasing sample size but has smaller asymptotic variance; the other is asymptotically unbiased with larger asymptotic variance. Both are still omnibus against general alternatives. In addition, a systematic study gives insight into the effect of the ratio between the size of the primary data and the size of the validation data on the asymptotic behaviour of these tests. Simulation studies examine the finite-sample performance of the proposed tests, and the tests are applied to a real dataset about breast cancer with validation data obtained from a nutrition study.

Chapter 3 introduces a minimum projected-distance test for a parametric single-index regression model when predictors are measured with Berkson-type errors. The distribution of the measurement error is assumed to be known up to several parameters. The test is constructed by combining the minimum distance test with a dimension-reduction model-adaptive strategy. After proper centering, the minimum projected-distance test statistic is asymptotically normal at a convergence rate of order nh^{1/2} and can detect a sequence of local alternatives distinct from the null model at a rate of order n^{-1/2} h^{-1/4}, where n is the sample size and h is a sequence of bandwidths tending to 0 as n tends to infinity. These rates do not depend on the dimensionality of the predictor vector, which implies that the proposed test has potential for alleviating the curse of dimensionality in hypothesis testing in this field. Further, as the test is asymptotically biased, two bias-correction methods are suggested to construct asymptotically unbiased tests. We also discuss some details of the implementation of the proposed tests and provide a simplified procedure. Simulations indicate desirable finite-sample performance, and the proposed model checking procedures are illustrated on two real datasets concerning the effects of air pollution on emphysema.

Chapter 4 provides a nonparametric test for checking a parametric single-index regression model when the predictor vector and the response are measured with distortion errors. We estimate the true values of the response and predictors, and then plug the estimated values into a test statistic to develop a model checking procedure. The dimension-reduction model-adaptive strategy is again employed to improve its theoretical properties and finite-sample performance. Another interesting observation in this work is that, with properly selected bandwidths and kernel functions in a limited range, the proposed test statistic has the same limiting distribution as under the classical regression setup without distortion measurement errors. Simulation studies are conducted.
307

On ridge regression and least absolute shrinkage and selection operator

AlNasser, Hassan 30 August 2017 (has links)
This thesis focuses on ridge regression (RR) and the least absolute shrinkage and selection operator (lasso). Ridge properties are investigated in detail, including the bias, the variance, and the mean squared error as functions of the tuning parameter. We also study the convexity of the trace of the mean squared error in terms of the tuning parameter, and we examine some special properties of RR for factorial experiments. We review lasso properties alongside ridge properties because they are somewhat similar; rather than shrinking the estimates toward zero as RR does, the lasso is able to provide a sparse solution, setting many coefficient estimates exactly to zero. Furthermore, we try a new approach to solving the lasso problem by formulating it as a bilevel problem and implementing a new algorithm to solve this bilevel program. / Graduate
308

Valuation theory and real property assessment

Rollo, Gordon Paul January 1971 (has links)
The real property tax has a major impact on real property owners in all Canadian municipalities. As with all systems of taxation, it is important that the burden of this tax be distributed fairly and equitably. Legislators have attempted to ensure equitable treatment among real property owners by requiring that the basis of assessment be 'actual value'. However, due to the large number of properties to be valued, assessors have not been able to use the market approach to value, a valuation technique known to produce 'actual values'. Rather, they have resorted to the more subjective cost approach to value. While the mechanics of the cost approach lend themselves to the mass valuation problem, they rarely produce values that can be equated with actual market values. The application of multiple regression analysis is presented as a solution to this valuation problem. Multiple regression analysis enables the assessor to produce objectively the 'actual value' of all single-family homes in a municipality. After presenting multiple regression analysis as a modern application of the market approach to value, the applicability of this valuation technique is tested on actual sales data. A sample of approximately four hundred recently sold single-family homes is subjected to valuation by multiple regression analysis. Various experiments, including means of stratifying the data, are presented in an attempt to improve the quality of the solution. While the statistical results of the experiments are not of sufficient calibre for practical assessment purposes, they do reveal how continued experimentation can improve the applicability of this valuation technique to mass appraisal. Multiple regression analysis is the assessor's tool of the future: it facilitates a valuation technique that will permit the assessor to meet his statutory obligation while still adhering to sound appraisal methodology. / Business, Sauder School of / Graduate
309

The variable selection problem and the application of the ROC curve for binary outcome variables

Matshego, James Moeng 11 August 2008 (has links)
Variable selection refers to the problem of selecting the input variables that are most predictive of a given outcome. Variable selection problems arise in all machine learning tasks, supervised or unsupervised, classification, regression, or time series prediction, two-class or multi-class, and pose various levels of challenge. Variable selection problems are related to the problems of input dimensionality reduction and of parameter planning, and have practical and theoretical challenges of their own. From the practical point of view, eliminating variables may reduce the cost of producing the outcome and increase its speed, whereas dimensionality reduction alone does not address these problems. Theoretical challenges include estimating with what confidence one can state that a variable is relevant to the concept when it is useful to the outcome, and providing a theoretical understanding of the stability of selected variable subsets. The mathematical statement of the problem is not widely agreed upon and may depend on the application. One typically distinguishes: i) the problem of discovering all the variables relevant to the outcome variable and determining how relevant they are and how they are related to each other; and ii) the problem of finding a minimum subset of variables that is useful for the outcome variable. Logistic regression is an increasingly popular statistical technique used to model the probability of a discrete binary outcome. It applies maximum likelihood estimation after transforming the outcome variable into a logit variable, and in this way estimates the probability of a certain event. When properly applied, logistic regression analyses yield powerful insight into which variables are more or less likely to predict the event outcome in a population of interest. These models also show the extent to which changes in the values of a variable may increase or decrease the predicted probability of the event outcome. Variable selection, in all its facets, is similarly important with logistic regression. The receiver operating characteristic (ROC) curve is a graphic display that gives a measure of the predictive accuracy of a logistic regression model; the area under the ROC curve (AUC) is a scalar measure gauging one facet of classification performance. Another measure of the predictive accuracy of a logistic regression model is the classification table, which uses the model to classify observations as events if their estimated probability is greater than or equal to a given probability cut-point, and as non-events otherwise. As the probability cut-point increases in value, it becomes more likely that an observation is classified as a non-event. This technique, as it appears in the literature, is also studied in this thesis. In this thesis the issue of variable selection, for both continuous and binary outcome variables, is investigated as it appears in the statistical literature. It is clear that this topic has been widely researched and remains a feature of modern research; the last word certainly has not been spoken. / Dissertation (MSc)--University of Pretoria, 2008. / Statistics / unrestricted
310

Using ATS to Turn Time Series Estimation and Model Diagnostics into Fast Regression Estimation and Model Diagnostics

Jeremy M. Troisi 15 May 2019
The Average Transform Smooth (ATS) statistical methods [McRae, Mallows, and Cleveland] are applied to measurements of a non-Gaussian random variable to make them close to Gaussian. This Gaussianization makes use of the well-known concept of a variance-stabilizing transformation, but takes it further by first averaging blocks of r measurements, transforming next, and then smoothing. The smoothing can be nonparametric, or can be the fitting of a parametric model. The Gaussianization makes analysis simpler and more effective.

In this work ATS is applied to the periodogram of a stationary parametric time series, making use of the periodogram's large-sample properties given the true power spectrum [Brillinger], to develop a new approach to parametric time series model estimation and model diagnostics. The ATS results and the theory are reformulated as a regression model, PPS-REG, involving the true power spectrum and the periodogram. PPS-REG has attractive properties: iid Gaussian error terms with mean 0 and a known variance; accurate estimation; much faster estimation than classical maximum likelihood when the time series is large; it enables the use of the very powerful classical regression model diagnostics; and it bases the diagnostics on the power spectrum, adding substantially to the standard use of the autocovariance function for diagnosing the fits of models specified in the time domain.
