Global ETD Search

571	Covariance analysis of multiple linear regression equations Eekman, Gordon Clifford Duncan January 1969 (has links) A covariance analysis procedure which compares multiple linear regression equations is developed by extending the general linear hypothesis model of full rank to encompass heterogeneous data. A FORTRAN IV computer program tests parallelism and coincidence amongst sets of regression equations. By a practical example both the theory and the computer program are demonstrated. / Graduate and Postdoctoral Studies / Graduate Regression analysis FORTRAN (Computer program language)
572	Additivity of component regression equations when the underlying model is linear Chiyenda, Simeon Sandaramu January 1983 (has links) This thesis is concerned with the theory of fitting models of the form y = Xβ + ε, where some distributional assumptions are made on ε. More specifically, suppose that y_j = Zβ_j + ε_j is a model for a component j (j = 1, 2, ..., k) and that one is interested in estimation and inference theory relating to y_T = Σ_{j=1}^{k} y_j = Xβ_T + ε_T. The theory of estimation and inference relating to the fitting of y_T is considered within the framework of general linear model theory. The consequence of independence and dependence of the y_j (j = 1, 2, ..., k) for estimation and inference is investigated. It is shown that under the assumption of independence of the y_j, the parameter vector of the total equation can easily be obtained by adding corresponding components of the estimates for the parameters of the component models. Under dependence, however, this additivity property seems to break down. Inference theory under dependence is much less tractable than under independence and depends critically, of course, upon whether y_T is normal or not. Finally, the theory of additivity is extended to classificatory models encountered in designed experiments. It is shown, however, that additivity does not hold in general in nonlinear models. Where additivity holds, it requires no new computing subroutines for estimation and inference. / Forestry, Faculty of / Graduate Linear models (Statistics) Regression analysis Estimation theory
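The additivity property described in entry 572 can be illustrated with a small sketch. It is not taken from the thesis: it assumes every component is fitted by ordinary least squares on the same single predictor x (that is, the same design matrix for all components), and the helper name ols_line and all data values are made up for illustration.

```python
def ols_line(x, y):
    """Least-squares intercept and slope for the line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: one shared predictor and k = 3 component responses.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
components = [
    [2.1, 3.9, 6.2, 7.8, 10.1],
    [0.5, 0.7, 1.6, 1.9, 2.4],
    [1.0, 0.8, 1.3, 1.1, 1.6],
]

# Fit each component separately, then add the estimates coefficient by coefficient.
component_fits = [ols_line(x, y) for y in components]
summed_fit = tuple(sum(coefs) for coefs in zip(*component_fits))

# Fit the total response y_T = y_1 + y_2 + y_3 directly.
total = [sum(values) for values in zip(*components)]
total_fit = ols_line(x, total)

print("sum of component estimates:", summed_fit)
print("estimate from total response:", total_fit)
```

The two printed pairs agree up to floating-point rounding because the least-squares estimate is a linear function of the response when the design matrix is shared. The sketch says nothing about the inference results under dependence that the abstract discusses.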
573	The accuracy of parameter estimates and coverage probability of population values in regression models upon different treatments of systematically missing data Othuon, Lucas Onyango A. 11 1900 (has links) Several methods are available for the treatment of missing data. Most of the methods are based on the assumption that data are missing completely at random (MCAR). However, data sets that are MCAR are rare in psycho-educational research. This gives rise to the need for investigating the performance of missing data treatments (MDTs) with non-randomly or systematically missing data, an area that has not received much attention by researchers in the past. In the current simulation study, the performance of four MDTs, namely, mean substitution (MS), pairwise deletion (PW), expectation-maximization method (EM), and regression imputation (RS), was investigated in a linear multiple regression context. Four investigations were conducted involving four predictors under low and high multiple R² , and nine predictors under low and high multiple R² . In addition, each investigation was conducted under three different sample size conditions (94, 153, and 265). The design factors were missing pattern (2 levels), percent missing (3 levels) and non-normality (4 levels). This design gave rise to 72 treatment conditions. The sampling was replicated one thousand times in each condition. MDTs were evaluated based on accuracy of parameter estimates. In addition, the bias in parameter estimates, and coverage probability of regression coefficients, were computed. The effect of missing pattern, percent missing, and non-normality on absolute error for R² estimate was of practical significance. In the estimation of R², EM was the most accurate under the low R² condition, and PW was the most accurate under the high R² condition. No MDT was consistently least biased under low R² condition. 
However, with nine predictors under the high R² condition, PW was generally the least biased, with a tendency to overestimate population R². The mean absolute error (MAE) tended to increase with increasing non-normality and increasing percent missing. Also, the MAE in R² estimate tended to be smaller under monotonic pattern than under non-monotonic pattern. MDTs were most differentiated at the highest level of percent missing (20%), and under non-monotonic missing pattern. In the estimation of regression coefficients, RS generally outperformed the other MDTs with respect to accuracy of regression coefficients as measured by MAE. However, EM was competitive under the four predictors, low R² condition. MDTs were most differentiated only in the estimation of β₁, the coefficient of the variable with no missing values. MDTs were undifferentiated in their performance in the estimation of β₂, ..., βp, p = 4 or 9, although the MAE remained roughly the same across all the regression coefficients. The MAE increased with increasing non-normality and percent missing, but decreased with increasing sample size. The MAE was generally greater under non-monotonic pattern than under monotonic pattern. With four predictors, the least bias was under RS regardless of the magnitude of population R². Under nine predictors, the least bias was under PW regardless of population R². The results for coverage probabilities were generally similar to those under estimation of regression coefficients, with coverage probabilities closest to nominal alpha under RS. As expected, coverage probabilities decreased with increasing non-normality for each MDT, with values being closest to nominal value for normal data. MDTs were more differentiated with respect to coverage probabilities under non-monotonic pattern than under monotonic pattern. Important implications of the results to researchers are numerous. 
First, the choice of MDT was found to depend on the magnitude of population R², number of predictors, as well as on the parameter estimate of interest. With the estimation of R² as the goal of analysis, use of EM is recommended if the anticipated R² is low (about .2). However, if the anticipated R² is high (about .6), use of PW is recommended. With the estimation of regression coefficients as the goal of analysis, the choice of MDT was found to be most crucial for the variable with no missing data. The RS method is most recommended with respect to estimation accuracy of regression coefficients, although greater bias was recorded under RS than under PW or MS when the number of predictors was large (i.e., nine predictors). Second, the choice of MDT seems to be of little concern if the proportion of missing data is 10 percent, and also if the missing pattern is monotonic rather than non-monotonic. Third, the proportion of missing data seems to have less impact on the accuracy of parameter estimates under monotonic missing pattern than under non-monotonic missing pattern. Fourth, it is recommended for researchers that in the control of Type I error rates under low R² condition, the EM method should be used as it produced coverage probability of regression coefficients closest to nominal value at .05 level. However, in the control of Type I error rates under high R² condition, the RS method is recommended. Considering that simulated data were used in the present study, it is suggested that future research should attempt to validate the findings of the present study using real field data. Also, a future investigator could modify the number of predictors as well as the confidence interval in the calculation of coverage probabilities to extend generalization of results. / Education, Faculty of / Educational and Counselling Psychology, and Special Education (ECPS), Department of / Graduate Regression analysis Statistics Error analysis (Mathematics)
574	General Satisfaction of Students in 100% Online Courses in the Department of Learning Technologies at the University of North Texas Ahn, Byungmun 05 1900 (has links) The purpose of this study was to examine whether there are significant relationships between the general satisfaction of students and learner-content interaction, learner-instructor interaction, learner-learner interaction, and learner-technology interaction in 100% online courses. There were 310 responses from the students. This study did not use data from duplicate students and instructors. Excel was used to find duplicate students and instructors, and 128 such responses were deleted. After examination of box plots, an additional four cases were removed because they were outliers on seven or more variables. Nineteen responses were deleted because they did not answer all questions of interest, resulting in a total sample of 159 students. Multiple regression analysis was used to examine the relationship between the four independent variables and the dependent variable. In addition to tests for statistical significance, practical significance was evaluated with the multiple R², which reported the common variance between independent variables and dependent variable. The two variables of learner-content and learner-instructor interaction play a significant role in predicting online satisfaction. Minimally, the variable learner-technology can predict online satisfaction and is an important construct that must be considered when offering online courses. Results of this study provide help in establishing a valid and reliable survey instrument and in developing the best online learning environment, as well as recommendations for institutions offering online learning or considering the development of online learning courses. Online learning satisfaction of online courses multiple regression
575	Penalized methods in genome-wide association studies Liu, Jin 01 July 2011 (has links) Penalized regression methods are becoming increasingly popular in genome-wide association studies (GWAS) for identifying genetic markers associated with disease. However, standard penalized methods such as the LASSO do not take into account the possible linkage disequilibrium between adjacent markers. We propose a novel penalized approach for GWAS using a dense set of single nucleotide polymorphisms (SNPs). The proposed method uses the minimax concave penalty (MCP) for marker selection and incorporates linkage disequilibrium (LD) information by penalizing the difference of the genetic effects at adjacent SNPs with high correlation. A coordinate descent algorithm is derived to implement the proposed method. This algorithm is efficient and stable in dealing with a large number of SNPs. A multi-split method is used to calculate the p-values of the selected SNPs for assessing their significance. We refer to the proposed penalty function as the smoothed MCP (SMCP) and the proposed approach as the SMCP method. Performance of the proposed SMCP method and its comparison with a LASSO approach are evaluated through simulation studies, which demonstrate that the proposed method is more accurate in selecting associated SNPs. Its applicability to real data is illustrated using data from a GWAS on rheumatoid arthritis. Based on the idea of SMCP, we propose a new penalized method for group variable selection in GWAS with respect to the correlation between adjacent groups. The proposed method uses the group LASSO for encouraging group sparsity and a quadratic difference for adjacent group smoothing. We call it smoothed group LASSO, or SGL for short. Canonical correlations between two adjacent groups of SNPs are used as the weights in the quadratic difference penalty. Principal components are used to reduce dimensionality locally within groups. 
We derive a group coordinate descent algorithm for computing the solution path of the SGL. Simulation studies are used to evaluate the finite sample performance of the SGL and group LASSO. We also demonstrate its applicability on rheumatoid arthritis data. GWAS Linkage disequilibrium Penalized regression Statistics and Probability
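For readers unfamiliar with the minimax concave penalty named in entry 575, the following sketch shows the standard MCP and the univariate update that a coordinate descent algorithm applies to one standardized predictor at a time. It is a generic illustration, not the thesis's SMCP algorithm: the linkage-disequilibrium smoothing term is omitted, gamma is assumed to be greater than 1, and the function names are chosen here for illustration.

```python
def mcp_penalty(t, lam, gamma):
    """Minimax concave penalty evaluated at coefficient t (lam >= 0, gamma > 1)."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    # Beyond gamma * lam the penalty is flat, so large effects are not shrunk.
    return 0.5 * gamma * lam * lam


def soft_threshold(z, lam):
    """LASSO soft-thresholding operator."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def mcp_update(z, lam, gamma):
    """Minimiser over b of 0.5 * (z - b)**2 + mcp_penalty(b, lam, gamma).

    z is the univariate least-squares solution for one standardized predictor.
    """
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z


for z in (0.5, 2.0, 5.0):
    print(z, soft_threshold(z, 1.0), mcp_update(z, 1.0, 3.0))
```

Compared with soft-thresholding, the MCP update sets small effects to zero in the same way but leaves effects larger than gamma * lam unshrunk.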
576	Evaluation of Asset Pricing Models in the South African Equities Market Moyo, Nigel A P 16 February 2021 (has links) Asset pricing models have been of interest since their origin in modern finance. The Capital Asset Pricing Model is a widely used tool and is one of the early developed asset pricing models in modern finance. There are continual improvements of this model with the evident multifactor models of Fama and French (2015), Carhart (1997) and the South African two-factor arbitrage pricing models of Van Rensburg (2002) and Laird-Smith et al. (2016). This research empirically investigates the performance of eight different multi-factor asset pricing models in describing average portfolio returns on the South African Johannesburg Stock Exchange (JSE). We find that the Carhart (1997) four-factor model, comprising the market factor, size factor, value factor and the momentum factor, is the most parsimonious model and thus better explains the average portfolio returns on the South African JSE. This model is an improvement of the Fama and French (1992) three-factor model. Additionally, we investigate the performance of the two-factor Arbitrage Pricing Theory (APT) model of Laird-Smith et al. (2016) and Van Rensburg (2002) that consists of the South African Financial Index (SAFI) and the South African Resources Index (SARI). We observe that the model performs better than the traditional CAPM that is widely used in industry. Adding the SAFI and the SARI to the six-factor model results in an eight-factor model that has a significant improvement in explaining average returns. The results indicate that the market factor, the South African Financial Index (SAFI) and the South African Resources Index (SARI) poorly explain each other but their linear combination improves the eight-factor asset pricing model in explaining average portfolio returns in the South African market. 
The eight-factor model comprises the market, size, value, investment, profitability and momentum factors and the two South African indices, namely the South African Financials Index (SAFI) and the South African Resources Index (SARI). CAPM JSE APT SAFI SARI Regression Indices
577	Modern variable selection techniques in the generalised linear model with application in Biostatistics Millard, Salomi 10 1900 (has links) In a Biostatistics environment, the datasets to be analysed are frequently high-dimensional and multicollinearity is expected due to the nature of the features. However, many traditional approaches to statistical analysis and feature selection cease to be useful in the presence of high-dimensionality and multicollinearity. Penalised regression methods have proved to be practical and attractive for dealing with these problems. In this dissertation, we propose a new penalised approach, the modified elastic-net (MEnet), for statistical analysis and feature selection using a combination of the ridge and bridge penalties. This method is designed to deal with high-dimensional problems with highly correlated predictor variables. Furthermore, it has a closed-form solution, unlike the most frequently used penalised techniques, which makes it simple to implement on high-dimensional data. We show how this approach can be used to analyse high-dimensional data with binary responses, e.g., microarray data, and simultaneously select significant features. An extensive simulation study and analysis of a colon cancer dataset demonstrate the properties and practical aspects of the proposed method. / Mini Dissertation (MSc (Advanced Data Analytics))--University of Pretoria, 2020. / DSI-CSIR Interbursary Support (IBS) Programme / Statistics Industry HUB, Department of Statistics, University of Pretoria / Statistics / MSc / Restricted Mathematical statistics Penalised regression Feature selection UCTD
578	Finanční analýza společnosti s využitím systému Maple / Financial Analysis of the Company Using the Maple System Šuľan, Matej January 2019 (has links) The diploma thesis deals with the financial analysis of a selected company. Using financial ratio indicators, time series and regression analysis in the Maple system, the company's past and current financial situation is evaluated and its potential future development is predicted.
579	Letter to the editor: “A population-based study of cervical cytology findings and human papillomavirus infection in a suburban area of Thailand” Vásquez-Medina, Mirtha Jimena, Villegas-Otiniano, Paola Jimena, Benítes-Zapata, Vicente A. 02 1900 (has links) Letter to the editor / Peer reviewed Oral contraceptive agent Cancer regression Disease classification
580	Comparing logistic regression methods for completely separated and quasi-separated data Botes, Michelle January 2013 (has links) An occurrence which is sometimes observed in a model based on dichotomous dependent variables is separation in the data. Separation occurs when one or more of the independent variables can perfectly predict some binary outcome, and it primarily occurs in small samples. There are three mutually exclusive and exhaustive classes into which the data from a logistic regression can be classified: complete separation, quasi-complete separation and overlap. Separation (either complete or quasi-complete) in the data gives rise to a number of problems since it implies infinite or zero maximum likelihood estimates, which are idealistic and do not occur in practice. In Part I of this dissertation, the theory behind a logistic regression model, the definition of separation and different methods to deal with separation are discussed. The methods focused on are exact logistic regression, Firth's method, which penalises the likelihood function, and hidden logistic regression. In Part II, the three aforementioned methods are compared to one another by applying each method to data sets which exhibit either complete or quasi-complete separation for different sample sizes and different covariate types. / Dissertation (MSc)--University of Pretoria, 2013. / Statistics / Unrestricted Regression methods UCTD C14/4/166/gm
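The three classes named in entry 580 can be made concrete for the simplest case. The sketch below is an illustration written for this listing, not code from the dissertation: it assumes a logistic model with an intercept and a single non-constant numeric covariate, where the classes reduce to comparing the ranges of the covariate in the two outcome groups. The function name classify_separation is hypothetical.

```python
def classify_separation(x, y):
    """Classify (x, y) data for a one-covariate logistic model.

    x: numeric covariate values (assumed not all equal); y: 0/1 outcomes.
    Returns "complete", "quasi-complete" or "overlap".
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome classes must be present")

    # Complete separation: a threshold puts every 0 strictly on one side
    # and every 1 strictly on the other (in either direction).
    if max(x0) < min(x1) or max(x1) < min(x0):
        return "complete"

    # Quasi-complete separation: the groups touch only at a single boundary value.
    if max(x0) == min(x1) or max(x1) == min(x0):
        return "quasi-complete"

    return "overlap"


print(classify_separation([1, 2, 3, 4], [0, 0, 1, 1]))
print(classify_separation([1, 2, 2, 3], [0, 0, 1, 1]))
print(classify_separation([1, 3, 2, 4], [0, 0, 1, 1]))
```

With several covariates the same question becomes whether a separating hyperplane exists, which this one-dimensional check does not cover.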
