Global ETD Search

21	Model selection Hildebrand, Annelize 11 1900 In developing an understanding of real-world problems, researchers develop mathematical and statistical models. Various model selection methods exist that can be used to obtain the mathematical model that best describes the real-world situation in some sense. These methods assess the merits of competing models by concentrating on a particular criterion. Each selection method is associated with its own criterion and is named accordingly. The better-known ones include Akaike's Information Criterion, Mallows' Cp and cross-validation. The value of the criterion is calculated for each model, and the model with the minimum value of the criterion is then selected as the "best" model. / Mathematical Sciences / M.Sc. (Statistics) Model selection Discrepancy measures Criteria Akaike Information Criterion Mallows' Cp Cross-validation R-square Adjusted R-square Mean Square Error 519.5 Mallows' Cp Akaike Information Criterion Mathematical models -- Evaluation
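The selection rule described in this abstract (compute the criterion for every candidate and keep the minimum) can be sketched as follows. This is a minimal illustration, not the thesis's own procedure: it assumes least-squares fits with Gaussian errors, for which AIC reduces to n ln(RSS/n) + 2k up to an additive constant, and the candidate names, residual sums of squares and parameter counts are hypothetical values invented for the example.

```python
import math


def aic_gaussian(rss, n, k):
    """AIC of a least-squares fit with Gaussian errors, up to an additive constant.

    rss: residual sum of squares, n: number of observations,
    k: number of fitted parameters (counted the same way for every candidate).
    """
    return n * math.log(rss / n) + 2 * k


def select_model(candidates, n):
    """Score every candidate and return (best_name, scores); smaller AIC is better."""
    scores = {name: aic_gaussian(rss, n, k) for name, (rss, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical (RSS, parameter count) pairs for n = 50 observations.
candidates = {"linear": (120.0, 2), "quadratic": (80.0, 3), "cubic": (79.5, 4)}
best, scores = select_model(candidates, n=50)
print(best, {name: round(value, 2) for name, value in scores.items()})
```

In this made-up example the cubic model lowers the residual sum of squares only slightly, so its extra parameter is not worth the penalty and the quadratic model has the smallest criterion value.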
22	Logistic regression to determine significant factors associated with share price change Muchabaiwa, Honest 19 February 2014 This thesis investigates the factors that are associated with annual changes in the share price of companies listed on the Johannesburg Stock Exchange (JSE). In this study, a share increases in value when the company's share price is higher at the end of the financial year than it was the previous year. Secondary data sourced from the McGregor BFA website, covering 2004 to 2011, were used. Deciding which share to buy is the biggest challenge faced by both investment companies and individuals when investing on the stock exchange. This thesis uses binary logistic regression to identify the variables that are associated with a share price increase. The dependent variable was annual change in share price (ACSP) and the independent variables were assets per capital employed ratio, debt per assets ratio, debt per equity ratio, dividend yield, earnings per share, earnings yield, operating profit margin, price earnings ratio, return on assets, return on equity and return on capital employed. Different variable selection methods were used, and the backward elimination method produced the best model. The probability of success of a share is higher if the shareholders are anticipating a higher return on capital employed and high earnings per share. It was, however, noted that the share price is negatively impacted by dividend yield and earnings yield. Since the odds of an increase in share price are higher when return on capital employed and earnings per share are higher, investors and investment companies are encouraged to choose companies with high earnings per share and the best returns on capital employed. The final model had a classification rate of 68.3% and the validation sample produced a classification rate of 65.2%. / Mathematical Sciences / M.Sc. (Statistics) Logistic regression Binary logistic regression Share price Stock exchange Akaike’s Information Criterion Wald Test Score test Enter method Stepwise logistic regression 519.536 Stock exchange Logistic regression analysis Market share Akaike Information Criterion
23	Multiple Outlier Detection: Hypothesis Tests versus Model Selection by Information Criteria Lehmann, Rüdiger, Lösler, Michael 14 June 2017 The detection of multiple outliers can be interpreted as a model selection problem. The models that can be selected are the null model, which indicates an outlier-free set of observations, or a class of alternative models, which contain a set of additional bias parameters. A common way to select the right model is to use a statistical hypothesis test. In geodesy, data snooping is the most popular. Another approach arises from information theory. Here, the Akaike information criterion (AIC) is used to select an appropriate model for a given set of observations. The AIC is based on the Kullback-Leibler divergence, which describes the discrepancy between the model candidates. Both approaches are discussed and applied to test problems: the fitting of a straight line and a geodetic network. Some relationships between data snooping and information criteria are discussed. When compared, the information criteria approach turns out to be simpler and more elegant. Along with AIC there are many alternative information criteria for selecting different outliers, and it is not clear which one is optimal. Least squares adjustment Outlier detection Hypothesis test Information criterion Akaike information criterion (AIC) Data snooping Model selection ddc:526 rvk:ZI 9010
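The idea of treating outlier detection as model selection can be sketched for the straight-line test problem. This is a simplified illustration under stated assumptions, not the authors' implementation: it considers at most one outlier, uses the Gaussian least-squares form of AIC, n ln(RSS/n) + 2k, and relies on the fact that adding a bias parameter for observation i gives the same fit as leaving that observation out. The function names and the data are hypothetical.

```python
import math


def line_rss(xs, ys, skip=None):
    """Residual sum of squares of a least-squares line, leaving out index `skip`."""
    pts = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i != skip]
    m = len(pts)
    mean_x = sum(x for x, _ in pts) / m
    mean_y = sum(y for _, y in pts) / m
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in pts)


def aic(rss, n, k):
    """Gaussian least-squares AIC up to an additive constant."""
    return n * math.log(rss / n) + 2 * k


def detect_single_outlier(xs, ys):
    """Compare the null model (key None) with one alternative per observation.

    The null model has 2 parameters (intercept, slope); each alternative adds
    one bias parameter for observation i, so it has 3.
    """
    n = len(xs)
    scores = {None: aic(line_rss(xs, ys), n, 2)}
    for i in range(n):
        scores[i] = aic(line_rss(xs, ys, skip=i), n, 3)
    return min(scores, key=scores.get), scores


# Hypothetical data: y is close to 2x + 1, with a gross error at index 5.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 16.0, 13.1, 14.9]
flagged, scores = detect_single_outlier(xs, ys)
print(flagged)
```

With so few observations the penalty of 2 per extra parameter is small, so this sketch will often flag a point even in clean data; the choice of criterion matters, as the abstract itself notes.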
24	Automated construction of generalized additive neural networks for predictive data mining / Jan Valentine du Toit Du Toit, Jan Valentine January 2006 In this thesis Generalized Additive Neural Networks (GANNs) are studied in the context of predictive Data Mining. A GANN is a novel neural network implementation of a Generalized Additive Model. Originally GANNs were constructed interactively by considering partial residual plots. This methodology involves subjective human judgment, is time-consuming, and can produce suboptimal results. The newly developed automated construction algorithm solves these difficulties by performing model selection based on an objective model selection criterion. Partial residual plots are used only after the best model is found, to gain insight into the relationships between inputs and the target. Models are organized in a search tree with a greedy search procedure that identifies good models in a relatively short time. The automated construction algorithm, implemented in the powerful SAS® language, is nontrivial, effective, and comparable to other model selection methodologies found in the literature. This implementation, which is called AutoGANN, has a simple, intuitive, and user-friendly interface. The AutoGANN system is further extended with an approximation to Bayesian Model Averaging. This technique accounts for uncertainty about the variables that must be included in the model and uncertainty about the model structure. Model averaging utilizes in-sample model selection criteria and creates a combined model with better predictive ability than any single model. In the field of Credit Scoring, the standard theory of scorecard building is not tampered with, but a pre-processing step is introduced to arrive at a more accurate scorecard that discriminates better between good and bad applicants. The pre-processing step exploits GANN models to achieve significant reductions in marginal and cumulative bad rates. The time it takes to develop a scorecard may be reduced by utilizing the automated construction algorithm. / Thesis (Ph.D. (Computer Science))--North-West University, Potchefstroom Campus, 2006. Akaike Information Criterion AIC Automated construction algorithm Bayesian Model Averaging Credit scoring Data mining Generalized Additive Neural Network GANN Generalized Additive Model GAM Interactive construction algorithm Model averaging Neural network Partial residua Predictive modeling Schwarz information criterion SBC
28	Multiple Outlier Detection: Hypothesis Tests versus Model Selection by Information Criteria Lehmann, Rüdiger, Lösler, Michael January 2016 The detection of multiple outliers can be interpreted as a model selection problem. The models that can be selected are the null model, which indicates an outlier-free set of observations, or a class of alternative models, which contain a set of additional bias parameters. A common way to select the right model is to use a statistical hypothesis test. In geodesy, data snooping is the most popular. Another approach arises from information theory. Here, the Akaike information criterion (AIC) is used to select an appropriate model for a given set of observations. The AIC is based on the Kullback-Leibler divergence, which describes the discrepancy between the model candidates. Both approaches are discussed and applied to test problems: the fitting of a straight line and a geodetic network. Some relationships between data snooping and information criteria are discussed. When compared, the information criteria approach turns out to be simpler and more elegant. Along with AIC there are many alternative information criteria for selecting different outliers, and it is not clear which one is optimal. info:eu-repo/classification/ddc/526 ddc:526
29	Seleção de modelos multiníveis para dados de avaliação educacional / Selection of multilevel models for educational evaluation data Coelho, Fabiano Rodrigues 11 August 2017 When a dataset has a hierarchical structure, one possible approach is multilevel regression modelling, which is justified when a significant share of the variability in the data can be explained at the macro levels. In this work, we develop the selection of multilevel regression models applied to educational data. The analysis is divided into two parts: variable selection and model selection. The latter is subdivided into two cases: classical and Bayesian modelling. Using criteria such as Lasso, AIC, BIC and WAIC, among others, we seek to identify the factors that influence the mathematics performance of ninth-grade elementary school students in the State of São Paulo. We also investigate how each of the variable selection and model selection criteria behaves. We conclude that, under the frequentist approach, BIC is the most efficient model selection criterion, whereas under the Bayesian approach, WAIC gave better results. Using the Lasso variable selection criterion under the classical approach reduced the number of predictors in the model by 34%. Finally, we identified that the mathematics performance of ninth-grade elementary school students in the State of São Paulo is influenced by the following covariates: mother's educational level, frequency of book reading, time spent on recreation on school days, liking mathematics, the school's overall performance in mathematics, the student's performance in Portuguese, the school's administrative dependence, gender, father's educational level, grade repetitions and age-grade distortion. Information criterion and Prova Brasil Model selection Multilevel models
30	Crystallographic Image Processing with Unambiguous 2D Bravais Lattice Identification on the Basis of a Geometric Akaike Information Criterion Bilyeu, Taylor Thomas 02 July 2013 Crystallographic image processing (CIP) is a technique first used to aid in the structure determination of periodic organic complexes imaged with a high-resolution transmission electron microscope (TEM). The technique has subsequently been utilized for TEM images of inorganic crystals, scanning TEM images, and even scanning probe microscope (SPM) images of two-dimensional periodic arrays. We have written software specialized for use on such SPM images. A key step in the CIP process requires that an experimental image be classified as one of only 17 possible mathematical plane symmetry groups. The current methods used for making this symmetry determination are not entirely objective, and there is no generally accepted method for measuring or quantifying deviations from ideal symmetry. Here, we discuss the crystallographic symmetries present in real images and the general techniques of CIP, with emphasis on the current methods for symmetry determination in an experimental 2D periodic image. The geometric Akaike information criterion (AIC) is introduced as a viable statistical criterion for both quantifying deviations from ideal symmetry and determining which 2D Bravais lattice best fits the experimental data from an image being processed with CIP. By objectively determining the statistically favored 2D Bravais lattice, the determination of plane symmetry in the CIP procedure can be greatly improved. As examples, we examine scanning tunneling microscope images of 2D molecular arrays of the following compounds: cobalt phthalocyanine on Au (111) substrate; nominal cobalt phthalocyanine on Ag (111); tetraphenoxyphthalocyanine on highly oriented pyrolytic graphite; hexaazatriphenylene-hexacarbonitrile on Ag (111). We show that the geometric AIC procedure can unambiguously determine which 2D Bravais lattice fits the experimental data for a variety of different lattice types. In some cases, the geometric AIC procedure can be used to determine which plane symmetry group best fits the experimental data when traditional CIP methods fail to do so. Crystallography -- Data processing Scanning tunneling microscopy Transmission electron microscopy Akaike Information Criterion Atomic, Molecular and Optical Physics Other Physics
