  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Ensemble Learning Method on Machine Maintenance Data

Zhao, Xiaochuang 05 November 2015
Many companies in industry face an explosion of big data. With so much information stored, they want to make sense of the data and use it to support better decision making, especially the prediction of future outcomes. Big data can save a great deal of money and generate substantial revenue. When building statistical learning models for prediction, companies aim for models that are both efficient and highly accurate. After a learning model has been deployed to production, new data keep arriving, and the model has to be updated with them. As a result, the model that performs best today will not necessarily perform as well tomorrow, which makes it very hard to decide which algorithm to use to build the learning model. This paper introduces a new method that combines the information generated by two different statistical learning classification algorithms and uses it as input to another learning model in order to increase the final predictive power. The dataset used in this paper is NASA’s Turbofan Engine Degradation data. There are 49 numeric features (X), and the response Y is binary, with 0 indicating that the engine is working properly and 1 indicating engine failure. The model’s purpose is to predict whether the engine will pass or fail. The dataset is divided into a training set and a testing set. First, the training set is used twice, to build a support vector machine (SVM) model and a neural network model. Second, the trained SVM and neural network models take X of the training set as input to predict Y1 and Y2, respectively. Then Y1 and Y2 are used as inputs to build a penalized logistic regression model, which is the ensemble model here. Finally, the testing set is passed through the same steps to obtain the final prediction. Model accuracy is measured as overall classification accuracy. The results show that the ensemble model has 92% accuracy.
The prediction accuracies of the SVM, neural network and ensemble models are compared to show that the ensemble model successfully captures the power of the two individual learning models.
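The stacking steps described in this abstract can be sketched roughly as follows. This is a minimal illustration in pure Python, not the thesis's implementation: the data are synthetic two-feature toy data (not the NASA dataset), the base learners are a small linear SVM and a one-hidden-layer network written by hand, and Y1 and Y2 are taken to be the SVM decision score and the network's predicted probability, which is an assumption since the abstract does not say whether scores or hard labels are used. All function names and hyperparameters are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def fit_linear_svm(X, y, lam=0.01, epochs=50, lr=0.1):
    """Linear SVM: subgradient descent on the L2-regularized hinge loss."""
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            s = 1 if yi == 1 else -1          # labels mapped to -1 / +1
            violated = s * (dot(w, xi) + b) < 1
            for j in range(d):
                w[j] -= lr * (lam * w[j] - (s * xi[j] if violated else 0.0))
            if violated:
                b += lr * s
    return lambda x: dot(w, x) + b            # decision score (Y1)

def fit_tiny_nn(X, y, hidden=4, epochs=50, lr=0.1, seed=0):
    """One-hidden-layer network (tanh units, sigmoid output), SGD on log loss."""
    rng = random.Random(seed)
    d = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(dot(W1[k], x) + b1[k]) for k in range(hidden)]
        return h, sigmoid(dot(W2, h) + b2)

    for _ in range(epochs):
        for xi, yi in zip(X, y):
            h, p = forward(xi)
            err = p - yi                      # d(log loss)/d(output pre-activation)
            for k in range(hidden):
                gh = err * W2[k] * (1.0 - h[k] ** 2)
                W2[k] -= lr * err * h[k]
                for j in range(d):
                    W1[k][j] -= lr * gh * xi[j]
                b1[k] -= lr * gh
            b2 -= lr * err
    return lambda x: forward(x)[1]            # predicted probability (Y2)

def fit_penalized_logit(Z, y, lam=0.1, epochs=300, lr=0.1):
    """L2-penalized logistic regression via batch gradient descent."""
    n, d = len(Z), len(Z[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errs = [sigmoid(dot(w, zi) + b) - yi for zi, yi in zip(Z, y)]
        for j in range(d):
            g = sum(e * zi[j] for e, zi in zip(errs, Z)) / n
            w[j] -= lr * (g + lam * w[j])
        b -= lr * sum(errs) / n
    return lambda z: sigmoid(dot(w, z) + b)

def make_data(n, rng):
    # Synthetic stand-in data: class 1 centered at (1, 1), class 0 at (-1, -1).
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        c = 1.0 if label else -1.0
        X.append([rng.gauss(c, 1.0), rng.gauss(c, 1.0)])
        y.append(label)
    return X, y

rng = random.Random(42)
X_train, y_train = make_data(200, rng)
X_test, y_test = make_data(100, rng)

# Step 1: build the two base models on the training set.
svm = fit_linear_svm(X_train, y_train)
nn = fit_tiny_nn(X_train, y_train)

def stack_inputs(X):
    # Step 2: the base-model outputs (Y1, Y2) become the meta-model's inputs.
    return [[svm(x), nn(x)] for x in X]

# Step 3: fit the penalized logistic regression ensemble on (Y1, Y2).
meta = fit_penalized_logit(stack_inputs(X_train), y_train)

# Step 4: run the testing set through the same steps and score it.
ens_pred = [1 if meta(z) >= 0.5 else 0 for z in stack_inputs(X_test)]
accuracy = sum(p == t for p, t in zip(ens_pred, y_test)) / len(y_test)
print(f"ensemble accuracy on toy data: {accuracy:.2f}")
```

Note that, as in the abstract, the meta-model here is trained on base-model predictions for the same training set the base models were fit on. A common variant uses out-of-fold predictions instead, to reduce the risk of the meta-model trusting overfit base models.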
2

Schätzverfahren für individuelles Preissetzungsverhalten im Lebensmitteleinzelhandel / Estimation methods for individual pricesetting behavior in the retail sector

Schulze Bisping, Christin 17 November 2017
No description available.
3

Investigating Gene-Gene and Gene-Environment Interactions in the Association Between Overnutrition and Obesity-Related Phenotypes

Tessier, François January 2017
Introduction – Animal studies suggested that NFKB1, SOCS3 and IKBKB genes could be involved in the association between overnutrition and obesity. This study aims to investigate interactions involving these genes and nutrition affecting obesity-related phenotypes. Methods – We used multifactor dimensionality reduction (MDR) and penalized logistic regression (PLR) to better detect gene/environment interactions in data from the Toronto Nutrigenomics and Health Study (n=1639) using dichotomized body mass index (BMI) and waist circumference (WC) as obesity-related phenotypes. Exposure variables included genotypes on 54 single nucleotide polymorphisms, dietary factors and ethnicity. Results – MDR identified interactions between SOCS3 rs6501199 and rs4969172, and IKBKB rs3747811 affecting BMI in whites; SOCS3 rs6501199 and NFKB1 rs1609798 affecting WC in whites; and SOCS3 rs4436839 and IKBKB rs3747811 affecting WC in South Asians. PLR found a main effect of SOCS3 rs12944581 on BMI among South Asians. Conclusion – MDR and PLR gave different results, but support some results from previous studies.
4

High-dimensional classification and attribute-based forecasting

Lo, Shin-Lian 27 August 2010
This thesis consists of two parts. The first part focuses on high-dimensional classification problems in microarray experiments. The second part deals with forecasting problems in which the predictors have a large number of categories. Classification problems in microarray experiments involve discriminating between subjects with different biological phenotypes or known tumor subtypes, as well as predicting the clinical outcomes or prognostic stages of subjects. One important characteristic of microarray data is that the number of genes is much larger than the sample size. The penalized logistic regression method is known for performing variable selection and classification simultaneously. However, the performance of this method declines as the number of variables increases. To address this concern, in the first study we propose a new classification approach that applies the penalized logistic regression method iteratively to gene subsets of controlled size, in order to maintain variable selection consistency and classification accuracy. The second study is motivated by a modern microarray experiment that includes two layers of replicates. Under this new experimental setting, most existing classification methods, including penalized logistic regression, cannot be applied directly because the assumption of independent observations is violated. To solve this problem, we propose a new classification method that incorporates random effects into penalized logistic regression so that the heterogeneity among different experimental subjects and the correlations from repeated measurements can be taken into account. An efficient hybrid algorithm is introduced to tackle computational challenges in estimation and integration. Applications to a breast cancer study show that the proposed classification method obtains smaller models with higher prediction accuracy than the method based on the assumption of independent observations.
The second part of this thesis develops a new forecasting approach for large-scale datasets with a large number of predictor categories and with structured predictors. Going beyond conventional tree-based methods, the new approach incorporates a general linear model and hierarchical splits to make trees more comprehensive, efficient, and interpretable. In an empirical study in the air cargo industry and a simulation study covering several different settings, the new approach achieves higher forecasting accuracy and higher computational efficiency than existing tree-based methods.
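To illustrate what "simultaneous variable selection and classification" means for penalized logistic regression in the first part of this abstract, here is a small pure-Python sketch of L1-penalized logistic regression fitted by proximal gradient descent (soft-thresholding). It is a generic textbook-style example under stated assumptions, not the iterative gene-subset method or the random-effects method proposed in the thesis: the choice of an L1 penalty, the synthetic data (40 samples, 60 features, of which the first 3 carry signal), and all names and hyperparameters are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_l1_logit(X, y, lam=0.1, lr=0.1, iters=300):
    """L1-penalized logistic regression via proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        # Residuals at the current point, shared by all coordinate gradients.
        errs = [
            sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for xi, yi in zip(X, y)
        ]
        for j in range(p):
            g = sum(e * xi[j] for e, xi in zip(errs, X)) / n
            # Gradient step on the log loss, then shrinkage from the penalty.
            w[j] = soft_threshold(w[j] - lr * g, lr * lam)
        b -= lr * sum(errs) / n               # the intercept is not penalized
    return w, b

# Synthetic "more variables than samples" data (illustrative only).
rng = random.Random(0)
n, p, informative = 40, 60, 3
y = [i % 2 for i in range(n)]
X = []
for yi in y:
    shift = 1.5 if yi == 1 else -1.5
    X.append([rng.gauss(shift if j < informative else 0.0, 1.0) for j in range(p)])

w, b = fit_l1_logit(X, y)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print(f"{len(selected)} of {p} variables kept; indices: {selected}")
```

Coefficients that the penalty drives exactly to zero drop out of the model, so the fitted classifier and the selected variable set come out of the same optimization. Raising `lam` keeps fewer variables; lowering it keeps more.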
