1

Stacking Ensemble for auto_ml

Ngo, Khai Thoi 13 June 2018
Machine learning has been a subject of intense study across many industries and academic research areas. Companies and researchers have taken full advantage of various machine learning approaches to solve their problems; however, considerable understanding and study of the field is required for developers to fully harness the potential of different machine learning models and to achieve efficient results. Therefore, this thesis begins by comparing auto_ml with other hyper-parameter optimization techniques. auto_ml is a fully autonomous framework that lessens the knowledge prerequisite for accomplishing complicated machine learning tasks. The auto_ml framework automatically selects the best features from a given dataset and chooses the best model to fit and predict the data. Across multiple tests, auto_ml outperforms MLP and other similar frameworks on various datasets while using a small amount of processing time. The thesis then proposes and implements a stacking ensemble technique in the auto_ml framework in order to build in protection against over-fitting on small datasets. Stacking is a technique that combines the predictions of a collection of machine learning models to arrive at a final prediction. The stacked auto_ml ensemble results are more stable and consistent than those of the original framework across different training sizes of all analyzed small datasets. / Master of Science
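As an illustration of the stacking idea this abstract describes, the sketch below builds a small stacked regressor with scikit-learn; the base models, meta-learner, and synthetic small dataset are assumptions made for illustration, not the thesis's actual auto_ml code.

```python
# Minimal sketch of stacking as a guard against over-fitting on small data.
# The estimators and the synthetic dataset are illustrative assumptions.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

# Small synthetic dataset to mimic the small-data setting the thesis targets.
X, y = make_regression(n_samples=120, n_features=10, noise=10.0, random_state=0)

# Base models produce out-of-fold predictions; a simple meta-learner combines
# them, which limits over-fitting compared with trusting any single model.
stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
        ("svr", SVR(kernel="rbf")),
    ],
    final_estimator=Ridge(alpha=1.0),
    cv=5,  # out-of-fold predictions are fed to the meta-learner
)

print(cross_val_score(stack, X, y, cv=5, scoring="r2").mean())
```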
2

Stacking Ensemble Classification applied to US flight delay prediction during the COVID-19 pandemic

Schwarz, Patrick January 2022
This thesis aims to show that a Stacking Ensemble of multiple base-learners can provide a more accurate prediction of commercial flight delays between the ten largest US airports than the individual prediction models. Three types of machine learning models, namely LASSO, Random Forests and Neural Networks, are used as base-learners with different hyper-parameters. A Stacking Ensemble is created by using LASSO as the meta-learner. The Stacking Ensemble and the base-learners that performed best on the training data are then evaluated on a test data set, and the results are compared using the metrics accuracy, ROC AUC, MCC and F1 score. It is shown that the Stacking Ensemble is able to provide superior predictions of flight delays in comparison to the best individual models.
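A rough sketch of the setup described above, assuming scikit-learn: LASSO, Random Forest and Neural Network base-learners are stacked, with an L1-penalised logistic regression standing in for the LASSO meta-learner since the task is classification. The synthetic data, hyper-parameters and metric calls are placeholders, not the thesis's configuration.

```python
# Stacking classifier sketch with an L1 (LASSO-style) meta-learner.
# Data, hyper-parameters and class balance are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, matthews_corrcoef, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.7, 0.3], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("lasso", LogisticRegression(penalty="l1", solver="liblinear", C=1.0)),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("nn", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)),
    ],
    final_estimator=LogisticRegression(penalty="l1", solver="liblinear"),
    cv=5,
)
stack.fit(X_tr, y_tr)

# Evaluate with the same four metrics the abstract mentions.
pred = stack.predict(X_te)
proba = stack.predict_proba(X_te)[:, 1]
print("accuracy", accuracy_score(y_te, pred))
print("ROC AUC ", roc_auc_score(y_te, proba))
print("MCC     ", matthews_corrcoef(y_te, pred))
print("F1      ", f1_score(y_te, pred))
```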
3

Differential evolution technique on weighted voting stacking ensemble method for credit card fraud detection

Dolo, Kgaugelo Moses 12 1900
Differential Evolution is a stochastic, population-based optimization technique that is powerful and efficient over continuous spaces for solving non-linear and non-differentiable optimization problems. The weighted voting stacking ensemble method is an important technique that combines various classifier models. However, selecting appropriate weights for the classifier models so that transactions are classified correctly is a problem. This research study is therefore aimed at exploring whether the Differential Evolution optimization method is a good approach for defining the weighting function. Manual and random selection of weights for voting on credit card transactions has previously been carried out; however, a large number of fraudulent transactions were not detected by the classifier models, which means that a technique to overcome the weaknesses of the classifier models is required. Thus, the problem of selecting the appropriate weights was viewed in this study as a weight-optimization problem. The dataset was downloaded from the Kaggle competition data repository. Various machine learning algorithms were used in a weighted vote on the class of each transaction, and the Differential Evolution optimization technique was used as the weighting function. In addition, the Synthetic Minority Oversampling Technique (SMOTE) and Safe-Level Synthetic Minority Oversampling Technique (SL-SMOTE) oversampling algorithms were modified to preserve the definition of SMOTE while improving performance. The results of this research study showed that the Differential Evolution optimization method is a good weighting function, which can be adopted as a systematic weighting function for the weighted voting stacking ensemble of various classification methods. / School of Computing / M. Sc. (Computing)
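A simplified sketch of the core idea above, assuming SciPy and scikit-learn: Differential Evolution searches for the voting weights that maximise F1 on a validation split. The classifiers, the synthetic imbalanced data and the objective are illustrative assumptions, and the SMOTE/SL-SMOTE resampling step is omitted.

```python
# Differential Evolution used to optimise weights for a weighted soft vote.
# Classifiers, data and objective are placeholders, not the thesis's setup.
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Highly imbalanced synthetic data to mimic a fraud-detection setting.
X, y = make_classification(n_samples=3000, n_features=20, weights=[0.97, 0.03], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

models = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=200, random_state=0),
    DecisionTreeClassifier(random_state=0),
]
# Out-of-sample probability of the minority (fraud) class from each classifier.
probas = np.column_stack([m.fit(X_tr, y_tr).predict_proba(X_val)[:, 1] for m in models])

def negative_f1(weights):
    """DE objective: weighted soft vote scored by (negative) F1 on validation data."""
    w = np.asarray(weights)
    w = w / w.sum() if w.sum() > 0 else np.full_like(w, 1.0 / len(w))
    votes = probas @ w
    return -f1_score(y_val, (votes >= 0.5).astype(int))

result = differential_evolution(negative_f1, bounds=[(0.0, 1.0)] * len(models), seed=0)
print("optimised weights:", result.x, "validation F1:", -result.fun)
```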
4

Using Ensemble Machine Learning Methods in Estimating Software Development Effort

Kanneganti, Alekhya January 2020
Background: Software development effort estimation is a process that focuses on estimating the effort required to develop a software project within a minimal budget. Estimating effort includes interpreting the required manpower, resources, time and schedule. Project managers are responsible for estimating the required effort, so a model that can predict software development effort efficiently comes in handy and acts as a decision support system that helps project managers estimate effort more precisely. The context of this study is therefore to increase the efficiency of estimating software development effort. Objective: The main objective of this thesis is to identify an effective ensemble method, build and implement it, and use it to estimate software development effort. In addition, parameter tuning is implemented to improve the performance of the model. Finally, we compare the results of the developed model with those of existing models. Method: In this thesis, we adopted two research methods. Initially, a literature review was conducted to gain knowledge of the existing studies, machine learning techniques, datasets and ensemble methods previously used in estimating software development effort. A controlled experiment was then conducted in order to build an ensemble model and to evaluate its performance, determining whether the developed model performs better than the existing models. Results: After conducting the literature review and collecting evidence, we decided to build and implement a stacked generalization ensemble method in this thesis, using individual machine learning techniques such as the Support Vector Regressor (SVR), K-Nearest Neighbors Regressor (KNN), Decision Tree Regressor (DTR), Linear Regressor (LR), Multi-Layer Perceptron Regressor (MLP), Random Forest Regressor (RFR), Gradient Boosting Regressor (GBR), AdaBoost Regressor (ABR) and XGBoost Regressor (XGB). Likewise, we decided to implement Randomized Parameter Optimization and the SelectKBest function for feature selection. The COCOMO81, MAXWELL, ALBRECHT and DESHARNAIS datasets were used. The results of the experiment show that the developed ensemble model performs best for three out of four datasets. Conclusion: After evaluating and analyzing the results obtained, we can conclude that the developed model works well with datasets that have continuous, numeric values. We can also conclude that the developed ensemble model outperforms other existing models when implemented with the COCOMO81, MAXWELL and ALBRECHT datasets.
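A condensed sketch of the kind of pipeline this abstract outlines, assuming scikit-learn: SelectKBest feature selection feeding a stacked-generalization regressor tuned with randomized parameter search. The synthetic data, the subset of base regressors and the parameter ranges are placeholders, not the thesis's values.

```python
# Stacked generalization for effort estimation: feature selection + stacking
# + randomized hyper-parameter search. All concrete values are illustrative.
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor, StackingRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.svm import SVR

# Small numeric dataset standing in for effort-estimation data such as COCOMO81.
X, y = make_regression(n_samples=60, n_features=15, noise=5.0, random_state=0)

stack = StackingRegressor(
    estimators=[
        ("svr", SVR()),
        ("knn", KNeighborsRegressor()),
        ("rf", RandomForestRegressor(random_state=0)),
        ("gbr", GradientBoostingRegressor(random_state=0)),
    ],
    final_estimator=LinearRegression(),
    cv=5,
)

pipeline = Pipeline([
    ("select", SelectKBest(score_func=f_regression)),
    ("stack", stack),
])

# Randomized Parameter Optimization over a few illustrative hyper-parameters.
search = RandomizedSearchCV(
    pipeline,
    param_distributions={
        "select__k": randint(5, 15),
        "stack__rf__n_estimators": randint(50, 300),
        "stack__knn__n_neighbors": randint(2, 10),
    },
    n_iter=10,
    cv=3,
    scoring="neg_mean_absolute_error",
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```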
