31 |
Research on a Heart Disease Prediction Model Based on the Stacking Principle
Li, Jianeng. January 2020
In this study, the prediction model based on the Stacking principle is called the Stacking fusion model. There is little evidence that the Stacking fusion model predicts better than other classification models in the field of heart disease diagnosis. Because this model belongs to the family of ensemble learning models, which have poor interpretability, it should be used with caution in medical diagnosis. The purpose of this study is to verify whether the Stacking fusion model has better prediction performance than stand-alone machine learning models and other ensemble classifiers in the field of heart disease diagnosis, and to find ways to explain this model. This study uses experiments and quantitative analysis to evaluate the prediction performance of eight models in terms of prediction ability, algorithmic stability, false negative rate and run-time. The results show that the Stacking fusion model with a Naive Bayes classifier, XGBoost and Random Forest as the first-level learners is superior to the other classifiers in prediction ability. The false negative rate of this model is also outstanding. Furthermore, the Stacking fusion model is explained through both its working principle and the SHAP framework. The SHAP framework shows which factors the model judges to be important influences on heart disease, and how the values of these factors relate to the predicted probability of disease. Overall, the two research questions addressed in this study help reveal the prediction performance and reliability of the cardiac disease prediction model based on the Stacking principle. This study provides practical and theoretical support for hospitals to use the Stacking principle in the diagnosis of heart disease.
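One of the evaluation criteria named in this abstract, the false negative rate, can be illustrated with a minimal sketch. The labels below are hypothetical and the function name is an assumption made for illustration; this is not the thesis's evaluation code.

```python
def false_negative_rate(y_true, y_pred):
    """Fraction of truly positive cases (label 1) that were predicted negative (label 0)."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    if fn + tp == 0:
        raise ValueError("y_true contains no positive cases")
    return fn / (fn + tp)


# Hypothetical labels: 1 = heart disease present, 0 = absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1, 0, 1]
fnr = false_negative_rate(y_true, y_pred)
```

In a diagnostic setting a false negative is a missed case of disease, which is presumably why the study reports this rate alongside overall prediction ability.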
|
32 |
Generalized super-resolution of 4D Flow MRI : extending capabilities using ensemble learning / Allmän superupplösning av 4D MRI Flöde : utvidgad användning genom ensemblelärande
Hjalmarsson, Adam; Ericsson, Leon. January 2023
4D Flow Magnetic Resonance Imaging (4D Flow MRI) is a novel non-invasive technique for imaging cardiovascular blood flow. However, when used as a stand-alone analysis method, 4D Flow MRI has certain limitations, including limited spatial resolution and noise artefacts, which motivate the application of dedicated post-processing tools. Learning-based super-resolution (SR) has emerged as a promising tool for such work, but these efforts have more often than not been confined to a narrowly defined cardiovascular domain. There has been limited exploration of how learned super-resolution models perform across multiple cardiovascular compartments, with the wide range of hemodynamic conditions across the cardiovasculature posing an apparent challenge. To address this, we investigate the generalizability of 4D Flow MRI super-resolution using ensemble learning. Our investigation employs ensemble learning techniques, specifically bagging and stacking, with a convolutional neural network (4DFlowNet) serving as the framework for all base learners. To assist in training, synthetic training data was extracted from patient-specific, physics-based velocity fields derived from computational fluid dynamics (CFD) simulations conducted in three key compartments: the aorta, the brain and the heart. Varying base and ensemble networks were then trained on pairs of high-resolution and low-resolution synthetic data, with performance quantitatively assessed as a function of cardiovascular domain and specific architecture. To ensure clinical relevance, we also evaluated model performance on clinically acquired MRI data from the same three compartments. We find that ensemble models improve performance compared to isolated equivalents. Our ensemble model Stacking Block-3 improves the in-silico error rate by 16.22 points across the average domain. Additionally, performance on the aorta, brain and heart improves by 2.66, 5.81 and 2.00 points respectively. Employing both qualitative and quantitative evaluation methods on the in-vivo data, we find that ensemble models produce super-resolved velocity fields that are quantitatively consistent with ground truth reference data and visually pleasing. To conclude, ensemble learning has shown potential in generalizing 4D Flow MRI super-resolution across multiple cardiovascular compartments.
|
33 |
Credit Card Approval Prediction : A comparative analysis between logistic regression classifier, random forest classifier, support vector classifier with ensemble bagging classifier.
Janapareddy, Dhanush; Yenduri, Narendra Chowdary. January 2023
Background. Due to an increasing number of credit card defaulters, companies are now taking greater precautions when approving credit applications. When a customer meets certain requirements, credit card firms typically use their experience to decide whether to grant them a credit card. Additionally, a few machine learning methods have been applied to support the final decision.
Objectives. The aim of this thesis is to compare the accuracy of the logistic regression classifier, random forest classifier, and support vector classifier with the ensemble bagging classifier for predicting credit card approval.
Methods. This thesis follows a method called general experimentation to determine the most accurate classification technique for predicting credit card approval. The dataset is taken from Kaggle and contains information about credit card applications. The selected algorithms are trained on the training data, validated on the validation data, and then evaluated on the testing data using metrics such as accuracy, precision, recall, F1 score, and ROC curve. Next, an ensemble bagging technique is applied to combine the predictions of these models using majority voting to create an ensemble model. Finally, the performance of the ensemble model is evaluated on the testing data and its accuracy is compared to that of the individual models to identify the most accurate classification technique for predicting credit card approval.
Results. Among the four selected machine learning algorithms, the random forest classifier performed best, with an accuracy of 88.41% on the testing dataset. The second-best algorithm is the ensemble bagging classifier, with an accuracy of 84.78%. Hence, the random forest classifier is the most accurate algorithm for predicting credit card approval.
Conclusions. After evaluating various classifiers, including the logistic regression classifier, random forest classifier, support vector classifier, and ensemble bagging, it was observed that the random forest classifier outperformed the other models in terms of prediction accuracy. This indicates that the random forest classifier was better at predicting credit card approval.
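The majority-voting step described under Methods can be sketched as follows. The three prediction lists are hypothetical stand-ins for the outputs of the logistic regression, random forest and support vector classifiers; they are not data from the thesis, and the helper names are assumptions.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine one label list per model into a single list by per-sample majority vote."""
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns [(label, count)] for the most frequent label
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(y_true, y_pred):
    """Share of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical test-set predictions (1 = approve, 0 = reject).
logreg = [1, 0, 1, 1, 0, 0]
forest = [1, 1, 1, 0, 0, 0]
svc = [0, 1, 1, 1, 0, 1]
y_true = [1, 1, 1, 1, 0, 0]

ensemble = majority_vote([logreg, forest, svc])
ensemble_accuracy = accuracy(y_true, ensemble)
```

With an odd number of voters and two labels, ties cannot occur. Voting helps in this toy case because the three models err on different samples; as the Results above show, a voted ensemble is not guaranteed to beat its best member.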
|
34 |
N-SLOPE: A One-Class Classification Ensemble for Nuclear Forensics
Kehl, Justin. 01 June 2018
One-class classification is a specialized form of classification from the field of machine learning. Traditional classification attempts to assign unknowns to known classes, but cannot handle novel unknowns that do not belong to any of the known classes. One-class classification seeks to identify these outliers while still assigning the remaining unknowns to the appropriate classes. One-class classification is applied here to the field of nuclear forensics, which is the study and analysis of nuclear material for the purpose of nuclear incident investigations. Nuclear forensics data poses an interesting challenge because false positive identification can prove costly and data sets are often small, high-dimensional, and sparse, which is problematic for most machine learning approaches. A web application is built using the R programming language and the shiny framework that incorporates N-SLOPE: a machine learning ensemble. N-SLOPE combines five existing one-class classifiers with a novel one-class classifier introduced here and uses ensemble learning techniques to combine their output. N-SLOPE is validated on three distinct data sets: Iris, Obsidian, and Galaxy Serpent 3, which is an enhanced version of a recent international nuclear forensics exercise. N-SLOPE achieves high classification accuracy on the three data sets (100%, 83.33%, and 83.33%, respectively), while reducing the false positive detection rate to 0% across the board and correctly detecting every novel unknown in each data set. N-SLOPE is shown to be a useful and powerful tool to aid in nuclear forensic investigations.
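The general idea of a one-class ensemble that can answer "none of the known classes" can be sketched as below. The two detectors (a padded range check and a z-score check on a single feature) are simple stand-ins chosen for illustration; they are not the six classifiers used in N-SLOPE, and the thesis itself is implemented in R rather than Python.

```python
from statistics import mean, stdev


def range_detector(train, margin=0.1):
    """Accept values inside the training range, padded by a fraction of its width."""
    lo, hi = min(train), max(train)
    pad = margin * (hi - lo)
    return lambda x: lo - pad <= x <= hi + pad


def zscore_detector(train, k=3.0):
    """Accept values within k sample standard deviations of the training mean."""
    mu, sigma = mean(train), stdev(train)
    return lambda x: abs(x - mu) <= k * sigma


def build_ensemble(class_samples):
    """One list of one-class detectors per known class."""
    return {label: [range_detector(v), zscore_detector(v)]
            for label, v in class_samples.items()}


def classify(ensemble, x):
    """Return the first class whose detectors all accept x, else 'unknown'."""
    for label, detectors in ensemble.items():
        # Requiring unanimous acceptance is a conservative rule that limits false positives.
        if all(d(x) for d in detectors):
            return label
    return "unknown"


# Hypothetical one-dimensional measurements for two known classes.
ensemble = build_ensemble({
    "A": [1.0, 1.1, 0.9, 1.05, 0.95],
    "B": [5.0, 5.2, 4.8, 5.1, 4.9],
})
labels = [classify(ensemble, x) for x in (1.02, 5.05, 3.0)]
```

The third sample falls outside both classes and is reported as a novel unknown rather than being forced into the nearest class.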
|
35 |
CLEAVER: Classification of Everyday Activities via Ensemble Recognizers
Hsu, Samantha. 01 December 2018
Physical activity can have immediate and long-term benefits on health and reduce the risk for chronic diseases. Valid measures of physical activity are needed in order to improve our understanding of the exact relationship between physical activity and health. Activity monitors have become a standard for measuring physical activity; accelerometers in particular are widely used in research and consumer products because they are objective, inexpensive, and practical. Previous studies have experimented with different monitor placements and classification methods. However, the majority of these methods were developed using data collected in controlled, laboratory-based settings, which is not reliably representative of real-life data. Therefore, more work is required to validate these methods in free-living settings.
For our work, 25 participants were directly observed by trained observers for two two-hour activity sessions over a seven-day timespan. During the sessions, the participants wore accelerometers on the wrist, thigh, and chest. In this thesis, we tested a battery of machine learning techniques, including a hierarchical classification schema and a confusion matrix boosting method, to predict activity type, activity intensity, and sedentary time in one-second intervals. To do this, we created a dataset containing almost 100 hours' worth of observations from three sets of accelerometer data: an ActiGraph wrist monitor, a BioStampRC thigh monitor, and a BioStampRC chest monitor. Random forest and k-nearest neighbors are shown to consistently perform the best out of our traditional machine learning techniques. In addition, we reduce the severity of error from our traditional random forest classifiers on some monitors using a hierarchical classification approach, and combat the imbalanced nature of our dataset using a multi-class (confusion matrix) boosting method. Out of the three monitors, our models most accurately predict activity using either or both of the BioStamp accelerometers (with the exception of the chest BioStamp predicting sedentary time). Our results show that we outperform previous methods while still predicting behavior at a more granular level.
|
36 |
Assessment of building renovations using Ensemble Learning
Lieutier, Paul. January 2023
In the context of global warming, renovating poorly insulated buildings is an unavoidable policy for reducing energy consumption. However, most studies concerning the efficiency of renovation work do not rely on energy data from smart meters but rather on estimates. To develop a precise tool to assess the quality of renovation work, several ensemble models were tested and compared with existing ones. Each model learns the consumption habits before the date of the works and then predicts what the energy load curve would have been if the works had not been carried out. The prediction is finally compared to the actual energy load to infer the savings over the same dataset. The models were compared using precision and time complexity metrics. The best ensemble model's precision scores are equivalent to the state of the art. Moreover, the developed model is 32 times quicker to fit and predict.
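The counterfactual logic described in this abstract (learn pre-renovation consumption, predict the load that would have occurred without the works, subtract the actual load) can be sketched with a deliberately simple baseline. The hourly-mean profile below is an assumption standing in for the thesis's ensemble models, and the readings are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_hourly_profile(pre_readings):
    """pre_readings: list of (hour_of_day, kWh) from before the works.
    Returns the mean consumption per hour of day."""
    by_hour = defaultdict(list)
    for hour, kwh in pre_readings:
        by_hour[hour].append(kwh)
    return {hour: mean(values) for hour, values in by_hour.items()}


def estimate_savings(profile, post_readings):
    """Sum of (predicted counterfactual load - actual load) over post-renovation readings."""
    return sum(profile[hour] - kwh for hour, kwh in post_readings)


# Hypothetical smart-meter readings as (hour of day, kWh).
pre = [(0, 2.0), (0, 2.2), (12, 4.0), (12, 3.8)]
post = [(0, 1.5), (12, 3.0)]

profile = fit_hourly_profile(pre)
savings = estimate_savings(profile, post)
```

A positive result means the building consumed less than the counterfactual prediction. A realistic model would also account for weather and seasonality, which this sketch ignores.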
|
37 |
Road damage detection with Yolov8 on Swedish roads
Eriksson, Martin. January 2023
This thesis addresses the problem of Road Damage Detection using object detection models, Yolov8 and Yolov5. While Yolov5 has been utilized in prior road damage detection projects, this work introduces the application of the newly released Yolov8 model to this domain. We have prepared a dataset of 3,000 annotated images of road damage in Sweden and applied various Yolov8 and Yolov5 models to this dataset and a larger international one. The potential of deploying a lightweight Yolov8 model in a smartphone application for real-time detection, as well as the effectiveness of an ensemble approach combining several models, were also explored. The results show an F1 score of 0.57 and 0.6 for the best-performing models on the Swedish dataset and an international Road damage dataset respectively. Several box clustering methods were tested to combine the predictions of the ensemble, but none outperformed the best individual model. A quantized version of Yolov8 was deployed to a smartphone device with satisfying performance. This work aims to create a model which can ultimately be used to improve road safety and quality.
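The abstract does not say which box clustering methods were tested. As a generic illustration of the idea, the sketch below pools detections from several models, groups boxes by intersection over union (IoU) with the highest-scoring box, and replaces each group with a score-weighted average box. The function names, the threshold and the detections are all assumptions, and class labels are ignored for brevity.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_boxes(detections, iou_threshold=0.5):
    """detections: list of (box, score) pooled from all models, scores > 0.
    Returns one (box, mean score) per cluster of overlapping boxes."""
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    fused = []
    while remaining:
        anchor = remaining[0]
        rest = remaining[1:]
        # Boxes overlapping the highest-scoring box join its cluster.
        cluster = [anchor] + [d for d in rest if iou(anchor[0], d[0]) >= iou_threshold]
        remaining = [d for d in rest if iou(anchor[0], d[0]) < iou_threshold]
        total = sum(score for _, score in cluster)
        box = tuple(sum(b[i] * s for b, s in cluster) / total for i in range(4))
        fused.append((box, total / len(cluster)))
    return fused


# Hypothetical detections from two models looking at the same image.
detections = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 11, 11), 0.6),
    ((50, 50, 60, 60), 0.8),
]
fused = fuse_boxes(detections)
```

Averaging can blur a well-placed box with a poorly placed one, which is one plausible reason an ensemble of this kind may fail to beat its best member, as reported above.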
|
38 |
Semi-supervised Ensemble Learning Methods for Enhanced Prognostics and Health Management
Shi, Zhe. 15 May 2018
No description available.
|
39 |
Classification in High Dimensional Feature Spaces through Random Subspace Ensembles
Pathical, Santhosh P. January 2010
No description available.
|
40 |
Total Organic Carbon and Clay Estimation in Shale Reservoirs Using Automatic Machine Learning
Hu, Yue. 21 September 2021
High total organic carbon (TOC) and low clay content are two criteria used to identify the "sweet spots" in shale gas plays. Recently, machine learning has proved effective for estimating TOC and clay from well logs. The remaining questions are which algorithm to choose in the first place and whether already built models can be improved. Automatic machine learning (AutoML) appears to be a promising tool for answering those practical questions, because it trains multiple models and compares them automatically. Two wells with conventional well logs and elemental capture spectroscopy are selected from a shale gas play to test AutoML's ability in TOC and clay estimation. TOC and clay content are extracted from Schlumberger's ELAN interpretation and calibrated to cores. Generalizability is demonstrated on the blind test well, where the mean absolute test errors for TOC and clay estimation are 0.23% and 3.77%. 829 data points are used to generate the final models with a train-test ratio of 75:25. The mean absolute test errors are 0.26% and 2.68% for TOC and clay, respectively, which are very low for TOC ranging from 0-6% and clay from 35-65%. The results show AutoML's success and efficiency in the estimation. The trained models are interpreted to understand the variables' effects on the predictions. 235 wells are selected through data quality checking and fed into the models to create TOC and clay distribution maps. The maps provide guidance on where to drill a new well for higher shale gas production. / Master of Science / Locating "sweet spots", where shale gas production is much higher than in average areas, is critical for a shale reservoir's successful commercial exploitation. Among the properties of shale, total organic carbon (TOC) and clay content are often selected to evaluate the gas production potential. For TOC and clay estimation, multiple machine learning models have been tested in recent studies and have proved successful. The questions are which algorithm to choose for a specific task and whether already built models can be improved. Automatic machine learning (AutoML) has the potential to solve these problems by automatically training multiple models and comparing them to achieve the best performance. In our study, AutoML is tested to estimate TOC and clay using data from two gas wells in a shale gas field. First, one well is treated as a blind test well and the other is used as the training well to examine generalizability. The mean absolute errors for TOC and clay content are 0.23% and 3.77%, indicating reliable generalization. Final models are built using 829 data points, which are split into train and test sets with a ratio of 75:25. The mean absolute test errors are 0.26% and 2.68% for TOC and clay, respectively, which are very low for TOC ranging from 0-6% and clay from 35-65%. Moreover, AutoML requires very limited human effort and liberates researchers and engineers from the tedious parameter-tuning process that is a critical part of machine learning. Trained models are interpreted to understand the mechanism behind the models. Distribution maps of TOC and clay are created by selecting 235 gas wells that pass the data quality checking, feeding them into the trained models, and interpolating. The maps provide guidance on where to drill a new well for higher shale gas production.
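The core AutoML idea in this abstract (train several candidate models, score each by mean absolute error on held-out data, keep the best) can be sketched as below. The two candidates, the synthetic data and all names are assumptions for illustration; the abstract does not name the AutoML library or the models it tried. The 75:25 split mirrors the ratio quoted above.

```python
import random
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def fit_mean(xs, ys):
    """Candidate 1: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Candidate 2: ordinary least-squares line through one feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def select_model(xs, ys, candidates, train_fraction=0.75, seed=0):
    """Fit every candidate on a random train split and rank by held-out MAE."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_fraction)
    train, held_out = idx[:cut], idx[cut:]
    scores = {}
    for name, fit in candidates.items():
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores[name] = mae([ys[i] for i in held_out],
                           [model(xs[i]) for i in held_out])
    return min(scores, key=scores.get), scores


# Synthetic, exactly linear data standing in for a log-derived feature and TOC.
xs = [float(i) for i in range(20)]
ys = [0.3 * x + 1.0 for x in xs]
best, scores = select_model(xs, ys, {"mean": fit_mean, "line": fit_line})
```

In practice the split used to choose a model should be kept separate from the one used to report the final test error; a single split serves both purposes here only to keep the sketch short.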
|