Global ETD Search

21	Machine Learning for Outcome Prediction of High-Risk Trauma Patients in the Emergency Department Cardosi, Joshua David January 2021 No description available. Mechanical Engineering machine learning emergency department critical care mortality missing data neural network XGBoost LightGBM
22	CAN STATISTICAL MODELS BEAT BENCHMARK PREDICTIONS BASED ON RANKINGS IN TENNIS? Svensson, William January 2021 The aim of this thesis is to beat a benchmark prediction accuracy of 64.58 percent based on player rankings on the ATP tour in tennis. The benchmark deems the better-ranked player in each match the winner. Three statistical models are used: logistic regression, random forest, and XGBoost. The data cover the years 2000-2010 and comprise over 60 000 observations with 49 variables each. After the data were prepared, new variables were created, and the differences between the two players were taken, all three statistical models outperformed the benchmark prediction. All three models had an accuracy of around 66 percent, with logistic regression performing best at 66.45 percent. The most important variables overall for the models are the total win rate on different surfaces, the total win rate, and rank. Logistic Regression Random Forest XGBoost ATP tour Probability Theory and Statistics Sannolikhetsteori och statistik
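The ranking benchmark described in entry 22 can be illustrated with a minimal Python sketch. The tuple layout, the function name, and the toy matches are assumptions made for illustration and are not taken from the thesis; a lower rank number is assumed to mean a better-ranked player.

```python
def rank_benchmark_accuracy(matches):
    """Share of matches in which the better-ranked player won.

    matches: list of (rank_a, rank_b, a_won) tuples (hypothetical layout).
    A lower rank number is assumed to mean a better rank.
    """
    correct = 0
    for rank_a, rank_b, a_won in matches:
        predicted_a_wins = rank_a < rank_b  # benchmark: better rank wins
        if predicted_a_wins == a_won:
            correct += 1
    return correct / len(matches)


# Toy data for illustration only, not from the thesis.
toy_matches = [
    (1, 5, True),    # favourite wins
    (10, 3, False),  # favourite (player B) wins
    (7, 2, True),    # upset: worse-ranked player A wins
    (4, 20, True),   # favourite wins
]
accuracy = rank_benchmark_accuracy(toy_matches)
print(f"benchmark accuracy: {accuracy:.2%}")
```

A statistical model beats the benchmark when its accuracy on the same matches exceeds this value.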
23	ASSESSING PREDICTION CONDITIONS AND SEQUENTIAL CLASSIFICATION IN ICU SEPSIS PREDICTION Lind, Petter January 2023 Patients admitted to intensive care units (ICUs) often have a higher risk of sepsis due to weakened immune systems. Early sepsis diagnosis is crucial for timely treatment, emphasizing the need to improve the predictive capabilities of sepsis prediction models. Although machine learning models have demonstrated success in predicting sepsis onset, there is limited work on how model assessment is affected by sequential prediction rather than evaluation on one prediction per patient. This thesis assesses the effectiveness of the evaluation procedures employed by such models and explores different prediction conditions to enhance sepsis prediction. Data were collected from the MIMIC-IV data set and include variables commonly used in real ICU settings that are relevant to sepsis diagnosis. Random onset matching is used to select time points for patients with and without sepsis, and the data are analyzed using XGBoost. Evaluation metrics are calculated once per patient and compared with sequential measurements for all patients from 40 hours before sepsis onset up until onset. Results show that a model trained on data close to sepsis onset has strong predictive performance up to 25 hours before sepsis onset. In addition, different restrictive conditions on predictions are considered and evaluated. Because the test set is limited, the results should be validated further, as they could provide insights regarding interpretation in the practical implementation of similar prediction models that support healthcare professionals through timely interventions. Sepsis Prediction Sequential Prediction Conditional Predictions XGBoost Probability Theory and Statistics Sannolikhetsteori och statistik
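The difference between the two evaluation schemes in entry 23 can be sketched as follows. This is a hedged illustration only: the data layout, the use of plain accuracy as the metric, and the choice of the last time point as the single per-patient prediction are all assumptions, not details from the thesis.

```python
def accuracy(pairs):
    """Fraction of (predicted, actual) pairs that agree."""
    pairs = list(pairs)
    return sum(pred == actual for pred, actual in pairs) / len(pairs)


# Hypothetical data: patient id -> time-ordered (predicted, actual) pairs.
# 'actual' is 1 for a patient who develops sepsis, 0 otherwise (assumed labelling).
predictions = {
    "p1": [(0, 1), (0, 1), (1, 1)],  # sepsis patient, detected only at the last step
    "p2": [(0, 0), (1, 0), (0, 0)],  # non-sepsis patient with one false alarm
}

# One prediction per patient: here, the last time point (an assumed choice).
per_patient = accuracy(seq[-1] for seq in predictions.values())

# Sequential evaluation: pool every time point of every patient.
sequential = accuracy(pair for seq in predictions.values() for pair in seq)

print(per_patient, sequential)
```

In this toy case the per-patient score looks perfect while the pooled sequential score is much lower, which is the kind of gap the two schemes can expose.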
24	Predicting and classifying atrial fibrillation from ECG recordings using machine learning Bogstedt, Carl January 2023 Atrial fibrillation is one of the most common types of heart arrhythmia and can cause irregular, weak and fast atrial contractions of up to 600 beats per minute. The prevalence of atrial fibrillation increases with age, and the condition is associated with an increased risk of ischemia, as blood clots can form due to the weak contractions. During prolonged periods of atrial fibrillation, the atria can undergo a process called atrial remodelling. This causes electrophysiological and structural changes to the atria, such as increased atrial size and changes to calcium ion densities. These changes themselves promote the initiation and propagation of atrial fibrillation, which makes early detection crucial. Fortunately, atrial fibrillation can be detected on an electrocardiogram. Electrocardiograms measure the electrical activity of the heart during its cardiac cycle. This includes the initiation of the action potential, the depolarization of the atria and ventricles, and their repolarization. On the electrocardiogram recording, these are seen as peaks and valleys, where each peak and valley can be traced back to one of these events. This means that during atrial fibrillation, the weak, irregular and fast atrial contractions can all be detected and measured. The aim of this project was to develop a machine learning model that could predict the onset of atrial fibrillation and classify ongoing atrial fibrillation. This was achieved by training one multiclass classification model using XGBoost, and three binary classification models using ROSETTA, on electrocardiogram recordings of people with and without atrial fibrillation.
XGBoost is a tree boosting system that uses tree-like structures to classify data, while ROSETTA is a rule-based classification model that creates rules in an IF-THEN format to make decisions. The recordings were labelled according to three classes: no atrial fibrillation, atrial fibrillation, or preceding atrial fibrillation. The XGBoost model had a prediction accuracy of 99.3%, outperforming the three ROSETTA models and the other atrial fibrillation classification and prediction models found. The ROSETTA models had high accuracies on the learning set; however, their predictions were subpar, indicating faulty settings for this type of data. The results of this project indicate that the models created can be used to accurately classify ongoing atrial fibrillation and predict its onset, serving as a tool for early detection and verification of diagnosis. Bioinformatics Machine Learning Electrocardiogram Classification Rough Sets XGBoost Atrial Fibrillation Bioinformatics (Computational Biology) Bioinformatik (beräkningsbiologi)
25	Recommendation System for Insurance Policies: An Investigation of Unsupervised and Supervised Learning Techniques Palmgren, Andreas January 2023 Recommendation systems have significantly influenced user experiences across various industries, yet their application in the insurance sector remains relatively unexplored. This thesis focuses on developing a car insurance recommendation system that implements a 'consumers like you' feature. The study initially employs a clustering-based recommendation system because labels are missing in an offline environment. However, challenges emerge, such as determining the optimal number of clusters and managing complex data. Additionally, the inability to update effectively based on feedback and the lower predictive performance compared to supervised methods necessitated exploring supervised alternatives. In response, this thesis proposes a methodology in which the unsupervised approach simulates consumer behavior in an offline environment. Supervised alternatives are pre-trained on the clustering-based system to replicate it and can then be fine-tuned based on live traffic. Three supervised alternatives (KNN, XGBoost, and a neural network) are developed and compared. Given the ability of supervised recommendation systems to adapt based on feedback, supervised methods can provide more accurate, personalized recommendations in the insurance domain. The XGBoost and neural network-based recommendation systems were able to replicate the unsupervised approach, and their expressive power makes them valid candidate models for further evaluation on live traffic. The thesis concludes with the potential to both improve these recommendation systems and adapt them to other insurance types, marking a significant step toward more personalized, user-friendly insurance services. Recommendation System Car Insurance Machine Learning Cluster Analysis KNN XGBoost Neural Network Mathematics Matematik
26	Improving House Price Prediction Models: Exploring the Impact of Macroeconomic Features Holmqvist, Martin, Hansson, Max January 2023 This thesis investigates whether house price prediction models perform better when macroeconomic features are added to a data set with only house-specific features. Previous research has shown that tree-based models perform well when predicting house prices, especially the algorithms random forest and XGBoost. It is common to rely entirely on house-specific features when training these models. However, studies show that macroeconomic variables such as interest rate, inflation, and GDP affect house prices. Therefore, it makes sense to include them in these models and study whether they outperform the more traditional models with only house-specific features. The thesis also investigates which algorithm, random forest or XGBoost, is better at predicting house prices. The results show that the mean absolute error is lower for the XGBoost and random forest models trained on data with macroeconomic features. Furthermore, XGBoost outperformed random forest regardless of the set of features. In conclusion, the suggestion is to include macroeconomic features and use the XGBoost algorithm when predicting house prices. Machine learning random forest xgboost macroeconomic features house prices Probability Theory and Statistics Sannolikhetsteori och statistik
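Entry 26 compares models by mean absolute error (MAE). A minimal Python sketch of that comparison is given below; the prices and predictions are invented toy numbers, and the variable names are hypothetical rather than taken from the thesis.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy numbers for illustration only (e.g. prices in thousands).
actual_prices = [200, 300, 400]
pred_house_only = [220, 270, 430]  # model trained on house-specific features only
pred_with_macro = [210, 290, 410]  # model trained with macroeconomic features added

mae_house_only = mean_absolute_error(actual_prices, pred_house_only)
mae_with_macro = mean_absolute_error(actual_prices, pred_with_macro)

print(f"MAE, house-specific only: {mae_house_only:.2f}")
print(f"MAE, with macro features: {mae_with_macro:.2f}")
```

A lower MAE for the second model is the pattern the abstract reports; the toy values here only show how the metric is computed and compared.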
27	Application of Machine Learning to Financial Trading Horemuz, Michal January 2018 Machine learning methods have become powerful tools used in multiple industries. They have been successfully applied to problems such as image recognition, speech recognition and machine translation, among others. In this report, we investigated several machine learning methods for forecasting five different bond indexes. We implemented and analyzed Feedforward Neural Nets, LSTMs, Q-Networks and Gradient Boosted Trees, and compared them to the Buy&Hold strategy. We performed manual feature extraction based on some popular features used in the industry. The features were extracted from several financial instruments and were used as predictor variables. The results showed that XGBoost and Feedforward Neural Networks were consistently able to beat the Buy&Hold strategy for three of the five bond indexes. Machine Learning Financial Trading XGBoost Bond index Michal Horemuz Computer Sciences Datavetenskap (datalogi)
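What it means for a model to beat Buy&Hold, as in entry 27, can be sketched in Python as below. The abstract does not state how the comparison was computed, so the compounding of simple returns, the 0/1 position signal, and all numbers are assumptions made for illustration.

```python
def cumulative_return(period_returns):
    """Compound a sequence of simple per-period returns into a total return."""
    growth = 1.0
    for r in period_returns:
        growth *= 1.0 + r
    return growth - 1.0


def strategy_returns(index_returns, positions):
    """Per-period returns when holding the index (position 1) or staying out (position 0)."""
    return [pos * r for pos, r in zip(positions, index_returns)]


# Toy data for illustration only.
index_returns = [0.01, -0.02, 0.03]  # hypothetical bond index returns per period
model_positions = [1, 0, 1]          # hypothetical model signal: invested or not

buy_and_hold = cumulative_return(index_returns)
model_strategy = cumulative_return(strategy_returns(index_returns, model_positions))
beats_benchmark = model_strategy > buy_and_hold

print(f"Buy&Hold: {buy_and_hold:.4%}, model: {model_strategy:.4%}")
```

Buy&Hold corresponds to a position of 1 in every period, so a model only beats it by staying out of periods with negative returns more often than it misses positive ones.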
28	Automation of price prediction using machine learning in a large furniture company Ghorbanali, Mojtaba January 2022 Accurate prediction of product prices can be highly beneficial for procurers, both business-wise and production-wise. Many companies today, in various fields of operation and of various sizes, have access to vast amounts of data from which valuable information can be extracted. In this master thesis, some large databases of products in different categories have been analyzed. Because of confidentiality, the database labels used in this thesis are replaced by general titles and the real titles are not mentioned. Also, the company is not referred to by name, but all work was carried out on the real data set of products. As a real-world data set, the data was messy and full of nulls and missing values, so the data wrangling took additional time. The approaches used for the model were regression methods and gradient boosting models. The main purpose of this master thesis was to build price prediction models based on the features of each item to assist with the initial positioning of the product and its initial price. The best result achieved during this master thesis came from the XGBoost machine learning model, with about 96% accuracy, which can help the producer accelerate their pricing strategies. Price Prediction Machine learning Regression analysis Gradient boosting algorithms LightGBM XGBoost Computer Sciences Datavetenskap (datalogi)
29	Using Machine Learning as a Tool to Improve Train Wheel Overhaul Efficiency Gert, Oskar January 2020 This thesis develops a method for using machine learning in an industrial process. The implementation of this machine learning model aimed to reduce costs and increase efficiency of train wheel overhaul in partnership with the Austrian Federal Railroads, Oebb. Different machine learning models as well as category encodings were tested to find which performed best on the data set. In addition, differently sized training sets were used to determine whether the size of the training set affected the results. The implementation shows that Oebb can save money and increase efficiency of train wheel overhaul by using machine learning and that continuous training of prediction models is necessary because of variations in the data set. Machine Learning Industry 4.0 Hyperparameter tuning XGBoost Media and Communication Technology Medieteknik
30	Physics-guided Machine Learning Approaches for Applications in Geothermal Energy Prediction Shahdi, Arya 03 June 2021 In the area of geothermal energy mapping, scientists have used physics-based models and bottom-hole temperature measurements from oil and gas wells to generate heat flow and temperature-at-depth maps. Given the uncertainties and simplifying assumptions associated with the current state of physics-based models used in this field, this thesis explores an alternate approach for locating geothermally active regions using machine learning methods coupled with physics knowledge of geothermal energy problems, in the emerging field of physics-guided machine learning. There are two primary contributions of this thesis. First, we present a thorough analysis of using state-of-the-art machine learning models to predict a subsurface geothermal parameter, temperature-at-depth, using a rich geo-spatial dataset across the Appalachian Basin. Specifically, we explore a suite of machine learning algorithms such as neural networks (DNN), Ridge regression (R-reg) models, and decision-tree-based models (e.g., XGBoost and Random Forest). We found that XGBoost and Random Forests result in the highest accuracy for subsurface temperature prediction. We also ran our model on a fine spatial grid to provide 2D continuous temperature maps at three different depths using the XGBoost model, which can be used to locate prospective geothermally active regions. Second, we develop a physics-guided machine learning model for predicting subsurface temperatures that uses not only surface temperature, thermal conductivity coefficient, and depth as input parameters, but also the heat-flux parameter, which is known to be a potent indicator of temperature-at-depth values according to physics knowledge of geothermal energy problems.
Since there is no independent, easy-to-use method for observing heat-flux directly or inferring it from other observed variables, we develop an innovative approach that takes heat-flux parameters into account through a physics-guided clustering-regression model. Specifically, the bottom-hole temperature data is initially clustered into multiple groups based on the heat-flux parameter using a Gaussian mixture model (GMM). This is followed by training neural network regression models using the data within each constant heat-flux region. Finally, a KNN classifier is trained for cluster membership prediction. Our preliminary results indicate that our proposed approach results in lower errors as the number of clusters increases because the heat-flux parameter is indirectly accounted for in the machine learning model. / Master of Science / Machine learning and artificial intelligence have transformed many research fields and industries. In this thesis, we investigate the applicability of machine learning and data-driven approaches in the field of geothermal energy exploration. Given the uncertainties and simplifying assumptions associated with the current state of physics-based models, we show that machine learning can provide viable alternative solutions for geothermal energy mapping. First, we explore a suite of machine learning algorithms such as neural networks (DNN), Ridge regression (R-reg) models, and decision-tree-based models (e.g., XGBoost and Random Forest). We find that XGBoost and Random Forests result in the highest accuracy for subsurface temperature prediction. Accuracy measures show that machine learning models are on par with physics-based models and can even outperform the thermal conductivity model. Second, we combine thermal conductivity theory with machine learning and propose an innovative clustering-regression approach in the emerging area of physics-guided machine learning that results in a smaller error than black-box machine learning methods. Renewable Energy Geothermal Energy Machine learning XGBoost Subsurface temperature geothermal gradient

Search results