Global ETD Search

81	A Machine Learning Assessment to Predict the Sediment Transport Rate Under Oscillating Sheet Flow Conditions Vu, Huy 01 December 2019 The two-phase flow approach has been the conventional method for studying the sediment transport rate. Because sediment transport is complex, the numerical models built on that approach require initial assumptions and, as a result, may not yield accurate output for all conditions. This research work proposes that Machine Learning algorithms can be an alternative way to predict the processes of sediment transport in two dimensions under oscillating sheet flow conditions, by utilizing the available dataset of the SedFoam multidimensional two-phase model. The assessment used linear regression and gradient boosting algorithms, compared the lowest average mean squared error in each case, and searched for the best partition method based on the domain height of the simulation setup. Computer Sciences
82	Predicting misuse of subscription tranquilizers : A comparasion of regularized logistic regression, Adaptive Bossting and support vector machines Norén, Ida January 2022 Tranquilizer misuse is a behavior associated with substance use disorder. As of now there is only one published article that includes a predictive model of misuse of subscription tranquilizers. The aim of this study is to predict ongoing tranquilizer misuse while comparing three different methods of classification: (1) regularized logistic regression, (2) adaptive boosting and (3) support vector machines. Data from the 2019 National Survey of Drug Use and Health (NSDUH) is used to predict misuse among the individuals in the 2020 sample. The regularized logistic regression and the support vector machine models both yield an AUC of 0.88, which is slightly higher than that of the adaptive boosting model. However, the support vector machine model yields a higher level of sensitivity, meaning that it is better at detecting individuals who misuse. That said, the difference in performance between the methods is relatively small and is most likely explained by the fact that different methods perform differently depending on the characteristics of the data. Adaptive Boosting Benzodiazepines classification lasso regularization logistic regression support vector machines Probability Theory and Statistics
83	Bias Mitigation Techniques and a Cost-Aware Framework for Boosted Ranking Algorithms Salomon, Sophie 02 June 2020 No description available. Computer Science Artificial Intelligence
84	A Direct Algorithm for the K-Nearest-Neighbor Classifier via Local Warping of the Distance Metric Neo, TohKoon 30 November 2007 The k-nearest neighbor (k-NN) pattern classifier is a simple yet effective learner. However, it has a few drawbacks, one of which is the large model size. There are a number of algorithms that are able to condense the model size of the k-NN classifier at the expense of accuracy. Boosting is therefore desirable for increasing the accuracy of these condensed models. Unfortunately, there does not exist a boosting algorithm that works well with k-NN directly. We present a direct boosting algorithm for the k-NN classifier that creates an ensemble of models with locally modified distance weighting. An empirical study conducted on 10 standard databases from the UCI repository shows that this new Boosted k-NN algorithm has increased generalization accuracy in the majority of the datasets and never performs worse than standard k-NN. computer science machine learning k nearest neighbor knn boosting Computer Sciences
85	Automation of price prediction using machine learning in a large furniture company Ghorbanali, Mojtaba January 2022 Accurate prediction of product prices can be highly beneficial for procurers, both business-wise and production-wise. Many companies today, in various fields of operation and of various sizes, have access to vast amounts of data from which valuable information can be extracted. In this master thesis, some large databases of products in different categories have been analyzed. Because of confidentiality, the database labels that appear in this thesis are replaced by general titles and the real titles are not mentioned. Also, the company is not referred to by name, but all work is carried out on the real data set of products. As a real-world data set, the data was messy and full of nulls and missing values, so the data wrangling took additional time. The approaches used for the model were regression methods and gradient boosting models. The main purpose of this master thesis was to build price prediction models based on the features of each item to assist with the initial positioning of the product and its initial price. The best result achieved during this master thesis came from an XGBoost machine learning model with about 96% accuracy, which can help the producer accelerate their pricing strategies. Price Prediction Machine learning Regression analysis Gradient boosting algorithms LightGBM XGBoost Computer Sciences
86	Comparative Analysis of Surrogate Models for the Dissolution of Spent Nuclear Fuel Awe, Dayo 01 May 2024 This thesis presents a comparative analysis of surrogate models for the dissolution of spent nuclear fuel, with a focus on the use of deep learning techniques. The study explores the accuracy and efficiency of different machine learning methods in predicting the dissolution behavior of nuclear waste, and compares them to traditional modeling approaches. The results show that deep learning models can achieve high accuracy in predicting the dissolution rate, while also being computationally efficient. The study also discusses the potential applications of surrogate modeling in the field of nuclear waste management, including the optimization of waste disposal strategies and the design of more effective containment systems. Overall, this research highlights the importance of surrogate modeling in improving our understanding of nuclear waste behavior and developing more sustainable waste management practices. spent nuclear fuel random forest regression boosting methods surrogate model machine learning Physical Sciences and Mathematics
87	Machine Learning and Telematics for Risk Assessment in Auto Insurance Ekström, Frithiof, Chen, Anton January 2020 Pricing models for car insurance traditionally use variables related to the policyholder and the insured vehicle (e.g. car brand and driver age) to determine the premium. This can lead to situations where policyholders belonging to a group that is seen as carrying a higher risk for accidents wrongfully get a higher premium, even if the higher risk might not necessarily apply on a per-individual basis. Telematics data offers an opportunity to look at driving behavior during individual trips, enabling a pricing model that can be customized to each policyholder. While these additional variables can be used in a generalized linear model (GLM) similar to the traditional pricing models, machine learning methods can possibly unravel non-linear connections between the variables. Using telematics data, we build a gradient boosting model (GBM) and a neural network (NN) to predict the claim frequency of policyholders on a monthly basis. We find that both GBMs and NNs offer predictive power that can be generalized to data that has not been used in the training of the models. The results of the study also show that telematics data play a considerable role in the model predictions, and that the frequency and distance of trips are important factors in determining the risk using these models. Telematics for car insurance Gradient Boosting Machine Neural Network Machine Learning Computer and Information Sciences
88	Data Analytics using Regression Models for Health Insurance Market place Data Killada, Parimala January 2017 No description available. Computer Science
89	Learning to Rank Algorithms and Their Application in Machine Translation Xia, Tian January 2015 No description available. Computer Engineering learning to rank boosting regression tree algorithm analysis machine translation training
90	Statistical learning and predictive modeling in data mining Li, Bin 13 September 2006 No description available. Statistics Bayesian robustness Boosting Flat-tailed prior distribution Interpretation MART Statistical learning