121 |
Predicting Risk of Delays in Postal Deliveries with Neural Networks and Gradient Boosting Machines / Predicering av risk för förseningar av leveranser med neurala nätverk och gradient boosting machines
Söderholm, Matilda (January 2020)
This thesis studies a data set from Postnord, the Swedish and Danish postal service, comparing an artificial neural network (ANN) and a gradient boosting machine (GBM) for predicting delays in package deliveries. The models are evaluated on the F1-score for the important class, which represents the delayed data points that need to be identified. The GBM was already implemented by Postnord and tuned using grid search; the ANN is tuned using sequential model-based optimization with the tree-structured Parzen estimator. Furthermore, the ANN is trained using dynamic resampling to handle the imbalanced data set. Even with several measures implemented to handle the class imbalance, the ANN performs poorly when tested on unseen data, unlike the GBM. The GBM has high precision (84%) and decent recall (24%), which produces an F1-score of 0.38. The ANN has high recall (62%) but extremely low precision (5%), which gives an F1-score of 0.08, indicating that it is biased toward predicting a sample as delayed when it is in fact on time. Unlike the ANN, the GBM handles class imbalance naturally, and even with measures taken to improve the ANN's handling of class imbalance, the GBM performs better.
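As a rough illustration of how the F1-scores quoted above relate to precision and recall, the sketch below computes the harmonic mean from the rounded percentages in the abstract. The function name `f1_score` is chosen here for illustration and is not taken from the thesis. Because the inputs are rounded, the results (about 0.37 and 0.09) only approximate the reported 0.38 and 0.08.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded precision and recall as quoted in the abstract (assumed inputs).
gbm = f1_score(precision=0.84, recall=0.24)
ann = f1_score(precision=0.05, recall=0.62)

print(f"GBM F1 ~ {gbm:.2f}")
print(f"ANN F1 ~ {ann:.2f}")
```

The harmonic mean is dominated by the smaller of the two inputs, which is why the ANN's high recall cannot compensate for its very low precision.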
|
122 |
Predicting profitability of new customers using gradient boosting tree models : Evaluating the predictive capabilities of the XGBoost, LightGBM and CatBoost algorithms
Kinnander, Mathias (January 2020)
When providing credit online to customers in retail shops, the provider must perform risk assessments quickly and often based on scarce historical data. This can be achieved by automating the process with machine learning algorithms. Gradient boosting tree algorithms have proven capable in a wide range of application scenarios. However, they have yet to be applied to predicting the profitability of new customers based solely on the customers’ first purchases. This study aims to evaluate the predictive performance of the XGBoost, LightGBM, and CatBoost algorithms in this context. Recall and precision were used as the basis for assessing the models’ performance. The experiment conducted for this study shows that the models display similar capabilities while also being biased towards the majority class.
|
123 |
AdaBoost v počítačovém vidění / AdaBoost in Computer Vision
Hradiš, Michal (unknown date)
In this thesis, we present the local rank differences (LRD). These novel image features are invariant to lighting changes and are suitable for object detection in programmable hardware, such as FPGAs. The performance of AdaBoost classifiers with the LRD was tested on a face detection dataset, with results similar to those obtained with Haar-like features, which are the state of the art in real-time object detection. These results, together with the fact that the LRD are evaluated much faster in an FPGA than the Haar-like features, are very encouraging and suggest that the LRD may be a solution for future hardware object detectors. We also present a framework for experiments with boosting methods in computer vision. This framework is very flexible and, at the same time, offers high learning performance and the possibility of future parallelization. The framework is available as open source software, and we hope that it will simplify work for other researchers.
|
124 |
Investigating the Impact of Air Pollution, Meteorology, and Human Mobility on Excess Deaths during COVID-19 in Quito : A Correlation, Regression, Machine Learning, and Granger Causality Analysis
Tariq, Waleed; Naqvi, Sehrish (January 2023)
Air pollution and meteorological conditions impact COVID-19 mortality rates. This research studied Quito, Ecuador, using Granger causality tests and regression models to investigate the relationship between pollutants, meteorological variables, human mobility, and excess deaths. Results suggested that mobility, as measured by the Google Mobility Index and the Facebook Isolation Index, together with nitrogen dioxide and sulphur dioxide, significantly impacts excess deaths, while carbon monoxide and relative humidity show mixed results. Measures to reduce carbon monoxide emissions and increase humidity levels may mitigate the impact of air pollution on COVID-19 mortality rates. Further research is needed to investigate the impact of pollutants on COVID-19 transmission in other locations. Healthcare decision-makers must monitor and mitigate the impact of pollutants, promote healthy air quality policies, and encourage physical activity in safe environments. They must also consider meteorological conditions and implement measures such as increased ventilation and air conditioning to reduce exposure. Additionally, they must consider human mobility and reduce it to slow the spread of the disease. Decision-makers must monitor and track excess deaths during the pandemic to understand the impact of pollutants, meteorological conditions, and human mobility on human health. Public education is critical to raising awareness of air quality and its impact on health. Encouraging individuals to reduce their exposure to pollutants and adverse meteorological conditions can play a critical role in mitigating the impact of air pollution on respiratory health during the pandemic.
|
125 |
Models for fitting correlated non-identical bernoulli random variables with applications to an airline data problem
Perez Romo Leroux, Andres (January 2021)
Our research deals with the problem of devising models for fitting non-identical dependent Bernoulli variables and using these models to predict future Bernoulli trials. We focus on modelling and predicting random Bernoulli response variables which meet all of the following conditions:
1. Each observed as well as future response corresponds to a Bernoulli trial
2. The trials are non-identical, having possibly different probabilities of occurrence
3. The trials are mutually correlated, with an underlying complex trial cluster correlation structure. Trials within clusters may also be partitioned into groups, and this within-cluster, group-level correlation is reflected in the correlation structure.
4. The probability of occurrence and correlation structure for both observed and future trials can depend on a set of observed covariates.
A number of proposed approaches meeting some of the above conditions are present in the current literature. Our research expands on existing statistical and machine learning methods.
We propose three extensions to existing models that make use of the above conditions. Each proposed method brings specific advantages for dealing with correlated binary data. The proposed models allow for within-cluster trial grouping to be reflected in the correlation structure. We partition sets of trials into groups either explicitly estimated or implicitly inferred. Explicit groups arise from the determination of common covariates; inferred groups arise via imposing mixture models. The main motivation of our research is in modelling and further understanding the potential of introducing binary trial group-level correlations. In a number of applications, it can be beneficial to use models that allow for these types of trial groupings, both for improved predictions and better understanding of the behavior of trials.
The first model extension builds on the Multivariate Probit model. This model makes use of covariates and other information from former trials to determine explicit trial groupings and predict the occurrence of future trials. We call this the Explicit Groups model.
The second model extension uses mixtures of univariate Probit models. This model predicts the occurrence of current trials using estimators of parameters supporting mixture models for the observed trials. We call this the Inferred Groups model.
Our third method extends a gradient descent based boosting algorithm, called WL2Boost, which allows for correlation of binary outcomes. We refer to our extension of this algorithm as GWL2Boost.
Bernoulli trials are divided into observed and future trials, with all trials having associated known covariate information. We apply our methodology to the problem of predicting the set and total number of passengers who will not show up on commercial flights using covariate information and past passenger data.
The models and algorithms are evaluated with regard to their capacity to predict future Bernoulli responses. We compare the proposed models against a set of competing existing models and algorithms using available airline passenger no-show data. We show that our proposed algorithm extension GWL2Boost outperforms top existing algorithms and models that assume independence of binary outcomes in various prediction metrics. / Statistics
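To make the notion of correlated, non-identical Bernoulli trials concrete, the sketch below simulates one simple case: a probit-style latent Gaussian with a single shared cluster factor. This is an illustrative assumption, not the Explicit Groups, Inferred Groups, or GWL2Boost models from the thesis. The names `simulate_cluster` and `simulate`, the equicorrelation parameter `rho`, and the no-show probabilities 0.1 and 0.3 are all hypothetical.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for probit thresholds


def simulate_cluster(thresholds, rho, rng):
    """Draw one cluster of correlated Bernoulli trials.

    Each trial has latent z = sqrt(rho) * u + sqrt(1 - rho) * e, where u is
    shared by the cluster and e is trial-specific, both standard normal.
    Every z is then standard normal, and any two latents in the cluster
    have correlation rho. A trial occurs when z falls below its threshold.
    """
    shared = rng.gauss(0.0, 1.0)
    outcomes = []
    for threshold in thresholds:
        own = rng.gauss(0.0, 1.0)
        latent = rho ** 0.5 * shared + (1.0 - rho) ** 0.5 * own
        outcomes.append(1 if latent < threshold else 0)
    return outcomes


def simulate(probs, rho, n_clusters, seed=0):
    """Simulate n_clusters clusters with marginal probabilities probs."""
    rng = random.Random(seed)
    # Threshold Phi^{-1}(p) gives each trial marginal probability p.
    thresholds = [_STD_NORMAL.inv_cdf(p) for p in probs]
    return [simulate_cluster(thresholds, rho, rng) for _ in range(n_clusters)]


# Hypothetical example: two passengers on one booking (one cluster).
draws = simulate(probs=[0.1, 0.3], rho=0.5, n_clusters=20000)
rate_first = sum(d[0] for d in draws) / len(draws)
rate_second = sum(d[1] for d in draws) / len(draws)
rate_both = sum(d[0] * d[1] for d in draws) / len(draws)
print(rate_first, rate_second, rate_both)
```

Under independence the joint rate would be close to 0.1 × 0.3 = 0.03. With a positive `rho` it is noticeably higher, which is the kind of dependence that independence-based models ignore.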
|
126 |
Characterizing Intentional and Unintentional Drug-Drug Interactions to Improve the Pharmacokinetics of Ibrutinib and Venetoclax
Eisenmann, Eric Daniel (January 2021)
No description available.
|
127 |
A Framework for Defining, Measuring, and Predicting Service Procurement Savings
Berggren, Oliver; Matti, Zina (January 2021)
Recent technical advances have paved the way for transformations such as Industry 4.0, Supply Chain 4.0, and new ways for organizations to utilize services to meet the needs of people. In the midst of this shift, a focus has been put on service procurement to meet the demand for everything from cloud computing and information technology to software solutions that support operations or add value to the end customer. Procurement is an integral part of organizations and typically accounts for a substantial part of their costs. Analyzing savings is one of the primary ways of measuring cost reduction and performance. This paper examines how savings can be defined and measured in a unifying way, and determines whether machine learning can be used to predict service purchase costs. Semi-structured interviews were utilized to find definitions and measurements. Three decision-tree ensemble machine learning models, XGBoost, LightGBM, and CatBoost, were evaluated to study cost prediction. The results indicate that cost reduction should be seen as a financial measure and cost avoidance as a performance measure. Spend and capital binding can be controlled by a budget reallocation system and could be improved further with machine learning cost prediction. The best performing model was XGBoost with a MAPE of 14.17%, compared to the base model’s MAPE of 40.24%. This suggests that budget setting and negotiation can be aided by more accurately predicting cost through machine learning, which in turn can have a positive impact on an organization’s resource allocation and profitability. / New technological advances have given rise to transformations such as Industry 4.0, Supply Chain 4.0, and new ways for organizations to use services to meet people's needs. As a result of this change, the focus has shifted to service procurement to meet the demand for everything from cloud services and information technology to software solutions that support operations or create value for end customers. Procurement is an essential part of organizations and usually accounts for a large share of their costs. Measuring savings is one of the primary ways of driving cost reduction and performance. This work explores how savings can be defined and measured in a unifying way and investigates whether machine learning can be used to predict service purchase costs. Semi-structured interviews were held to find definitions and measures. Three machine learning models, XGBoost, LightGBM, and CatBoost, were evaluated to study cost prediction. XGBoost performed best, with a MAPE of 14.17% compared to the base model's MAPE of 40.24%. This suggests that budget setting and negotiation can be supported by machine learning through more precise cost prediction, which in turn can have a positive impact on an organization's resource allocation and profitability.
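For readers unfamiliar with the MAPE figures quoted above, the following sketch shows how mean absolute percentage error is commonly computed and how a model can be compared against a naive baseline. The cost values are invented toy numbers, and the mean-of-actuals baseline is an assumption; the abstract does not say how the thesis's base model was defined.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, since each error is divided by it.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical service purchase costs and model predictions.
actual_costs = [100.0, 200.0, 400.0]
model_predictions = [110.0, 180.0, 400.0]

# Naive baseline (assumption): always predict the mean of the actual costs.
mean_cost = sum(actual_costs) / len(actual_costs)
baseline_predictions = [mean_cost] * len(actual_costs)

model_mape = mape(actual_costs, model_predictions)
baseline_mape = mape(actual_costs, baseline_predictions)
print(f"model MAPE: {model_mape:.2f}%, baseline MAPE: {baseline_mape:.2f}%")
```

Because each error is scaled by the actual value, MAPE weights a given absolute miss more heavily on small purchases than on large ones.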
|
128 |
Evolutionary Learning of Boosted Features for Visual Inspection Automation
Zhang, Meng (01 March 2018)
Feature extraction is one of the major challenges in object recognition. Features that are extracted from one type of object cannot always be used directly for a different type of object, which limits the performance of feature extraction. An automatic feature learning algorithm could therefore be a big advantage for an object recognition algorithm. This research first introduces several improvements to a fully automatic feature construction method called Evolution COnstructed Feature (ECO-Feature). These improvements are developed to construct more robust features and make the training process more efficient than in the original version. The main weakness of the original ECO-Feature algorithm is that it is designed only for binary classification and cannot be directly applied to multi-class cases. We also observe that the recognition performance depends heavily on the size of the feature pool from which features can be selected and on the ability to select the best features. For these reasons, we have developed an enhanced evolutionary learning method for multi-class object classification to address these challenges. Our method is called Evolutionary Learning of Boosted Features (ECO-Boost). The ECO-Boost method is an efficient evolutionary learning algorithm developed to automatically construct highly discriminative image features from the training images for multi-class image classification. This unique method constructs image features that are often overlooked by humans, and is robust to minor image distortion and geometric transformations. We evaluate this algorithm with a few visual inspection datasets including specialty crops, fruits, and road surface conditions. Results from extensive experiments confirm that ECO-Boost performs comparably to other methods and achieves a good balance between accuracy and simplicity for real-time multi-class object classification applications. It is a hardware-friendly algorithm that can be optimized for hardware implementation in an FPGA for real-time embedded visual inspection applications.
|
129 |
Housing Price Prediction over Countrywide Data : A comparison of XGBoost and Random Forest regressor models
Henriksson, Erik; Werlinder, Kristopher (January 2021)
The aim of this research project is to investigate how an XGBoost regressor compares to a Random Forest regressor in terms of predictive performance on housing prices, using two data sets. The comparison considers training time, inference time, and the three evaluation metrics R2, RMSE, and MAPE. The data sets are described in detail together with background on the regressor models that are used. The method involves substantial cleaning of the two data sets, hyperparameter tuning to find optimal parameters, and 5-fold cross-validation in order to achieve good performance estimates. The finding of this research project is that XGBoost performs better on both small and large data sets. While the Random Forest model can achieve results similar to those of the XGBoost model, it needs a much longer training time, between 2 and 50 times as long, and has a longer inference time, around 40 times as long. This makes XGBoost especially superior when used on larger sets of data. / The goal of this study is to compare and investigate how an XGBoost regressor and a Random Forest regressor perform in predicting house prices. This is done with the help of two data sets. The comparison takes into account the models' training time, inference time, and the three evaluation metrics R2, RMSE, and MAPE. The data sets are described in detail together with background on the regression models. The method comprises cleaning the data sets, searching for optimal hyperparameters for the models, and 5-fold cross-validation to achieve good predictions. The result of the study is that the XGBoost regressor performs better on both small and large data sets, and that it is superior for large data sets. While the Random Forest model can achieve results similar to those of the XGBoost model, its training takes between 2 and 50 times as long and its inference time is about 40 times longer. This makes XGBoost especially superior when used on large data sets.
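The evaluation procedure described above (5-fold cross-validation scored with RMSE and R2) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline. The toy prices, the contiguous unshuffled folds, and the mean-of-training-data predictor standing in for a real regressor are all assumptions, and the helper names are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists for k contiguous folds over n items."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


# Toy "house prices" (hypothetical); a real study would fit a regressor here.
prices = [float(v) for v in range(1, 11)]
fold_rmse = []
for train_idx, test_idx in k_fold_indices(len(prices), k=5):
    # Stand-in model (assumption): predict the mean of the training prices.
    train_mean = sum(prices[i] for i in train_idx) / len(train_idx)
    held_out = [prices[i] for i in test_idx]
    fold_rmse.append(rmse(held_out, [train_mean] * len(held_out)))

print("mean RMSE over folds:", sum(fold_rmse) / len(fold_rmse))
```

Averaging the metric over folds gives a performance estimate in which every sample is used for testing exactly once. In practice the data would usually be shuffled before splitting.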
|
130 |
Trends and Limits of Two-Stage Boosting Systems for Automotive Diesel Engines
Varnier, Olivier Nicolás (26 July 2012)
Internal combustion engine development is driven by emissions reduction and improved energy efficiency. To meet upcoming standards, downsized/downspeeded engines are required to reduce fuel consumption and CO2 emissions. These techniques place high demands on the charging system and force the introduction of multistage boosting architectures. With many possible arrangements and a large number of parameters to optimize, these architectures are more complex than current systems. The objective of this thesis has thus been to investigate the potential of two-stage boosting architectures to establish, for the particular case of passenger car downsized/downspeeded Diesel engines, the most efficient solutions for achieving the forthcoming CO2 emissions targets.
To meet this objective, an exhaustive literature review of existing solutions has first been performed to identify the most promising two-stage boosting architectures. Then, a new matching methodology has been defined to optimize the architectures. It comprises, on the one hand, a new representation of turbine characteristic maps allowing straightforward matching calculations and, on the other hand, a complete 0D engine model able to predict, within a reduced computational time, the behavior of any boosting architecture in both steady-state and transient operating conditions. Finally, a large parametric study has been carried out to analyze and compare the different architectures on the same base engines, to characterize the impacts of thermo-mechanical limits and turbocharger size on engine performance, and to quantify, for different engine development options, their potential improvements in terms of fuel consumption, maximum power, and fun to drive.
As its main contributions, the thesis provides new modeling tools for efficient matching calculations and synthesizes the main trends in advanced boosting systems to guide future passenger car Diesel engine development. / Varnier, ON. (2012). Trends and Limits of Two-Stage Boosting Systems for Automotive Diesel Engines [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/16880
|