281

Peeking Through the Leaves : Improving Default Estimation with Machine Learning : A transparent approach using tree-based models

Hadad, Elias, Wigton, Angus January 2023 (has links)
In recent years, the development and implementation of AI and machine learning models have increased dramatically, with the availability of quality data paving the way for sophisticated AI models. Financial institutions use many models in their daily operations. They are, however, heavily regulated and need to follow the regulations set by central banks' auditing standards and the financial supervisory authorities. One of these standards is IFRS 9, which governs the disclosure of expected credit losses in banks' financial statements. Banks must measure the expected credit shortfall in line with regulations set up by the EBA and FSA. In this master thesis, we collaborate with a Swedish bank to evaluate different machine learning models for predicting defaults in an unsecured credit portfolio. The default probability is a key variable in the expected credit loss equation. The goal is not only to develop a valid model to predict these defaults but also to create and evaluate different models based on their performance and transparency. Given the regulatory challenges within AI, introducing transparency into models is part of the process. When banks use models there is a transparency requirement, which refers to how easily a model can be understood in terms of its architecture, calculations, feature importance and the logic behind its decision-making process. We have compared the commonly used logistic regression model to three machine learning models: decision tree, random forest and XGBoost, in order to show the differences in performance and transparency between the machine learning models and the industry standard. We have introduced a transparency evaluation tool called the transparency matrix to shed light on the different transparency requirements of machine learning models. The results show that all of the tree-based machine learning models are a better choice of algorithm for estimating defaults than the traditional logistic regression. This is shown in the AUC score as well as in the R2 metric. We also show that as models increase in complexity there is a performance-transparency trade-off: the more complex our models get, the better their predictions become. /
Under de senaste åren har utvecklingen och implementeringen av AI- och maskininlärningsmodeller ökat dramatiskt. Tillgången till kvalitetsdata banar vägen för sofistikerade AI-modeller. Finansiella institutioner använder många modeller i sin dagliga verksamhet. De är dock starkt reglerade och måste följa de regler som fastställs av centralbankernas revisionsstandard och finansiella tillsynsmyndigheter. En av dessa standarder är offentliggörandet av förväntade kreditförluster i bankernas finansiella rapporter, kallad IFRS 9. Banker måste mäta den förväntade kreditförlusten i linje med regler som fastställs av EBA och FSA. I denna uppsats samarbetar vi med en svensk bank för att utvärdera olika maskininlärningsmodeller för att förutsäga fallissemang i en blankokreditsportfölj. Sannolikheten för fallissemang är en viktig variabel i ekvationen för förväntade kreditförluster. Målet är inte bara att utveckla en bra modell för att prediktera fallissemang, utan också att skapa och utvärdera olika modeller baserat på deras prestanda och transparens. Med de utmaningar som finns inom AI är behovet av att införa transparens i modeller en del av processen. När banker använder modeller finns det krav på transparens som hänvisar till hur enkelt en modell kan förstås med sin arkitektur, beräkningar, variabelpåverkan och logik bakom beslutsprocessen. Vi har jämfört den vanligt använda modellen logistisk regression med tre maskininlärningsmodeller: Decision trees, Random forest och XGBoost. Vi vill visa skillnaderna i prestanda och transparens mellan maskininlärningsmodeller och branschstandarden. Vi har introducerat ett verktyg för transparensutvärdering som kallas transparensmatris för att belysa de olika transparenskraven för maskininlärningsmodeller. Resultaten visar att alla trädbaserade maskininlärningsmodeller är ett bättre val av modell vid prediktion av fallissemang jämfört med den traditionella logistiska regressionen. Detta visas i AUC-score samt R2-värdet. Vi visar också att när modeller blir mer komplexa uppstår en kompromiss mellan prestanda och transparens; ju mer komplexa våra modeller blir, desto bättre blir deras prediktioner.
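As a rough illustration of the comparison described in this abstract, the sketch below fits a logistic regression and the three tree-based models on a synthetic, imbalanced stand-in for an unsecured credit portfolio and reports AUC. It assumes scikit-learn and the xgboost package are available; the bank's data, features, and the R2 evaluation are not reproduced here.

```python
# Minimal sketch, assuming scikit-learn and xgboost; the data is a synthetic
# stand-in for the (non-public) unsecured credit portfolio.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

# Imbalanced binary target: roughly 10% defaults.
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=5),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "xgboost": XGBClassifier(n_estimators=200, eval_metric="logloss"),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]  # estimated default probability
    print(f"{name}: AUC = {roc_auc_score(y_test, proba):.3f}")
```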
282

Neural Networks for Modeling of Electrical Parameters and Losses in Electric Vehicle

Fujimoto, Yo January 2023 (has links)
Permanent magnet synchronous machines have various advantages and have shown the most superior performance for electric vehicles. However, modeling them is difficult because of their nonlinearity. In order to deal with this complexity, an artificial neural network and machine learning models, including k-nearest neighbors, decision tree, random forest, and multiple linear regression with a quadratic model, are developed to predict electrical parameters and losses as new prediction approaches for the performance of Volvo Cars' electric vehicles, and their performance is evaluated. The test operation data of the Volvo Cars Corporation was used to extract and calculate the input and output data for each prediction model. In order to smooth the effects of each input variable, the input data was normalized. In addition, correlation matrices of the normalized inputs were produced, which showed a high correlation between rotor temperature and winding resistance in the electrical parameter prediction dataset. They also demonstrated a strong correlation between the winding temperature and the rotor temperature in the loss prediction dataset. Grid search with 5-fold cross-validation was implemented to optimize the hyperparameters of the artificial neural network and machine learning models. The artificial neural network models performed the best in MSE and R-squared scores for all the electrical parameter and loss predictions. The results indicate that artificial neural networks are more successful at handling complicated nonlinear relationships, like those seen in electrical systems, than other machine learning algorithms. Compared to the other machine learning algorithms, such as decision trees, k-nearest neighbors, and multiple linear regression with a quadratic model, random forest produced superior results. With the exception of q-axis voltage, the decision tree model outperformed the k-nearest neighbors model in terms of parameter prediction, as measured by MSE and R-squared score. Multiple linear regression with a quadratic model produced the worst results for the electrical parameter prediction because the relationship between the input and output was too complex for a multiple quadratic equation to capture. Random forest models performed better than decision tree models because random forest ensembles hundreds of decision tree subsets and averages their results. The k-nearest neighbors model performed worse than the decision tree for almost all electrical parameter predictions because it simply chooses the closest points and uses their average as the predicted output, which makes forecasting complex nonlinear relationships challenging. However, it is helpful for handling simple relationships and for understanding relationships in data. In terms of loss prediction, the k-nearest neighbors and decision tree produced similar results in MSE and R-squared score for the electric machine loss and the inverter loss. Their predictions of these losses were worse than those of the multiple linear regression with a quadratic model, but they performed better than it when forecasting the power difference between electromagnetic power and mechanical power.
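A hedged sketch of the tuning step described above: grid search with 5-fold cross-validation over a small neural network regressor, scored afterwards with MSE and R-squared. Synthetic data replaces the Volvo test operation data, scikit-learn's MLPRegressor stands in for the thesis's artificial neural network, and the layer sizes and regularization values are illustrative only.

```python
# Minimal sketch, assuming scikit-learn; data and hyperparameter grid are placeholders.
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error, r2_score

X, y = make_regression(n_samples=2000, n_features=8, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Normalization plus a small neural network, tuned by 5-fold grid search.
pipe = make_pipeline(StandardScaler(), MLPRegressor(max_iter=2000, random_state=0))
param_grid = {
    "mlpregressor__hidden_layer_sizes": [(32,), (64, 32)],
    "mlpregressor__alpha": [1e-4, 1e-3],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X_train, y_train)

y_pred = search.predict(X_test)
print("best params:", search.best_params_)
print("MSE:", mean_squared_error(y_test, y_pred), "R2:", r2_score(y_test, y_pred))
```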
283

Building Predictive Models for Stock Market Performance : En studie om maskininlärning och deras prestanda

Wennmark, Gabriel, Lindgren, Felix January 2023 (has links)
Today it is important for investors to identify which stocks will result in positive returns so that the right decisions can be made when trading on the stock market. For decades it has been an area of interest for academics, and it is still challenging due to many difficulties and problems. A large number of studies have been carried out on machine learning and stock trading, where many of the studies have produced promising results despite these challenges. The aim of this study was to develop and evaluate predictive models for identifying stocks that outperform the Swedish market index OMXSPI. The research utilized a dataset of historical stock data and applied three machine learning algorithms, Support Vector Machine, Logistic Regression and Decision Trees, to predict whether excess performance was achieved. With the help of ten-fold cross-validation and hyperparameter tuning, the result was an IT artefact that produced satisfying results. The results showed that hyperparameter tuning techniques marginally improved the metrics focused on, namely accuracy and precision. The support vector machine model achieved an accuracy of 58.52% and a precision of 57.51%. The logistic regression model achieved an accuracy of 55.75% and a precision of 54.81%. Finally, the decision tree model, which was the best performer, achieved an accuracy of 64.84% and a precision of 65.00%.
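The sketch below illustrates that evaluation setup on synthetic data: ten-fold cross-validated hyperparameter tuning of a decision tree, scored with accuracy and precision. The stock features, the OMXSPI-based labels, and the chosen parameter grid are placeholders rather than the thesis setup.

```python
# Minimal sketch, assuming scikit-learn; the binary target marks stocks that
# (hypothetically) beat a market index.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_validate
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=3000, n_features=15, random_state=1)

# Ten-fold grid search over a small, illustrative parameter grid.
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=1),
    param_grid={"max_depth": [3, 5, 8, None], "min_samples_leaf": [1, 10, 50]},
    cv=10,
    scoring="precision",
)
grid.fit(X, y)

# Re-score the tuned model with ten-fold cross-validation on both metrics.
scores = cross_validate(grid.best_estimator_, X, y, cv=10,
                        scoring=["accuracy", "precision"])
print("accuracy:", scores["test_accuracy"].mean())
print("precision:", scores["test_precision"].mean())
```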
284

Spam Analysis and Detection for User Generated Content in Online Social Networks

Tan, Enhua 23 July 2013 (has links)
No description available.
285

[en] AUTOMATED SYNTHESIS OF OPTIMAL DECISION TREES FOR SMALL COMBINATORIAL OPTIMIZATION PROBLEMS / [pt] SÍNTESE AUTOMATIZADA DE ÁRVORES DE DECISÃO ÓTIMAS PARA PEQUENOS PROBLEMAS DE OTIMIZAÇÃO COMBINATÓRIA

CLEBER OLIVEIRA DAMASCENO 24 August 2021 (has links)
[pt] A análise de complexidade clássica para problemas NP-difíceis é geralmente orientada para cenários de pior caso, considerando apenas o comportamento assintótico. No entanto, existem algoritmos práticos com execução em um tempo razoável para muitos problemas clássicos. Além disso, há evidências que apontam para algoritmos polinomiais no modelo de árvore de decisão linear para resolver esses problemas, embora não muito explorados. Neste trabalho, exploramos esses resultados teóricos anteriores. Mostramos que a solução ótima para problemas combinatórios 0-1 pode ser encontrada reduzindo esses problemas para uma Busca por Vizinho Mais Próximo sobre o conjunto de vértices de Voronoi correspondentes. Utilizamos os hiperplanos que delimitam essas regiões para gerar sistematicamente uma árvore de decisão que repetidamente divide o espaço até que possa separar todas as soluções, garantindo uma resposta ótima. Fazemos experimentos para testar os limites de tamanho para os quais podemos construir essas árvores para os casos do 0-1 knapsack, weighted minimum cut e symmetric traveling salesman. Conseguimos encontrar as árvores desses problemas com tamanhos até 10, 5 e 6, respectivamente. Obtemos também as relações de adjacência completas para os esqueletos dos politopos do knapsack e do traveling salesman até os tamanhos 10 e 7. Nossa abordagem supera consistentemente o método de enumeração e os métodos baseline para o weighted minimum cut e symmetric traveling salesman, fornecendo soluções ótimas em microssegundos. / [en] Classical complexity analysis for NP-hard problems is usually oriented to worst-case scenarios, considering only the asymptotic behavior. However, there are practical algorithms running in a reasonable time for many classic problems. Furthermore, there is evidence pointing towards polynomial algorithms in the linear decision tree model to solve these problems, although not explored much. In this work, we explore previous theoretical results. We show that the optimal solution for 0-1 combinatorial problems can be found by reducing these problems into a Nearest Neighbor Search over the set of corresponding Voronoi vertices. We use the hyperplanes delimiting these regions to systematically generate a decision tree that repeatedly splits the space until it can separate all solutions, guaranteeing an optimal answer. We run experiments to test the size limits for which we can build these trees for the cases of the 0-1 knapsack, weighted minimum cut, and symmetric traveling salesman. We manage to find the trees of these problems with sizes up to 10, 5, and 6, respectively. We also obtain the complete adjacency relations for the skeletons of the knapsack and traveling salesman polytopes up to size 10 and 7. Our approach consistently outperforms the enumeration method and the baseline methods for the weighted minimum cut and symmetric traveling salesman, providing optimal solutions within microseconds.
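A minimal sketch of the underlying idea for a tiny 0-1 knapsack instance, under stated assumptions: enumerate the feasible 0-1 solution vectors once, then answer any profit vector by comparing inner products, which is what a decision tree over separating hyperplanes of the form c·(x_i - x_j) = 0 resolves implicitly. The instance values are toy numbers, and the linear scan here stands in for the decision-tree / nearest-neighbor query developed in the thesis.

```python
# Toy 0-1 knapsack illustration; the thesis builds an explicit decision tree
# over the separating hyperplanes instead of scanning all vertices.
from itertools import product
import numpy as np

weights = np.array([3, 4, 5, 2])   # toy instance
capacity = 7

# All feasible 0-1 solution vectors (vertices) of this knapsack instance.
vertices = [np.array(x) for x in product([0, 1], repeat=len(weights))
            if np.dot(x, weights) <= capacity]

def optimal_solution(profits):
    """Return the feasible vertex maximizing profits·x (linear-scan stand-in
    for the decision-tree / nearest-neighbor query)."""
    return max(vertices, key=lambda x: float(np.dot(profits, x)))

print(optimal_solution(np.array([10, 7, 12, 1])))
```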
286

A Study of an Iterative User-Specific Human Activity Classification Approach

Fürderer, Niklas January 2019 (has links)
Applications for sensor-based human activity recognition use the latest algorithms for the detection and classification of human everyday activities, both for online and offline use cases. The insights generated by those algorithms can in a next step be used within a wide range of applications such as safety, fitness tracking, localization, personalized health advice and improved child and elderly care. In order for an algorithm to be performant, a significant amount of annotated data from a specific target audience is required. However, a satisfying data collection process is cost and labor intensive. It may also be unfeasible for specific target groups, as aging affects motion patterns and behaviors. One main challenge in this application area lies in the ability to identify relevant changes over time while being able to reuse previously annotated user data. The accurate detection of those user-specific patterns and movement behaviors therefore requires individual and adaptive classification models for human activities. The goal of this degree work is to compare the performance of several supervised classifiers when trained and tested on a new iterative user-specific human activity classification approach as described in this report. A qualitative and quantitative data collection process was applied. The tree-based classification algorithms Decision Tree, Random Forest as well as XGBoost were tested on custom datasets divided into three groups. The datasets contained labeled motion data of 21 volunteers from wrist-worn sensors. Computed across all datasets, the average performance measured in recall increased by 5.2% (using a simulated leave-one-subject-out cross evaluation) for algorithms trained via the described approach compared to a random non-iterative approach. /
Sensorbaserad aktivitetsigenkänning använder sig av de senaste algoritmerna för detektion och klassificering av mänskliga vardagliga aktiviteter, både i upp- och frånkopplat läge. De insikter som genereras av algoritmerna kan i ett nästa steg användas inom en mängd nya applikationer inom områden så som säkerhet, träningsmonitorering, platsangivelser, personifierade hälsoråd samt inom barn- och äldreomsorgen. För att en algoritm skall uppnå hög prestanda krävs en inte obetydlig mängd annoterad data, som med fördel härrör från den avsedda målgruppen. Dock är datainsamlingsprocessen kostnads- och arbetsintensiv. Den kan dessutom även vara orimlig att genomföra för vissa specifika målgrupper, då åldrandet påverkar rörelsemönster och beteenden. En av de största utmaningarna inom detta område är att hitta de relevanta förändringar som sker över tid, samtidigt som man vill återanvända tidigare annoterad data. För att kunna skapa en korrekt bild av det individuella rörelsemönstret behövs därför individuella och adaptiva klassificeringsmodeller. Målet med detta examensarbete är att jämföra flera olika övervakade klassificerares (eng. supervised classifiers) prestanda när de tränats med hjälp av en iterativ användarspecifik aktivitetsklassificeringsmetod, som beskrivs i denna rapport. En kvalitativ och kvantitativ datainsamlingsprocess tillämpades. De trädbaserade klassificeringsalgoritmerna Decision Tree, Random Forest samt XGBoost testades utifrån specifikt skapade dataset baserade på 21 volontärer, som delades in i tre grupper. Data är baserad på rörelsedata från armbandssensorer. Beräknat över samtlig data ökade den genomsnittliga sensitiviteten med 5,2 % (simulerad korsvalidering genom utelämna-en-individ) för algoritmer tränade via den beskrivna metoden jämfört med slumpvis icke-iterativ träning.
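As a rough sketch of the evaluation protocol mentioned above, the code below runs a leave-one-subject-out evaluation of a random forest activity classifier and averages recall over the held-out subjects. The motion data is synthetic, and the group labels merely mimic the 21 volunteers; the iterative retraining scheme itself is not reproduced.

```python
# Minimal sketch, assuming scikit-learn; synthetic multi-class "activity" data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

X, y = make_classification(n_samples=2100, n_features=12, n_classes=3,
                           n_informative=6, random_state=0)
groups = np.repeat(np.arange(21), 100)   # one block of samples per subject

# Each fold holds out all samples of one subject, as in leave-one-subject-out.
logo = LeaveOneGroupOut()
scores = cross_val_score(RandomForestClassifier(n_estimators=100, random_state=0),
                         X, y, groups=groups, cv=logo, scoring="recall_macro")
print("mean recall across held-out subjects:", scores.mean())
```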
287

Predicting House Prices on the Countryside using Boosted Decision Trees / Förutseende av huspriser på landsbygden genom boostade beslutsträd

Revend, War January 2020 (has links)
This thesis intends to evaluate the feasibility of supervised learning models for predicting house prices on the countryside of South Sweden. It is essential for mortgage lenders to have accurate housing valuation algorithms, and the current model offered by Booli is not accurate enough when evaluating residence prices on the countryside. Different types of boosted decision trees were implemented to address this issue, and their performance was compared to traditional machine learning methods. These different types of supervised learning models were implemented in order to find the best model with regard to relevant evaluation metrics such as root-mean-squared error (RMSE) and mean absolute percentage error (MAPE). The implemented models were ridge regression, lasso regression, random forest, AdaBoost, gradient boosting, CatBoost, XGBoost, and LightGBM. All these models were benchmarked against Booli's current housing valuation algorithms, which are based on a k-NN model. The results from this thesis indicate that the LightGBM model is the optimal one, as it had the best overall performance with respect to the chosen evaluation metrics. When comparing the LightGBM model to the benchmark, the performance was overall better: the LightGBM model had an RMSE of 0.330 compared to 0.358 for the Booli model, indicating that there is potential in using boosted decision trees to improve the predictive accuracy of residence prices on the countryside. /
Denna uppsats ämnar utvärdera genomförbarheten hos olika övervakade inlärningsmodeller för att förutse huspriser på landsbygden i Södra Sverige. Det är viktigt för bostadslångivare att ha noggranna algoritmer när de värderar bostäder; den nuvarande modellen som Booli erbjuder har dålig precision när det gäller värderingar av bostäder på landsbygden. Olika typer av boostade beslutsträd implementerades för att ta itu med denna fråga och deras prestanda jämfördes med traditionella maskininlärningsmetoder. Dessa olika typer av övervakade inlärningsmodeller implementerades för att hitta den bästa modellen med avseende på relevanta prestationsmått, t.ex. root-mean-squared error (RMSE) och mean absolute percentage error (MAPE). De övervakade inlärningsmodellerna var ridge regression, lasso regression, random forest, AdaBoost, gradient boosting, CatBoost, XGBoost och LightGBM. Samtliga algoritmers prestanda jämfördes med Boolis nuvarande bostadsvärderingsalgoritm, som är baserad på en k-NN-modell. Resultatet från denna uppsats visar att LightGBM-modellen är den optimala modellen för att värdera husen på landsbygden eftersom den hade den bästa totala prestandan med avseende på de utvalda utvärderingsmetoderna. LightGBM-modellen jämfördes med Booli-modellen, där prestandan hos LightGBM-modellen i överlag var bättre: LightGBM-modellen hade ett RMSE-värde på 0.330 jämfört med Booli-modellen som hade ett RMSE-värde på 0.358, vilket indikerar att det finns en potential att använda boostade beslutsträd för att förbättra noggrannheten i förutsägelserna av huspriser på landsbygden.
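A hedged sketch of the boosted-tree setup described above: a LightGBM regressor evaluated with RMSE and MAPE on synthetic regression data. The housing features, the Booli benchmark, and the hyperparameter tuning are not reproduced, and the lightgbm package is assumed to be installed.

```python
# Minimal sketch, assuming scikit-learn and lightgbm; data is synthetic.
import numpy as np
import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_percentage_error

X, y = make_regression(n_samples=5000, n_features=10, noise=5.0, random_state=0)
y = y - y.min() + 1.0   # keep targets positive so MAPE is well defined
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = lgb.LGBMRegressor(n_estimators=500, learning_rate=0.05)
model.fit(X_train, y_train)

pred = model.predict(X_test)
print("RMSE:", np.sqrt(mean_squared_error(y_test, pred)))
print("MAPE:", mean_absolute_percentage_error(y_test, pred))
```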
288

Estimating Per-pixel Classification Confidence of Remote Sensing Images

Jiang, Shiguo 19 December 2012 (has links)
No description available.
289

A Comparative Study of Machine Learning Algorithms

Le Fort, Eric January 2018 (has links)
The selection of the machine learning algorithm used to solve a problem is an important choice. This paper outlines research measuring three performance metrics for eight different algorithms on a prediction task involving undergraduate admissions data. The algorithms that were tested are k-nearest neighbours, decision trees, random forests, gradient tree boosting, logistic regression, naive Bayes, support vector machines, and artificial neural networks. These algorithms were compared in terms of accuracy, training time, and execution time. / Thesis / Master of Applied Science (MASc)
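A minimal sketch of how those three metrics can be collected, shown for a subset of the listed algorithms on synthetic data; the undergraduate admissions dataset and the full set of eight models are not reproduced here.

```python
# Minimal sketch, assuming scikit-learn; records fit time, predict time, accuracy.
import time
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, clf in [("kNN", KNeighborsClassifier()),
                  ("decision tree", DecisionTreeClassifier()),
                  ("logistic regression", LogisticRegression(max_iter=1000))]:
    t0 = time.perf_counter()
    clf.fit(X_train, y_train)
    t_fit = time.perf_counter() - t0

    t0 = time.perf_counter()
    y_pred = clf.predict(X_test)
    t_pred = time.perf_counter() - t0

    print(f"{name}: acc={accuracy_score(y_test, y_pred):.3f} "
          f"train={t_fit:.3f}s predict={t_pred:.3f}s")
```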
290

Grön AI : En analys av maskininlärningsalgoritmers prestanda och energiförbrukning

Berglin, Caroline, Ellström, Julia January 2024 (has links)
Trots de framsteg som gjorts inom artificiell intelligens (AI) och maskininlärning (ML), uppkommer utmaningar gällande deras miljöpåverkan. Fokuset på att skapa avancerade och träffsäkra modeller innebär ofta att omfattande beräkningsresurser krävs, vilket leder till en hög energiförbrukning. Syftet med detta arbete är att undersöka ämnet grön AI och sambandet mellan prestanda och energiförbrukning hos två ML-algoritmer. De algoritmer som undersöks är beslutsträd och stödvektormaskin (SVM), med hjälp av två dataset: Bank Marketing och MNIST. Prestandan mäts med utvärderingsmåtten noggrannhet, precision, recall och F1-poäng, medan energiförbrukningen mäts med verktyget Intel VTune Profiler. Arbetets resultat visar att en högre prestanda resulterade i en högre energiförbrukning, där SVM presterade bäst men också förbrukade mest energi i samtliga tester. Vidare visar resultatet att optimering av modellerna resulterade både i en förbättrad prestanda men också i en ökad energiförbrukning. Samma resultat kunde ses när ett större dataset användes. Arbetet anses inte bidra med resultat eller riktlinjer som går att generalisera till andra arbeten. Däremot bidrar arbetet med en förståelse och medvetenhet kring miljöaspekterna gällande AI, vilket kan användas som en grund för att undersöka ämnet vidare. Genom en ökad medvetenhet kan ett gemensamt ansvar tas för att utveckla AI-lösningar som inte bara är kraftfulla och effektiva, utan också hållbara. / Despite the advancements made in artificial intelligence (AI) and machine learning (ML), challenges regarding their environmental impact arise. The focus on creating advanced and accurate models often requires extensive computational resources, leading to a high energy consumption. The purpose of this work is to explore the topic of green AI and the relationship between performance and energy consumption of two ML algorithms. The algorithms being evaluated are decision trees and support vector machines (SVM), using two datasets: Bank Marketing and MNIST. Performance is measured using the evaluation metrics accuracy, precision, recall, and F1-score, while energy consumption is measured using the Intel VTune Profiler tool. The results show that higher performance resulted in higher energy consumption, with SVM performing the best but also consuming the most energy in all tests. Furthermore, the results show that optimizing the models resulted in both improved performance and increased energy consumption. The same results were observed when a larger dataset was used. This work is not considered to provide results or guidelines that can be generalized to other studies. However, it contributes to an understanding and awareness of the environmental aspects of AI, which can serve as a foundation for further exploration of the topic. Through increased awareness, shared responsibility can be taken to develop AI solutions that are not only powerful and efficient but also sustainable.
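The performance half of this comparison could look like the sketch below, which trains a decision tree and an SVM on scikit-learn's small digits dataset (a stand-in for MNIST and Bank Marketing) and reports accuracy, precision, recall, and F1. Energy consumption would be measured outside Python, for example by attaching Intel VTune Profiler to the running process, and is therefore not shown.

```python
# Minimal sketch, assuming scikit-learn; energy profiling happens externally.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.metrics import classification_report

X, y = load_digits(return_X_y=True)   # small MNIST-like digits data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, clf in [("decision tree", DecisionTreeClassifier(random_state=0)),
                  ("SVM", SVC(kernel="rbf", C=10, gamma="scale"))]:
    clf.fit(X_train, y_train)
    print(name)
    print(classification_report(y_test, clf.predict(X_test), digits=3))
```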
