Global ETD Search

881	Credit Scoring using Machine Learning Approaches Chitambira, Bornvalue January 2022 (has links) This project explores machine learning approaches that are used in credit scoring. In this study we consider consumer credit scoring rather than corporate credit scoring. We focus on methods that are currently used in practice by banks, such as logistic regression and decision trees, and compare their performance against machine learning approaches such as support vector machines (SVM), neural networks and random forests. In our models we address important issues such as dataset imbalance, model overfitting and calibration of model probabilities. The six machine learning methods we study are support vector machine, logistic regression, k-nearest neighbour, artificial neural networks, decision trees and random forests. We implement these models in Python and analyse their performance on a credit dataset with 30000 observations from Taiwan, extracted from the University of California Irvine (UCI) machine learning repository. Credit Scoring Logistic Regression Decision Trees Artificial Neural Networks Random forests Support Vector Machine k-nearest neighbour cross validation imbalanced dataset Mathematics Matematik
882	Machine learning and statistical analysis in fuel consumption prediction for heavy vehicles / Maskininlärning och statistisk analys för prediktion av bränsleförbrukning i tunga fordon Almér, Henrik January 2015 (has links) I investigate how to use machine learning to predict fuel consumption in heavy vehicles. I examine data from several different sources describing road, vehicle, driver and weather characteristics and I find a regression to a fuel consumption measured in liters per distance. The thesis is done for Scania and uses data sources available to Scania. I evaluate which machine learning methods are most successful, how data collection frequency affects the prediction and which features are most influential for fuel consumption. I find that a lower collection frequency of 10 minutes is preferable to a higher collection frequency of 1 minute. I also find that the evaluated models are comparable in their performance and that the most important features for fuel consumption are related to the road slope, vehicle speed and vehicle weight. / Jag undersöker hur maskininlärning kan användas för att förutsäga bränsleförbrukning i tunga fordon. Jag undersöker data från flera olika källor som beskriver väg-, fordons-, förar- och väderkaraktäristiker. Det insamlade datat används för att hitta en regression till en bränsleförbrukning mätt i liter per sträcka. Studien utförs på uppdrag av Scania och jag använder mig av datakällor som är tillgängliga för Scania. Jag utvärderar vilka maskininlärningsmetoder som är bäst lämpade för problemet, hur insamlingsfrekvensen påverkar resultatet av förutsägelsen samt vilka attribut i datat som är mest inflytelserika för bränsleförbrukning. Jag finner att en lägre insamlingsfrekvens av 10 minuter är att föredra framför en högre frekvens av 1 minut. Jag finner även att de utvärderade modellerna ger likvärdiga resultat samt att de viktigaste attributen har att göra med vägens lutning, fordonets hastighet och fordonets vikt. 
machine learning statistical analysis data science fuel consumption prediction support vector regression artificial neural networks random forest linear regression Computer Sciences Datavetenskap (datalogi)
883	Evaluating supervised machine learning algorithms to predict recreational fishing success : A multiple species, multiple algorithms approach / Utvärdering av övervakade maskininlärningsalgoritmer för att förutsäga framgång inom sportfiske Wikström, Johan January 2015 (has links) This report examines three different machine learning algorithms and their effectiveness for predicting recreational fishing success. Recreational fishing is a huge pastime but reliable methods of predicting fishing success have largely been missing. This report compares random forest, linear regression and multilayer perceptron to a reasonable baseline model for predicting fishing success. Fishing success is defined as the expected weight of the fish caught. Previous reports have mainly focused on commercial fishing or limited the research to examining the impact of a single variable. In this exploratory study, multiple attributes and multiple algorithms are examined to determine if supervised machine learning is a viable tool to predict recreational fishing success. Recreational fishing success can potentially be predicted by a large number of attributes, which may be different for different species. In this report, data is fetched from multiple sources and combined into a unified data format. The primary source of data is a database from the fishing app FishBrain, containing data of over 250000 logged catches. Another is the World Weather Online API which supplies weather data. The report focuses on the four most common species in the database, largemouth bass, Micropterus salmoides, northern pike, Esox lucius, rainbow trout, Oncorhynchus mykiss and European perch, Perca fluviatilis with a focus on largemouth bass since it has the most data available. Algorithms are evaluated using the Weka data mining software. Hyperparameters are found using cross-validation and some data is used as a test set to validate the results after cross-validation. 
Results are measured as the error compared to a baseline algorithm. Random forest is the most effective algorithm in the experiments, reducing error compared to the baseline for all the examined fish species. It is also found that no single variable affects the chosen metric of fishing success much, but rather a combination of most of the examined variables is needed to give optimal predictions. In conclusion, the random forest algorithm can be used to predict fishing success across multiple species. It performs significantly better than linear regression, multilayer perceptron and the baseline on cross-validation and on the testing set. / I denna rapport evalueras tre olika maskininlärningsalgoritmer och deras effektivitet för att förutsäga framgång inom sportfiske. Sportfiske är en mycket populär hobby, men pålitliga metoder att förutsäga framgångsrikt sportfiske saknas. Denna rapport jämför random forest, linjär regression och flerlagers neurala nätverk mot en rimlig baselinealgoritm för att förutsäga framgång inom sportfiske. Framgång definieras som fiskens förväntade vikt i kg. Tidigare undersökningar har huvudsakligen fokuserat på kommersiellt fiske eller begränsat undersökningen till påverkan av en enskild variabel. I denna studie undersöks flera attribut och algoritmer för att avgöra om övervakad maskininlärning är ett användbart verktyg för att förutsäga framgång inom sportfiske. Framgång inom sportfiske kan potentiellt påverkas av ett stort antal attribut som kan vara olika för olika arter. I denna studie hämtas data från ett flertal källor som kombineras i ett unifierat dataformat. Den primära datakällan är en databas tillhörande sportfiskeappen FishBrain som innehåller över 250000 loggade fångster. En annan källa är World Weather Online:s API som bidrar med väderdata. 
Rapporten fokuserar på de fyra vanligaste arterna i databasen, largemouth bass, Micropterus salmoides, gädda, Esox lucius, regnbågsöring, Oncorhynchus mykiss och europeisk abborre, Perca fluviatilis med ett särskilt fokus på largemouth bass eftersom den har mest data tillgängligt. Algoritmerna evalueras med hjälp av data mining-verktyget Weka. Hyperparametrar bestäms med hjälp av korsvalidering och en delmängd av datan separeras och används för att validera resultaten efter korsvalidering. Resultaten mäts relativt en baseline-algoritm. Random forest är den mest effektiva algoritmen i experimenten och reducerar felet jämfört med baseline-algoritmen för alla undersökta fiskarter. Inget enskilt attribut påverkar slutresultatet mycket utan det behövs en kombination av flera attribut för att ge optimala prediktioner. Slutsatsen blir att random forest kan användas för att förutsäga framgång inom sportfiske för flera olika fiskarter. Den presterar signifikant bättre än linjär regression, flerlagers neuralt nätverk och baselinealgoritmen på korsvalidering och på testdelmängden. sport fishing recreational fishing fishing supervised machine learning random forest linear regression artificial neural networks sportfiske fiske Computer Sciences Datavetenskap (datalogi)
884	Essential Reservoir Computing Griffith, Aaron January 2021 (has links) No description available. Physics reservoir computing dynamical systems chaotic systems time-series prediction machine learning recurrent neural networks artificial neural networks echo state networks vector autoregression
885	Using deep learning time series forecasting to predict dropout in childhood obesity treatment / Förutsägelse av bortfall i ett behandlingsprogram för barnfetma med hjälp av djupinlärda tidsserieförutsägelser Schoerner, Jacob January 2021 (has links) The author investigates the performance of a time series based approach in predicting the risk of patients abandoning treatment in a treatment program for childhood obesity. The time series based approach is compared and contrasted to an approach based on static features (which has been applied in similar problems). Four machine learning models are constructed: one ‘main model’ using both time series forecasting and static features, and three ‘reference models’ created by removing or exchanging parts of the main model to test the performance of using only time series forecasting or only static features in the prediction. The main model achieves an ROC-AUC of 0.77 on the data set. ANOVA testing is used to determine whether the four models perform differently. A difference cannot be verified at the significance level of 0.05, and thus, the author concludes that the project cannot show either an advantage or a disadvantage to employing a time series based approach over static features in this problem. / Författaren jämför modeller baserade på tidsserieförutsägelser med modeller baserade på statiska, fasta värden, till syfte att identifiera patienter som riskerar att lämna ett behandlingsprogram för barnfetma. Fyra maskininlärningsmodeller konstrueras, en ‘Huvudmodell’ som använder sig av både tidsserieförutsägelser och statiska värden, och tre modeller som bryter ut delar av huvudmodellen för undersöka beteendet i modeller baserade enbart på statiska värden respektive enbart baserade på tidsserieförutsägelser. Huvudmodellen uppnår ROC-AUC 0.77 på datasetet. ANOVA (variansanalys) används för att avgöra huruvida de fyra modellernas resultat skiljer sig, och en skillnad kan ej verifieras vid P = 0.05. 
Följaktligen drar författaren slutsatsen att projektet inte har kunnat visa vare sig en signifikant fördel eller nackdel med att använda sig av tidsserieförutsägelser inom den aktuella problemdomänen. electronic health record time-series dropout prediction artificial neural networks gated recurrent unit electronic health record tidsserieförutsägelser deltagandebortfall artificiella neurala nätverk gated recurrent unit Computer Sciences Datavetenskap (datalogi)
886	Estimation neuronale de l'information mutuelle. Belghazi, Mohamed 09 1900 (has links) Nous argumentons que l'estimation de l'information mutuelle entre des ensembles de variables aléatoires continues de hautes dimensionnalités peut être réalisée par descente de gradient sur des réseaux de neurones. Nous présentons un estimateur neuronal de l'information mutuelle (MINE) dont la complexité croît linéairement avec la dimensionnalité des variables et la taille de l'échantillon, entrainable par retro-propagation, et fortement consistant au sens statistique. Nous présentons aussi une poignée d'applications où MINE peut être utilisé pour minimiser ou maximiser l'information mutuelle. Nous appliquons MINE pour améliorer les modèles génératifs adversariaux. Nous utilisons aussi MINE pour implémenter la méthode du goulot d'étranglement de l'information dans un cadre de classification supervisé. Nos résultats montrent un gain substantiel en flexibilité et performance. / We argue that the estimation of mutual information between high dimensional continuous random variables can be achieved by gradient descent over neural networks. We present a Mutual Information Neural Estimator (MINE) that is linearly scalable in dimensionality as well as in sample size, trainable through back-prop, and strongly consistent. We present a handful of applications in which MINE can be used to minimize or maximize mutual information. We apply MINE to improve adversarially trained generative models. We also use MINE to implement the Information Bottleneck, applying it to supervised classification; our results demonstrate substantial improvement in flexibility and performance in these settings. Réseau de neurones artificiels Artificial neural networks Théorie de l'information Information theory Modèle génératif Generative model
887	Modelamiento hidrológico de caudales medios mensuales en cuencas sin información hidrométrica aplicando el método Lutz Scholz y las redes neuronales artificiales, en la microcuenca Huajuiri - Oropesa - Antabamba - Apurímac / Hydrological modeling of monthly average flows in basins without hydrometric information applying Lutz Scholz method and artificial neural networks, in Huajuiri micro basin - Oropesa - Antabamba - Apurímac Zárate Torres, Cynthia 30 December 2020 (has links) El problema central planteado en la presente tesis de investigación es la falta de información hidrométrica en algunas cuencas del Perú, en este caso en la microcuenca Huajuiri, localizado en el distrito de Oropesa, provincia de Antabamba, departamento de Apurímac; puesto que al existir un déficit de estaciones hidrométricas respecto a la cantidad de cuencas existentes a nivel nacional, la información hidrométrica es insuficiente y deficiente, lo que trae como consecuencia no contar con datos como los caudales medios mensuales, información hidrológica relevante para poder conocer la disponibilidad del agua en la cuenca para efectuar la distribución del recurso hídrico de acuerdo al requerimiento de las comunidades campesinas aledañas a la fuente de agua, es útil para diseñar futuras obras hidráulicas, así como para realizar proyecciones respecto al comportamiento hídrico de la cuenca. Es por esta razón que se plantea el cálculo de los caudales medios mensuales en cuencas que no disponen de datos hidrométricos, utilizando el método Lutz Scholz y las redes neuronales artificiales. En la cuenca con información hidrométrica (La Angostura) con los modelamientos hidrológicos planteados, se obtuvieron valores cercanos a los caudales medios mensuales medidos. 
Sin embargo, en la cuenca sin información hidrométrica (Huajuiri), la inclusión en el modelamiento hidrológico de las redes neuronales artificiales permitió obtener valores más cercanos a los aforos realizados, que solamente aplicando el método Lutz Scholz. / The central problem addressed in this thesis is the lack of hydrometric information in some basins of Peru, in this case the Huajuiri micro basin, located in the Oropesa district, Antabamba province, Apurímac department. Because there is a deficit of hydrometric stations relative to the number of basins at the national level, hydrometric information is insufficient and deficient. As a result, data such as average monthly flows are unavailable. This hydrological information is needed to determine water availability in the basin so that water can be distributed according to the requirements of the rural communities surrounding the water source, and it is useful for designing future hydraulic works and for projecting the water behavior of the basin. For this reason, the calculation of average monthly flows in basins without hydrometric data is proposed using the Lutz Scholz method and artificial neural networks. In the basin with hydrometric information (La Angostura), the proposed hydrological models yielded values close to the measured average monthly flows. However, in the basin without hydrometric information (Huajuiri), including artificial neural networks in the hydrological modeling yielded values closer to the flow measurements than applying the Lutz Scholz method alone. / Tesis Redes Neuronales Artificiales Información hidrométrica Cuenca Método Lutz Scholz Artificial neural networks Hydrometric information
888	On Teaching Quality Improvement of a Mathematical Topic Using Artificial Neural Networks Modeling (With a Case Study) Mustafa, Hassan M., Al-Hamadi, Ayoub 07 May 2012 (has links) This paper is inspired by simulation using Artificial Neural Networks (ANNs), recently applied to evaluate phonics methodology for teaching children "how to read". It presents a novel approach to teaching a mathematical topic using a computer aided learning (CAL) package applied in an educational setting (a children's classroom). Interesting practical results were obtained after field application of the suggested CAL package with and without an associated teacher's voice. The presented study highly recommends applying a novel teaching trend based on behaviorism and individuals' learning styles, in order to improve the quality of children's mathematical learning performance. info:eu-repo/classification/ddc/510 ddc:510
889	The application of Eulerian laser Doppler vibrometry to the on-line condition monitoring of axial-flow turbomachinery blades Oberholster, Abraham Johannes (Abrie) 24 June 2010 (has links) The on-line condition monitoring of turbomachinery blades is of utmost importance to ensure the long term health and availability of such machines and as such has been an area of study since the late 1960s. As a result a number of on-line blade vibration measurement techniques are available, each with its own associated advantages and shortcomings. In general, on-blade sensor measurement techniques suffer from sensor lifespan, whereas non-contact techniques usually have measurement bandwidth limitations. One non-contact measurement technique that yields improvements in the area of measurement bandwidth is laser Doppler vibrometry. This thesis presents results and findings from utilizing laser Doppler vibrometry in an Eulerian fashion (i.e. a fixed reference frame) to measure on-line blade vibrations in axial-flow turbomachinery. With this measurement approach, the laser beam is focussed at a fixed point in space and measurements are available for the periods during which each blade sweeps through the beam. The characteristics of the measurement technique are studied analytically with an Euler-Bernoulli cantilever beam and experimental verification is performed. An approach for the numerical simulation of the measurement technique is then presented. Associated with the presented measurement technique are the short periods during which each blade is exposed to the laser beam. This characteristic renders traditional frequency domain signal processing techniques unsuitable for providing useful blade health indicators. To obtain frequency domain information from such short signals, it is necessary to employ non-standard signal processing techniques such as non-harmonic Fourier analysis. 
Results from experimental testing on a single-blade test rotor at a single rotor speed are presented in the form of phase angle trends obtained with non-harmonic Fourier analysis. Considering the maximum of absolute unwrapped phase angle trends around various reference frequencies, good indicators of blade health deterioration were obtained. These indicators were verified numerically. To extend the application of this condition monitoring approach, measurements were repeated on a five-blade test rotor at four different rotor speeds. Various damage cases were considered as well as different ELDV measurement positions. Using statistical parameters of the abovementioned indicators as well as time domain parameters, it is shown that with this condition monitoring approach, blade damage can successfully be identified and quantified with the aid of artificial neural networks. / Thesis (PhD)--University of Pretoria, 2010. / Mechanical and Aeronautical Engineering / unrestricted Artificial neural networks Finite element modelling Phase angle trends Non-harmonic Fourier analysis Condition monitoring On-line blade vibration Lagrangian measurements Eulerian measurements Laser doppler vibrometry UCTD
890	Lithium-Ion Battery State of Charge Modelling based on Neural Networks Chukka, Vasu 06 April 2022 (has links) Lithium-ion (Li-ion) batteries have become a crucial factor in the recent electro-mobility trend. People's increased interest in electric vehicles (EVs) has motivated several automotive manufacturers and research organizations to develop suitable drivetrain designs involving batteries. The development of the 48V Li-ion battery in particular has been of great importance for reducing CO2 emissions and meeting emission standards. However, accurately modeling Li-ion batteries is a difficult task since multiple factors have to be considered. Conventional methods use physico-chemical models or electrical circuits to mimic the battery behavior. This thesis deals with developing a Li-ion battery model using artificial neural network (ANN) algorithms to predict the state of charge (SOC) as one of the key battery management system states. Due to the rising power of GPUs and the amount of available data, ANNs have become popular in recent years. ANNs are also applicable to different areas of battery technology. Using battery data such as the battery voltage, temperature, and current as input features, a neural network is trained that predicts battery SOC. A novel approach based on ANNs and one of the most commonly used SOC estimation methods are presented in this thesis to model the battery behavior. Furthermore, an approach for dealing with the highly unbalanced data by creating multidimensional bins is introduced, and different neural network architectures for time series forecasting are compared. In creating the model, our main priority is to reduce the model's errors in extreme operating areas of the battery. According to our results, long short-term memory (LSTM) architectures appear to be the best fit for this task. 
Finally, the developed ANN model can successfully learn battery behavior; however, the model's accuracy under harsh operating conditions depends heavily on the quality of the data gathered. info:eu-repo/classification/ddc/004 ddc:004 Neuronales Netz

Search results