Spelling suggestions: "subject:"overfitting"" "subject:"overfittings""
11 |
Seismic data processing with curvelets: a multiscale and nonlinear approachHerrmann, Felix J. January 2007 (has links)
In this abstract, we present a nonlinear curvelet-based sparsity-promoting formulation of a seismic processing flow, consisting of the following steps: seismic data regularization and the restoration of migration amplitudes. We show that the curvelet's wavefront detection capability and invariance under the migration-demigration operator lead to a formulation that is stable under noise and missing data.
|
12 |
Investigating techniques for improving accuracy and limiting overfitting for YOLO and real-time object detection on iOSGüven, Jakup January 2019 (has links)
I detta arbete genomförs utvecklingen av ett realtids objektdetekteringssystem för iOS. För detta ändamål används YOLO, en ett-stegs objektdetekterare och ett s.k. ihoplänkat neuralt nätverk vilket åstadkommer betydligt bättre prestanda än övriga realtidsdetek- terare i termer av hastighet och precision. En dörrdetekterare baserad på YOLO tränas och implementeras i en systemutvecklingsprocess. Maskininlärningsprocessen sammanfat- tas och praxis för att undvika överträning eller “overfitting” samt för att öka precision och hastighet diskuteras och appliceras. Vidare genomförs en rad experiment vilka pekar på att dataaugmentation och inkludering av negativ data i ett dataset medför ökad precision. Hyperparameteroptimisering och kunskapsöverföring pekas även ut som medel för att öka en objektdetekringsmodells prestanda. Författaren lyckas öka modellens mAP, ett sätt att mäta precision för objektdetekterare, från 63.76% till 86.73% utifrån de erfarenheter som dras av experimenten. En modells tendens för överträning utforskas även med resultat som pekar på att träning med över 300 epoker rimligen orsakar en övertränad modell. / This paper features the creation of a real time object detection system for mobile iOS using YOLO, a state-of-the-art one stage object detector and convoluted neural network far surpassing other real time object detectors in speed and accuracy. In this process an object detecting model is trained to detect doors. The machine learning process is outlined and practices to combat overfitting and increasing accuracy and speed are discussed. A series of experiments are conducted, the results of which suggests that data augmentation, including negative data in a dataset, hyperparameter optimisation and transfer learning are viable techniques in improving the performance of an object detection model. The author is able to increase mAP, a measurement of accuracy for object detectors, from 63.76% to 86.73% based on the results of experiments. The tendency for overfitting is also explored and results suggest that training beyond 300 epochs is likely to produce an overfitted model.
|
13 |
Attractors of autoencoders : Memorization in neural networks / Attractors of autoencoders : Memorization in neural networksStrandqvist, Jonas January 2020 (has links)
It is an important question in machine learning to understand how neural networks learn. This thesis sheds further light onto this by studying autoencoder neural networks which can memorize data by storing it as attractors.What this means is that an autoencoder can learn a training set and later produce parts or all of this training set even when using other inputs not belonging to this set. We seek out to illuminate the effect on how ReLU networks handle memorization when trained with different setups: with and without bias, for different widths and depths, and using two different types of training images -- from the CIFAR10 dataset and randomly generated. For this, we created controlled experiments in which we train autoencoders and compute the eigenvalues of their Jacobian matrices to discern the number of data points stored as attractors.We also manually verify and analyze these results for patterns and behavior. With this thesis we broaden the understanding of ReLU autoencoders: We find that the structure of the data has an impact on the number of attractors. For instance, we produced autoencoders where every training image became an attractor when we trained with random pictures but not with CIFAR10. Changes to depth and width on these two types of data also show different behaviour.Moreover, we observe that loss has less of an impact than expected on attractors of trained autoencoders.
|
14 |
Comprehensibility, Overfitting and Co-Evolution in Genetic Programming for Technical Trading RulesSeshadri, Mukund 30 April 2003 (has links)
This thesis presents Genetic Programming methodologies to find successful and understandable technical trading rules for financial markets. The methods when applied to the S&P500 consistently beat the buy-and-hold strategy over a 12-year period, even when considering transaction costs. Some of the methods described discover rules that beat the S&P500 with 99% significance. The work describes the use of a complexity-penalizing factor to avoid overfitting and improve comprehensibility of the rules produced by GPs. The effect of this factor on the returns for this domain area is studied and the results indicated that it increased the predictive ability of the rules. A restricted set of operators and domain knowledge were used to improve comprehensibility. In particular, arithmetic operators were eliminated and a number of technical indicators in addition to the widely used moving averages, such as trend lines and local maxima and minima were added. A new evaluation function that tests for consistency of returns in addition to total returns is introduced. Different cooperative coevolutionary genetic programming strategies for improving returns are studied and the results analyzed. We find that paired collaborator coevolution has the best results.
|
15 |
Bayesovské a neuronové sítě / Bayesian and Neural NetworksHložek, Bohuslav January 2017 (has links)
This paper introduces Bayesian neural network based on Occams razor. Basic knowledge about neural networks and Bayes rule is summarized in the first part of this paper. Principles of Occams razor and Bayesian neural network are explained. A real case of use is introduced (about predicting landslide). The second part of this paper introduces how to construct Bayesian neural network in Python. Such an application is shown. Typical behaviour of Bayesian neural networks is demonstrated using example data.
|
16 |
Lithium-Ion Battery State of Charge Modelling based on Neural NetworksChukka, Vasu 06 April 2022 (has links)
Lithium-ion (Li-ion) batteries have become a crucial factor in the recent electro-mobility trend. People's increased interest in electric vehicles (EVs) has motivated several automotive
manufacturers and research organizations to develop suitable drivetrain designs involving batteries. Especially the development of the 48V Li-ion battery has been of
great importance to reduce CO2 emissions and meet emission standards. However, accurately modeling Li-ion batteries is a difficult task since multiple factors have to be
considered. Conservative Methods are using pyhsico-chemical models or electrical circuits in order to mimic the battery behavior. This thesis deals with developing a Li-ion
battery model using artificial neural network (ANN) algorithms to predict the state of charge (SOC) as one of the key battery management system states. Due to the rising
power of GPUs and the amount of available data, ANNs became popular in recent years. ANNs are also applicable to different areas of battery technology. Using battery data
like the battery voltage, temperature, and current as input features, a neural network is trained that predicts battery SOC. A novel approach based on ANNs and one of the
most commonly used SOC estimation methods are presented in this thesis to model the battery behavior. Furthermore, an approach for dealing with the highly unbalanced data
by creating multidimensional bins and compare different neural network architectures for time series forecasting is introduced. By creating the model, our main priority is to reduce
the model's errors in extreme operating areas of the battery. According to our results, long short-term memory (LSTM) architectures appear to be the best fit for this task.
Finally, the developed ANN model can successfully learn battery behavior, however the model's accuracy under harsh operating circumstances is highly dependent on the data
quality gathered.
|
17 |
Evaluation of Convolutional Neural Network Based Classification and Feature Detection Supporting Autonomous Robotic HarvestingGreen, Allison 15 May 2023 (has links)
No description available.
|
18 |
A Comparative study of data splitting algorithms for machine learning model selectionBirba, Delwende Eliane January 2020 (has links)
Data splitting is commonly used in machine learning to split data into a train, test, or validation set. This approach allows us to find the model hyper-parameter and also estimate the generalization performance. In this research, we conducted a comparative analysis of different data partitioning algorithms on both real and simulated data. Our main objective was to address the question of how the choice of data splitting algorithm can improve the estimation of the generalization performance. Data splitting algorithms used in this study were variants of k-fold, Kennard-Stone, SPXY ( sample set partitioning based on joint x-y distance), and random sampling algorithm. Each algorithm divided the data into two subset, training/validation. The training set was used to fit the model and validation for the evaluation. We then analyzed the different data splitting algorithms based on the generalization performances estimated from the validation and the external test set. From the result, we noted that the important determinant for a good generalization is the size of the dataset. For all the data sample methods applied on small data set, the gap between the performance estimated on the validation and test set was significant. However, we noted that the gap reduced when there was more data in training or validation. Too many or few data in the training set can also lead to bad model performance. So it is importance to have a reasonable balance between the training/validation set sizes. In our study, KS and SPXY was the splitting algorithm with poor model performance estimation. Indeed these methods select the most representative samples to train the model, and poor representative samples are left for model performance estimation. / Datapartitionering används vanligtvis i maskininlärning för att dela data i en tränings, test eller valideringsuppsättning. Detta tillvägagångssätt gör det möjligt för oss att hitta hyperparametrar för modellen och även uppskatta generaliseringsprestanda. I denna forskning genomförde vi en jämförande analys av olika datapartitionsalgoritmer på både verkliga och simulerade data. Vårt huvudmål var att undersöka frågan om hur valet avdatapartitioneringsalgoritm kan förbättra uppskattningen av generaliseringsprestanda. Datapartitioneringsalgoritmer som användes i denna studie var varianter av k-faldig korsvalidering, Kennard-Stone (KS), SPXY (partitionering baserat på gemensamt x-y-avstånd) och bootstrap-algoritm. Varje algoritm användes för att dela upp data i två olika datamängder: tränings- och valideringsdata. Vi analyserade sedan de olika datapartitioneringsalgoritmerna baserat på generaliseringsprestanda uppskattade från valideringen och den externa testuppsättningen. Från resultatet noterade vi att det avgörande för en bra generalisering är storleken på data. För alla datapartitioneringsalgoritmer som använts på små datamängder var klyftan mellan prestanda uppskattad på valideringen och testuppsättningen betydande. Vi noterade emellertid att gapet minskade när det fanns mer data för träning eller validering. För mycket eller för litet data i träningsuppsättningen kan också leda till dålig prestanda. Detta belyser vikten av att ha en korrekt balans mellan storlekarna på tränings- och valideringsmängderna. I vår studie var KS och SPXY de algoritmer med sämst prestanda. Dessa metoder väljer de mest representativa instanserna för att träna modellen, och icke-representativa instanser lämnas för uppskattning av modellprestanda.
|
19 |
Deep Convolutional Neural Network for Effective Image Analysis : DESIGN AND IMPLEMENTATION OF A DEEP PIXEL-WISE SEGMENTATION ARCHITECTUREMarti, Marco Ros January 2017 (has links)
This master thesis presents the process of designing and implementing a CNN-based architecture for image recognition included in a larger project in the field of fashion recommendation with deep learning. Concretely, the presented network aims to perform localization and segmentation tasks. Therefore, an accurate analysis of the most well-known localization and segmentation networks in the state of the art has been performed. Afterwards, a multi-task network performing RoI pixel-wise segmentation has been created. This proposal solves the detected weaknesses of the pre-existing networks in the field of application, i.e. fashion recommendation. These weaknesses are basically related with the lack of a fine-grained quality of the segmentation and problems with computational efficiency. When it comes to improve the details of the segmentation, this network proposes to work pixel- wise, i.e. performing a classification task for each of the pixels of the image. Thus, the network is more suitable to detect all the details presented in the analysed images. However, a pixel-wise task requires working in pixel resolution, which implies that the number of operations to perform is usually large. To reduce the total number of operations to perform in the network and increase the computational efficiency, this pixel-wise segmentation is only done in the meaningful regions of the image (Regions of Interest), which are also computed in the network (RoI masks). Then, after a study of the more recent deep learning libraries, the network has been successfully implemented. Finally, to prove the correct operation of the design, a set of experiments have been satisfactorily conducted. In this sense, it must be noted that the evaluation of the results obtained during testing phase with respect to the most well-known architectures is out of the scope of this thesis as the experimental conditions, especially in terms of dataset, have not been suitable for doing so. Nevertheless, the proposed network is totally prepared to perform this evaluation in the future, when the required experimental conditions are available. / Denna examensarbete presenterar processen för att designa och implementera en CNN-baserad arkitektur för bildigenkänning som ingår i ett större projekt inom moderekommendation med djup inlärning. Konkret, det presenterade nätverket syftar till att utföra lokaliseringsoch segmenteringsuppgifter. Därför har en noggrann analys av de mest kända lokaliseringsoch segmenteringsnätena utförts inom den senaste tekniken. Därefter har ett multi-task-nätverk som utför RoI pixel-wise segmentering skapats. Detta förslag löser de upptäckta svagheterna hos de befintliga näten inom tillämpningsområdet, dvs modeanbefaling. Dessa svagheter är i grund och botten relaterade till bristen på en finkornad kvalitet på segmenteringen och problem med beräkningseffektivitet. När det gäller att förbättra detaljerna i segmenteringen, föreslår detta nätverk att arbeta pixelvis, dvs att utföra en klassificeringsuppgift för var och en av bildpunkterna i bilden. Nätverket är sålunda lämpligare att detektera alla detaljer som presenteras i de analyserade bilderna. En pixelvis uppgift kräver dock att man arbetar med pixelupplösning, vilket innebär att antalet operationer som ska utföras är vanligtvis stor. För att minska det totala antalet operationer som ska utföras i nätverket och öka beräkningseffektiviteten görs denna pixelvisa segmentering endast i de meningsfulla regionerna i bilden (intressanta regioner), som också beräknas i nätverket (RoI-masker) . Sedan, efter en studie av de senaste djuplärningsbiblioteken, har nätverket framgångsrikt implementerats. Slutligen, för att bevisa korrekt funktion av konstruktionen, har en uppsättning experiment genomförts på ett tillfredsställande sätt. I detta avseende måste det noteras att utvärderingen av de resultat som uppnåtts under testfasen i förhållande till de mest kända arkitekturerna ligger utanför denna avhandling, eftersom de experimentella förhållandena, särskilt vad gäller dataset, inte har varit lämpliga För att göra det. Ändå är det föreslagna nätverket helt beredd att utföra denna utvärdering i framtiden när de nödvändiga försöksvillkoren är tillgängliga. / En aquest treball de fi de màster es presenta el disseny i la implementació d’una arquitectura pel reconeixement d’imatges fent ús de CNN. Aquesta xarxa es troba inclosa en un projecte de major envergadura en el camp de la recomanació de moda. En concret, la xarxa presentada en aquest document s’encarrega de realitzar les tasques de localització i segmentació. Després d’un estudi a consciència de les xarxes més conegudes de l’estat de l’art, s’ha dissenyat una xarxa multi-tasca encarregada de realitzar una segmentació a resolució de píxel de les regions d’interès de la imatge, les quals han sigut prèviament calculades i emmascarades. Aquesta proposta soluciona les mancances detectades en les xarxes ja existents pel que fa a la tasca de recomanació de moda. Aquestes mancances es basen en la obtenció d’una segmentació sense prou nivell de detalls i en una rellevant complexitat computacional. Pel que fa a la qualitat de la segmentació, aquesta tesi proposa treballar en resolució de píxel, classificant tots els píxels de la imatge de forma individual, per tal de poder adaptar-se a tots els detalls que puguin aparèixer a la imatge analitzada. No obstant, treballar píxel a píxel implica la realització d’una gran quantitat d’operacions. Per reduir-les, proposem fer la segmentació píxel a píxel només a les regions d’interès de la imatge. A continuació, després d’un estudi detallat de les llibreries de deep learnign més destacades, el disseny ha sigut implementat. Finalment s’han dut a terme una sèrie d’experiments per provar el correcte funcionament del disseny. En aquest sentit és important destacar que aquesta tesi no té com a objectiu avaluar el disseny respecte d’altres xarxes ja existents. La raó és que les condicions d’experimentació, sobretot pel que fa a la base de dades, no són adequades per aquesta tasca. No obstant, la xarxa està perfectament preparada per fer aquesta avaluació un cop les condicions d’experimentació així ho permetin.
|
20 |
Online Non-linear Prediction of Financial Time Series Patternsda Costa, Joel 11 September 2020 (has links)
We consider a mechanistic non-linear machine learning approach to learning signals in financial time series data. A modularised and decoupled algorithm framework is established and is proven on daily sampled closing time-series data for JSE equity markets. The input patterns are based on input data vectors of data windows preprocessed into a sequence of daily, weekly and monthly or quarterly sampled feature measurement changes (log feature fluctuations). The data processing is split into a batch processed step where features are learnt using a Stacked AutoEncoder (SAE) via unsupervised learning, and then both batch and online supervised learning are carried out on Feedforward Neural Networks (FNNs) using these features. The FNN output is a point prediction of measured time-series feature fluctuations (log differenced data) in the future (ex-post). Weight initializations for these networks are implemented with restricted Boltzmann machine pretraining, and variance based initializations. The validity of the FNN backtest results are shown under a rigorous assessment of backtest overfitting using both Combinatorially Symmetrical Cross Validation and Probabilistic and Deflated Sharpe Ratios. Results are further used to develop a view on the phenomenology of financial markets and the value of complex historical data under unstable dynamics.
|
Page generated in 0.0513 seconds