  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Mis-specification tests for neural regression models : applications in business and finance

Holt, William Travis January 1999 (has links)
No description available.
2

Understanding Representations and Reducing their Redundancy in Deep Networks

Cogswell, Michael Andrew 15 March 2016 (has links)
Neural networks in their modern deep learning incarnation have achieved state-of-the-art performance on a wide variety of tasks and domains. A core intuition behind these methods is that they learn layers of features which interpolate between two domains in a series of related parts. The first part of this thesis introduces the building blocks of neural networks for computer vision. It starts with linear models, then proceeds to deep multilayer perceptrons and convolutional neural networks, presenting the core details of each; the introduction also builds intuition by visualizing concrete examples of the parts of a modern network. The second part of this thesis investigates regularization of neural networks. Methods like dropout have been proposed to favor certain (empirically better) solutions over others, yet large deep neural networks still overfit very easily. This section proposes a new regularizer called DeCov, which leads to significantly reduced overfitting (the gap between training and validation performance) and greater generalization, sometimes better than dropout and other times not. The regularizer is based on the cross-covariance of hidden representations and exploits the intuition that different features should try to represent different things, an intuition others have explored with similar losses. Experiments across a range of datasets and network architectures demonstrate reduced overfitting due to DeCov while almost always maintaining or increasing generalization performance, often improving over dropout. / Master of Science
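As an aside for readers unfamiliar with the approach, the sketch below gives a minimal NumPy illustration of a cross-covariance penalty in the spirit of DeCov: it penalizes the squared off-diagonal entries of the batch covariance of a layer's hidden activations, so redundant (correlated) features are discouraged. The exact formulation and weighting used in the thesis may differ.

```python
import numpy as np

def decov_penalty(hidden):
    """Cross-covariance penalty on a batch of hidden activations.

    hidden : array of shape (batch_size, num_features).
    Returns 0.5 * (||C||_F^2 - ||diag(C)||_2^2), where C is the batch
    covariance matrix, i.e. only the off-diagonal (cross-covariance)
    terms are penalized.
    """
    centered = hidden - hidden.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / hidden.shape[0]   # feature covariance matrix
    frob_sq = np.sum(cov ** 2)                      # all entries
    diag_sq = np.sum(np.diag(cov) ** 2)             # variances only
    return 0.5 * (frob_sq - diag_sq)

# Example: decorrelated features give a near-zero penalty,
# duplicated (redundant) features give a large one.
rng = np.random.default_rng(0)
h_independent = rng.normal(size=(128, 8))
h_redundant = np.repeat(rng.normal(size=(128, 1)), 8, axis=1)
print(decov_penalty(h_independent))   # small
print(decov_penalty(h_redundant))     # large
```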
3

Robust Margin Based Classifiers For Small Sample Data

January 2011 (has links)
abstract: In many classification problems data samples cannot be collected easily, for example in drug trials, biological experiments, and studies of cancer patients. In many situations the data set size is small and there are many outliers. When classifying such data, for example cancer versus normal patients, the consequences of misclassification are arguably more serious than for other data types, because the data point could be a cancer patient, or the classification decision could help determine which gene might be over-expressed and perhaps a cause of cancer. These misclassifications are typically more frequent in the presence of outlier data points. The aim of this thesis is to develop a maximum margin classifier that addresses the lack of robustness of discriminant-based classifiers (such as the Support Vector Machine (SVM)) to noise and outliers. The underlying notion is to adopt and develop a natural loss function that is more robust to outliers and more representative of the true loss function of the data. It is demonstrated experimentally that SVMs are indeed susceptible to outliers and that the new classifier developed here, coined Robust-SVM (RSVM), is superior to all studied classifiers on the synthetic datasets. It is superior to the SVM on both the synthetic data and experimental data from biomedical studies, and is competitive with a classifier derived along similar lines when real-life data examples are considered. / Dissertation/Thesis / Source Code for RSVM (MATLAB) / Presentation on RSVM / M.S. Computer Science 2011
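The RSVM loss itself is not given in the abstract. As an illustration of the general idea of limiting the influence of outliers on a margin classifier, the sketch below contrasts the standard hinge loss with a truncated (ramp) hinge loss, which caps the penalty any single point can contribute; the cap value and the loss actually developed in the thesis are assumptions here.

```python
import numpy as np

def hinge_loss(margins):
    """Standard SVM hinge loss: grows without bound for badly misclassified points."""
    return np.maximum(0.0, 1.0 - margins)

def ramp_loss(margins, cap=2.0):
    """Truncated (ramp) hinge loss: identical to the hinge near the margin,
    but capped at `cap`, so a single outlier cannot dominate the objective.
    (Illustrative only; the RSVM loss in the thesis may be defined differently.)"""
    return np.minimum(hinge_loss(margins), cap)

# margin = y * f(x); a large negative margin is a badly misclassified outlier
margins = np.array([2.0, 0.5, -0.5, -10.0])
print(hinge_loss(margins))   # [ 0.   0.5  1.5 11. ]
print(ramp_loss(margins))    # [ 0.   0.5  1.5  2. ]
```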
4

Understanding Deep Neural Networks and other Nonparametric Methods in Machine Learning

Yixi Xu (6668192) 02 August 2019 (has links)
It is a central problem in both statistics and computer science to understand the theoretical foundation of machine learning, especially deep learning. During the past decade, deep learning has achieved remarkable successes in solving many complex artificial intelligence tasks. The aim of this dissertation is to understand deep neural networks (DNNs) and other nonparametric methods in machine learning. In particular, three machine learning models have been studied: weight normalized DNNs, sparse DNNs, and the compositional nonparametric model.

The first chapter presents a general framework for norm-based capacity control for L_{p,q} weight normalized DNNs. We establish an upper bound on the Rademacher complexities of this family. In particular, with an L_{1,∞} normalization, we discuss properties of a width-independent capacity control, which depends on the depth only through a square-root term. Furthermore, if the activation functions are anti-symmetric, the bound on the Rademacher complexity is independent of both the width and the depth up to a log factor. In addition, we study weight normalized deep neural networks with rectified linear units (ReLU) in terms of functional characterization and approximation properties. In particular, for an L_{1,∞} weight normalized network with ReLU, the approximation error can be controlled by the L_1 norm of the output layer.

In the second chapter, we study L_{1,∞} weight normalization for deep neural networks with bias neurons to achieve a sparse architecture. We theoretically establish generalization error bounds for both regression and classification under the L_{1,∞} weight normalization. It is shown that the upper bounds are independent of the network width and have k^{1/2} dependence on the network depth k. These results provide theoretical justification for using such weight normalization to reduce the generalization error. We also develop an easily implemented gradient projection descent algorithm to practically obtain a sparse neural network. We perform various experiments to validate our theory and demonstrate the effectiveness of the resulting approach.

In the third chapter, we propose a compositional nonparametric method in which a model is expressed as a labeled binary tree of 2k+1 nodes, where each node is either a summation, a multiplication, or the application of one of the q basis functions to one of the m_1 covariates. We show that in order to recover a labeled binary tree from a given dataset, a sufficient number of samples is O(k log(m_1 q) + log(k!)), and a necessary number of samples is Ω(k log(m_1 q) − log(k!)). We further propose a greedy algorithm for regression in order to validate our theoretical findings through synthetic experiments.
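The gradient projection descent step mentioned above is not spelled out in the abstract. Assuming the L_{1,∞} constraint bounds the L1 norm of each unit's incoming weights, a minimal sketch of the projection step (Euclidean projection of each row onto an L1 ball, following Duchi et al., 2008) looks as follows; the dissertation's exact algorithm may differ.

```python
import numpy as np

def project_to_l1_ball(v, radius=1.0):
    """Euclidean projection of vector v onto the L1 ball of the given radius
    (Duchi et al., 2008). Returns v unchanged if it is already inside the ball."""
    if np.sum(np.abs(v)) <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]                 # sorted magnitudes, descending
    css = np.cumsum(u)
    rho = np.nonzero(u - (css - radius) / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)   # soft-threshold

def project_rows(W, radius=1.0):
    """Project each row (one unit's incoming weights) onto the L1 ball, so the
    largest row-wise L1 norm -- an L_{1,inf}-style constraint -- is at most
    `radius`. Soft-thresholding also zeroes small weights, which is one route
    to the sparsity the abstract refers to. Illustrative sketch only."""
    return np.vstack([project_to_l1_ball(row, radius) for row in W])

W = np.random.default_rng(1).normal(size=(4, 6))
W_proj = project_rows(W, radius=1.0)
print(np.abs(W_proj).sum(axis=1))   # every row L1 norm <= 1
```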
5

The Effect of Optimization of Error Metrics

Khurram Jassal, Muhammad January 2011 (has links)
It is important for a retail company to forecast its sales in a correct and accurate way in order to plan and evaluate sales and commercial strategies. Various forecasting techniques are available for this purpose. Two popular modelling techniques are predictive modelling and econometric modelling. The models created by these techniques are used to minimize the difference between the real and the predicted values. There are several different error metrics that can be used to measure and describe this difference. Each metric focuses on different properties of the forecasts, and it therefore matters which metric is used when a model is created. Most traditional techniques use the sum of squared errors, which has good mathematical properties but is not always optimal for forecasting purposes. This thesis focuses on optimization of three widely used error metrics: MAPE, WMAPE and RMSE. In particular, each metric's protection against overfitting, which occurs when a predictive model captures noise and irregularities in the data that are not part of the sought relationship, is evaluated in this thesis. Genetic programming is a general optimization technique based on Darwin's theories of evolution; in this study it is used to optimize predictive models based on each metric. The sales data of five products of ICA (a Swedish retail company) has been used to observe the effects of the optimized error metrics when creating predictive models. This study shows that all three metrics are quite poorly protected against overfitting, even if WMAPE and MAPE are slightly better protected than RMSE. However, WMAPE is the most promising metric to use for optimization of predictive models. When evaluated against all three metrics, models optimized based on WMAPE have the best overall result. The results on training and test data show that these findings hold in spite of overfitted models. / Program: Magisterutbildning i informatik
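For reference, the three error metrics discussed above have the standard forms sketched below; the thesis may scale or weight them slightly differently.

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean squared error: the metric implicitly optimized by
    least-squares methods; penalizes large errors heavily."""
    return np.sqrt(np.mean((actual - forecast) ** 2))

def mape(actual, forecast):
    """Mean absolute percentage error: scale-free, but unstable when
    actual values are near zero."""
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))

def wmape(actual, forecast):
    """Weighted MAPE: absolute errors summed and normalized by total actual
    volume, so large-volume periods carry more weight than in MAPE."""
    return 100.0 * np.sum(np.abs(actual - forecast)) / np.sum(np.abs(actual))

actual = np.array([120.0, 80.0, 150.0, 60.0])
forecast = np.array([110.0, 95.0, 140.0, 70.0])
print(rmse(actual, forecast), mape(actual, forecast), wmape(actual, forecast))
```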
6

Topics in genomic image processing

Hua, Jianping 12 April 2006 (has links)
The image processing methodologies that have been actively studied and developed now play a very significant role in flourishing biotechnology research. This work studies, develops and implements several image processing techniques for M-FISH and cDNA microarray images. In particular, we focus on three important areas: M-FISH image compression, microarray image processing and expression-based classification. Two schemes, embedded M-FISH image coding (EMIC) and Microarray BASICA (Background Adjustment, Segmentation, Image Compression and Analysis), have been introduced for M-FISH image compression and microarray image processing, respectively. In the expression-based classification area, we investigate the relationship between the optimal number of features and the sample size, either analytically or through simulation, for various classifiers.
7

An Analysis of Overfitting in Particle Swarm Optimised Neural Networks

van Wyk, Andrich Benjamin January 2014 (has links)
The phenomenon of overfitting, where a feed-forward neural network (FFNN) over-trains on training data at the cost of generalisation accuracy, is known to be specific to the training algorithm used. This study investigates overfitting within the context of particle swarm optimised (PSO) FFNNs. Two of the most widely used PSO algorithms are compared in terms of FFNN accuracy and a description of the overfitting behaviour is established. Each of the PSO components is in turn investigated to determine its effect on FFNN overfitting. A study of the maximum velocity (Vmax) parameter is performed and it is found that smaller Vmax values are optimal for FFNN training. The analysis is extended to the inertia and acceleration coefficient parameters, where it is shown that specific interactions among the parameters have a dominant effect on the resultant FFNN accuracy and may be used to reduce overfitting. Further, the significant effect of the swarm size on network accuracy is shown, with a critical range of swarm sizes being identified for effective training. The study concludes with an investigation into the effect of different activation functions. Given strong empirical evidence, the hypothesis is made that the gradient of the activation function significantly affects the convergence of the PSO. Lastly, the PSO is shown to be a very effective algorithm for the training of self-adaptive FFNNs, capable of learning from unscaled data. / Dissertation (MSc)--University of Pretoria, 2014. / tm2015 / Computer Science / MSc / Unrestricted
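For orientation, a minimal sketch of the standard (gbest) PSO velocity and position update follows, showing where the inertia weight w, the acceleration coefficients c1 and c2, and the Vmax clamp enter; the parameter values shown are common defaults, not the settings compared in the thesis.

```python
import numpy as np

def pso_step(pos, vel, pbest, gbest, w=0.729, c1=1.494, c2=1.494, vmax=0.5):
    """One synchronous PSO update for a swarm.

    pos, vel   : (num_particles, dim) current positions and velocities
    pbest      : (num_particles, dim) each particle's best position so far
    gbest      : (dim,) best position found by the swarm
    w, c1, c2  : inertia weight and cognitive/social acceleration coefficients
    vmax       : velocity clamp -- the parameter whose effect on overfitting
                 is studied above (smaller values were found better)
    """
    r1 = np.random.random(pos.shape)
    r2 = np.random.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    vel = np.clip(vel, -vmax, vmax)        # Vmax clamping
    return pos + vel, vel

# Usage: each particle encodes one candidate weight vector of the FFNN
rng = np.random.default_rng(0)
pos = rng.uniform(-1, 1, size=(20, 10))   # 20 particles, 10 network weights
vel = np.zeros_like(pos)
pos, vel = pso_step(pos, vel, pbest=pos.copy(), gbest=pos[0], vmax=0.5)
```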
8

An online learning algorithm for technical trading

Murphy, Nicholas John 12 February 2020 (has links)
We use an adversarial, expert-based online learning algorithm to learn the optimal parameters required to maximise wealth when trading zero-cost portfolio strategies. The learning algorithm is used to determine the relative population dynamics of technical trading strategies that can survive historical back-testing, as well as to form an overall aggregated portfolio trading strategy from the set of underlying trading strategies implemented on daily and intraday Johannesburg Stock Exchange data. The resulting population time-series are investigated using unsupervised learning for dimensionality reduction and visualisation. A key contribution is that the overall aggregated trading strategies are tested for statistical arbitrage using a novel hypothesis test proposed by Jarrow et al. [31] on both daily sampled and intraday time-scales. The (low frequency) daily sampled strategies fail the arbitrage tests after costs, while the (high frequency) intraday sampled strategies are not falsified as statistical arbitrages after costs. Estimates of trading strategy success, cost of trading and slippage are considered along with an offline benchmark portfolio algorithm for performance comparison. In addition, the algorithm's generalisation error is analysed by recovering an estimate of the probability of back-test overfitting using a nonparametric procedure introduced by Bailey et al. [19]. The work aims to explore and better understand the interplay between different technical trading strategies from a data-informed perspective.
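The abstract does not specify the learning algorithm. As a generic illustration of adversarial, expert-based online learning, the sketch below implements the classical Hedge (exponentially weighted average) update for aggregating a set of expert trading strategies; the aggregation rule actually used in the thesis may differ.

```python
import numpy as np

def hedge_update(weights, losses, eta=0.1):
    """One round of the Hedge / exponentially weighted average forecaster:
    each expert's weight is shrunk exponentially in its loss this round,
    then the weights are renormalized to form the next aggregation."""
    weights = weights * np.exp(-eta * losses)
    return weights / weights.sum()

# Aggregate 3 expert strategies over a stream of per-round losses
rng = np.random.default_rng(0)
num_experts, rounds = 3, 100
weights = np.full(num_experts, 1.0 / num_experts)
for _ in range(rounds):
    losses = rng.uniform(0.0, 1.0, size=num_experts)  # stand-in for realized losses
    losses[0] *= 0.5                                   # expert 0 is consistently better
    weights = hedge_update(weights, losses)
print(weights)   # mass concentrates on the better expert
```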
9

Implementation of a New Sigmoid Function in Backpropagation Neural Networks.

Bonnell, Jeffrey A 17 August 2011 (has links)
This thesis presents the use of a new sigmoid activation function in backpropagation artificial neural networks (ANNs). ANNs using conventional activation functions may generalize poorly when trained on a set which includes quirky, mislabeled, unbalanced, or otherwise complicated data. This new activation function is an attempt to improve generalization and reduce overtraining on mislabeled or irrelevant data by restricting training when inputs to the hidden neurons are sufficiently small. The activation function includes a flattened, low-training region which grows or shrinks during backpropagation to ensure a desired proportion of inputs fall inside the low-training region. With a desired low-training proportion of 0, this activation function reduces to a standard sigmoidal curve. A network with the new activation function implemented in the hidden layer is trained on benchmark data sets and compared with the standard activation function in an attempt to improve the area under the receiver operating characteristic curve in biological and other classification tasks.
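The exact form of the new activation function is not given in the abstract. Purely as an illustration of the idea of a flattened low-training region, the sketch below shows one possible way to build such a function from a standard sigmoid; the thesis's actual definition, and how the region's width is adapted during training, are assumptions here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def flattened_sigmoid(x, d=1.0):
    """One possible flattened sigmoid (illustrative only -- the thesis's
    function is not specified in the abstract): outputs a constant 0.5 on the
    low-training region |x| <= d, with shifted sigmoid branches outside it, so
    the derivative (and hence the backpropagated training signal) is ~0 for
    small inputs. The half-width d would be grown or shrunk during training."""
    return np.where(x > d, sigmoid(x - d),
           np.where(x < -d, sigmoid(x + d), 0.5))

x = np.linspace(-6, 6, 13)
print(np.round(flattened_sigmoid(x, d=1.0), 3))
```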
10

Seismic data processing with curvelets: a multiscale and nonlinear approach

Herrmann, Felix J. January 2007 (has links)
In this abstract, we present a nonlinear curvelet-based sparsity-promoting formulation of a seismic processing flow, consisting of the following steps: seismic data regularization and the restoration of migration amplitudes. We show that the curvelet's wavefront detection capability and invariance under the migration-demigration operator lead to a formulation that is stable under noise and missing data.
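The curvelet transform itself is beyond a short example, but sparsity-promoting formulations of this kind are typically solved with iterative soft thresholding. The sketch below shows that generic recovery loop for an abstract linear operator A and measurements b; it is an illustration of the formulation only, not the processing flow used in this work.

```python
import numpy as np

def soft_threshold(x, tau):
    """Soft thresholding: the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(A, b, lam=0.1, step=None, iters=200):
    """Iterative soft thresholding for min_x 0.5||Ax - b||^2 + lam*||x||_1,
    the generic sparsity-promoting formulation. In curvelet-based processing,
    A would compose the measurement/migration operator with the inverse
    curvelet transform; a plain matrix stands in for it here."""
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = soft_threshold(x - step * A.T @ (A @ x - b), step * lam)
    return x

# Recover a sparse coefficient vector from noisy, incomplete measurements
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 200)) / np.sqrt(50)
x_true = np.zeros(200)
x_true[[5, 42, 137]] = [1.5, -2.0, 1.0]
b = A @ x_true + 0.01 * rng.normal(size=50)
x_hat = ista(A, b, lam=0.05)
print(np.flatnonzero(np.abs(x_hat) > 0.5))   # should recover the support {5, 42, 137}
```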
