1 |
Scale space feature selection with Multiple kernel learning and its application to oil sand image analysisNilufar, Sharmin Unknown Date
No description available.
|
2 |
Multiple Kernel Learning with Many KernelsAfkanpour, Arash Unknown Date
No description available.
|
3 |
Large scale optimization methods for metric and kernel learningJain, Prateek 06 November 2014 (has links)
A large number of machine learning algorithms are critically dependent on the underlying distance/metric/similarity function. Learning an appropriate distance function is therefore crucial to the success of many methods. The class of distance functions that can be learned accurately is characterized by the amount and type of supervision available to the particular application. In this thesis, we explore a variety of such distance learning problems using different amounts/types of supervision and provide efficient and scalable algorithms to learn appropriate distance functions for each of these problems. First, we propose a generic regularized framework for Mahalanobis metric learning and prove that for a wide variety of regularization functions, metric learning can be used for efficiently learning a kernel function incorporating the available side-information. Furthermore, we provide a method for fast nearest neighbor search using the learned distance/kernel function. We show that a variety of existing metric learning methods are special cases of our general framework. Hence, our framework also provides a kernelization scheme and fast similarity search scheme for such methods. Second, we consider a variation of our standard metric learning framework where the side-information is incremental, streaming and cannot be stored. For this problem, we provide an efficient online metric learning algorithm that compares favorably to existing methods both theoretically and empirically. Next, we consider a contrasting scenario where the amount of supervision being provided is extremely small compared to the number of training points. For this problem, we consider two different modeling assumptions: 1) data lies on a low-dimensional linear subspace, 2) data lies on a low-dimensional non-linear manifold. The first assumption, in particular, leads to the problem of matrix rank minimization over polyhedral sets, which is a problem of immense interest in numerous fields including optimization, machine learning, computer vision, and control theory. We propose a novel online learning based optimization method for the rank minimization problem and provide provable approximation guarantees for it. The second assumption leads to our geometry-aware metric/kernel learning formulation, where we jointly model the metric/kernel over the data along with the underlying manifold. We provide an efficient alternating minimization algorithm for this problem and demonstrate its wide applicability and effectiveness by applying it to various machine learning tasks such as semi-supervised classification, colored dimensionality reduction, manifold alignment etc. Finally, we consider the task of learning distance functions under no supervision, which we cast as a problem of learning disparate clusterings of the data. To this end, we propose a discriminative approach and a generative model based approach and we provide efficient algorithms with convergence guarantees for both the approaches. / text
|
4 |
Optimizing process parameters to increase the quality of the output in a separator : An application of Deep Kernel Learning in combination with the Basin-hopping optimizerHerwin, Eric January 2019 (has links)
Achieving optimal efficiency of production in the industrial sector is a process that is continuously under development. In several industrial installations separators, produced by Alfa Laval, may be found, and therefore it is of interest to make these separators operate more efficiently. The separator that is investigated separates impurities and water from crude oil. The separation performance is partially affected by the settings of process parameters. In this thesis it is investigated whether optimal or near optimal process parametersettings, which minimize the water content in the output, can be obtained.Furthermore, it is also investigated if these settings of a session can be testedto conclude about their suitability for the separator. The data that is usedin this investigation originates from sensors of a factory-installed separator.It consists of five variables which are related to the water content in theoutput. Two additional variables, related to time, are created to enforce thisrelationship. Using this data, optimal or near optimal process parameter settings may be found with an optimization technique. For this procedure, a Gaussian Process with the Deep Kernel Learning extension (GP-DKL) is used to model the relationship between the water content and the sensor data. Three models with different kernel functions are evaluated and the GP-DKL with a Spectral Mixture kernel is demonstrated to be the most suitable option. This combination is used as the objective function in a Basin-hopping optimizer, resulting in settings which correspond to a lower water content.Thus, it is concluded that optimal or near optimal settings can be obtained. Furthermore, the process parameter settings of a session can be tested by utilizing the Bayesian properties of the GP-DKL model. However, due to large posterior variance of the model, it can not be determined if the process parameter settings are suitable for the separator.
|
5 |
A Multiple-Kernel Support Vector Regression Approach for Stock Market Price ForecastingHuang, Chi-wei 05 August 2009 (has links)
Support vector regression has been applied to stock market forecasting problems. However, it is usually needed to tune manually the hyperparameters of the kernel functions. Multiple-kernel learning was developed to deal with this problem, by which the kernel matrix weights and Lagrange multipliers can be simultaneously derived through semidefinite programming. However, the amount of time and space required is very demanding. We develop a two-stage multiple-kernel learning algorithm by incorporating sequential minimal optimization and the gradient projection method.
By this algorithm, advantages from different hyperparameter settings can be combined and overall system performance can be improved. Besides, the user need not specify the hyperparameter settings in advance, and trial-and-error for determining appropriate hyperparameter settings can then be avoided. Experimental results, obtained by running on datasets taken from Taiwan Capitalization Weighted Stock Index, show that our method performs better than other methods.
|
6 |
A Classification Framework for Imbalanced DataPhoungphol, Piyaphol 18 December 2013 (has links)
As information technology advances, the demands for developing a reliable and highly accurate predictive model from many domains are increasing. Traditional classification algorithms can be limited in their performance on highly imbalanced data sets. In this dissertation, we study two common problems when training data is imbalanced, and propose effective algorithms to solve them.
Firstly, we investigate the problem in building a multi-class classification model from imbalanced class distribution. We develop an effective technique to improve the performance of the model by formulating the problem as a multi-class SVM with an objective to maximize G-mean value. A ramp loss function is used to simplify and solve the problem. Experimental results on multiple real-world datasets confirm that our new method can effectively solve the multi-class classification problem when the datasets are highly imbalanced.
Secondly, we explore the problem in learning a global classification model from distributed data sources with privacy constraints. In this problem, not only data sources have different class distributions but combining data into one central data is also prohibited. We propose a privacy-preserving framework for building a global SVM from distributed data sources. Our new framework avoid constructing a global kernel matrix by mapping non-linear inputs to a linear feature space and then solve a distributed linear SVM from these virtual points. Our method can solve both imbalance and privacy problems while achieving the same level of accuracy as regular SVM.
Finally, we extend our framework to handle high-dimensional data by utilizing Generalized Multiple Kernel Learning to select a sparse combination of features and kernels. This new model produces a smaller set of features, but yields much higher accuracy.
|
7 |
Analysis and Visualization of the Two-Dimensional Blood Flow Velocity Field from VideosJun, Yang January 2015 (has links)
We estimate the velocity field of the blood flow in a human face from videos. Our approach first performs spatial preprocessing to improve the signal-to-noise ratio (SNR) and the computational efficiency. The discrete Fourier transform (DFT) and a temporal band-pass filter are then applied to extract the frequency corresponding to the subjects heart rate. We propose multiple kernel based k-NN classification for removing the noise positions from the resulting phase and amplitude maps. The 2D blood flow field is then estimated from the relative phase shift between the pixels. We evaluate our approach about segmentation as well as velocity field on real and synthetic face videos. Our method produces the recall and precision as well as a velocity field with an angular error and magnitude error on the average.
|
8 |
Model Selection in Kernel MethodsYou, Di 16 December 2011 (has links)
No description available.
|
9 |
Parameter Estimation In Generalized Partial Linear Modelswith Tikhanov RegularizationKayhan, Belgin 01 September 2010 (has links) (PDF)
Regression analysis refers to techniques for modeling and analyzing several variables in statistical learning. There are various types of regression models. In our study, we analyzed Generalized Partial Linear Models (GPLMs), which decomposes input variables into two sets, and additively combines classical linear models with nonlinear model part. By separating linear models from nonlinear ones, an inverse problem method Tikhonov regularization was applied for the nonlinear submodels separately, within the entire GPLM. Such a particular representation of submodels provides both
a better accuracy and a better stability (regularity) under noise in the data.
We aim to smooth the nonparametric part of GPLM by using a modified form of Multiple Adaptive Regression Spline (MARS) which is very useful for high-dimensional problems and does not impose any specific relationship between the predictor and
dependent variables. Instead, it can estimate the contribution of the basis functions so that both the additive and interaction effects of the predictors are allowed to determine
the dependent variable. The MARS algorithm has two steps: the forward and backward stepwise algorithms. In the rst one, the model is built by adding basis functions until a maximum level of complexity is reached. On the other hand, the backward stepwise algorithm starts with removing the least significant basis functions from the model.
In this study, we propose to use a penalized residual sum of squares (PRSS) instead of the backward stepwise algorithm and construct PRSS for MARS as a Tikhonov regularization problem. Besides, we provide numeric example with two data sets / one has interaction and the other one does not have. As well as studying the regularization of the nonparametric part, we also mention theoretically the regularization
of the parametric part. Furthermore, we make a comparison between Infinite Kernel Learning (IKL) and Tikhonov regularization by using two data sets, with the difference
consisting in the (non-)homogeneity of the data set. The thesis concludes with an outlook on future research.
|
10 |
Employing Multiple Kernel Support Vector Machines for Counterfeit Banknote RecognitionSu, Wen-pin 29 July 2008 (has links)
Finding an efficient method to detect counterfeit banknotes is imperative. In this study, we propose multiple kernel weighted support vector machine for counterfeit banknote recognition. A variation of SVM in optimizing false alarm rate, called FARSVM, is proposed which provide minimized false negative rate and false positive rate. Each banknote is divided into m ¡Ñ n partitions, and each partition comes with its own kernels. The optimal weight with each kernel matrix in the combination is obtained through the semidefinite programming (SDP) learning method. The amount of time and space required by the original SDP is very demanding. We focus on this framework and adopt two strategies to reduce the time and space requirements. The first strategy is to assume the non-negativity of kernel weights, and the second strategy is to set the sum of weights equal to 1. Experimental results show that regions with zero kernel weights are easy to imitate with today¡¦s digital imaging technology, and regions with nonzero kernel weights are difficult to imitate. In addition, these results show that the proposed approach outperforms single kernel SVM and standard SVM with SDP on Taiwanese banknotes.
|
Page generated in 0.0726 seconds