1 |
A kernel-based fuzzy clustering algorithm and its application in classification. Wang, Jiun-hau. 25 July 2006.
In this paper, we propose a kernel-based fuzzy clustering algorithm to cluster data patterns in the feature space. Our method uses kernel functions to project data from the original space into a high-dimensional feature space, and the data are divided into groups through their similarities in the feature space with an incremental clustering approach. After clustering, data patterns belonging to the same cluster in the feature space are then enclosed by an arbitrarily shaped boundary in the original space. As a result, clusters with arbitrary shapes are discovered in the original space. Clustering, which can be regarded as unsupervised classification, has also been utilized in solving classification problems, so we extend our method to handle classification problems. By working in the high-dimensional feature space, where the data are expected to be more separable, we can discover the inner structure of the data distribution. Therefore, our method has the advantage of dealing with new incoming data patterns efficiently. The effectiveness of our method is demonstrated in the experiments.
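As a rough illustration of the kernel trick behind such methods, the distance between a point and a cluster mean in the feature space can be evaluated from kernel values alone. The sketch below is not the authors' algorithm: it uses a hard (non-fuzzy) assignment, a Gaussian kernel, and an ad hoc distance threshold, all of which are assumptions for illustration.

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two vectors."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def feature_space_dist2(x, cluster, gamma=1.0):
    """Squared distance between phi(x) and the mean of a cluster in feature
    space, computed with the kernel trick (no explicit mapping needed)."""
    C = np.asarray(cluster)
    k_xx = rbf(x, x, gamma)                                    # equals 1 for the RBF kernel
    k_xc = np.mean([rbf(x, c, gamma) for c in C])
    k_cc = np.mean([[rbf(a, b, gamma) for b in C] for a in C])
    return k_xx - 2.0 * k_xc + k_cc

def incremental_kernel_clustering(X, threshold=0.5, gamma=1.0):
    """Assign each point to the nearest cluster mean in feature space,
    opening a new cluster when no mean is closer than `threshold`."""
    clusters = []
    for x in X:
        if clusters:
            d = [feature_space_dist2(x, c, gamma) for c in clusters]
            j = int(np.argmin(d))
            if d[j] <= threshold:
                clusters[j].append(x)
                continue
        clusters.append([x])
    return clusters

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
    groups = incremental_kernel_clustering(X, threshold=0.5, gamma=1.0)
    print([len(g) for g in groups])   # sizes of the discovered groups
```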
2 |
Spectral projection for the dbar-Neumann problem. Alsaedy, Ammar; Tarkhanov, Nikolai. January 2012.
We show that the spectral kernel function of the dbar-Neumann problem on a non-compact strongly pseudoconvex manifold is smooth up to the boundary.
3 |
Reproducing Kernel Hilbert spaces and complex dynamics. Tipton, James Edward. 01 December 2016.
Both complex dynamics and the theory of reproducing kernel Hilbert spaces have found widespread application over the last few decades. Although complex dynamics started over a century ago, the gravity of its importance was only recently realized, owing to B.B. Mandelbrot's work in the 1980s. B.B. Mandelbrot demonstrated to the world that fractals, which are chaotic patterns containing a high degree of self-similarity, often serve as better models of nature than conventional smooth models. The theory of reproducing kernel Hilbert spaces, which also started over a century ago, did not pick up until N. Aronszajn's classic paper of 1950. Since then, the theory has found widespread application in fields including machine learning, quantum mechanics, and harmonic analysis.
In the paper, Infinite Product Representations of Kernel Functions and Iterated Function Systems, the authors, D. Alpay, P. Jorgensen, I. Lewkowicz, and I. Martiziano, show how a kernel function can be constructed on an attracting set of an iterated function system. Furthermore, they show that when certain conditions are met, one can construct an orthonormal basis of the associated Hilbert space via certain pull-back and multiplier operators.
In this thesis we take as our iterated function system the family of iterates of a given rational map. Thus we investigate for which rational maps their kernel construction holds, as well as their orthonormal basis construction. We are able to show that the kernel construction applies to any rational map conjugate to a polynomial with an attracting fixed point at 0. Within such rational maps, we are able to find a family of polynomials for which the orthonormal basis construction holds. It is then natural to ask how the orthonormal basis changes as the polynomial within a given family varies. We are able to determine, for certain families of polynomials, that the dynamics of the corresponding orthonormal basis is well behaved. Finally, we conclude with some possible avenues of future investigation.
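The abstract does not reproduce the kernel formula. As an illustrative stand-in only (an assumption, not necessarily the construction of Alpay, Jorgensen, Lewkowicz, and Martiziano), a Szegő-type infinite product built from the iterates of a polynomial with an attracting fixed point at 0 can be truncated and evaluated numerically, since the iterates shrink toward 0 and the factors tend to 1.

```python
import numpy as np

def product_kernel(z, w, p, n_terms=30):
    """Truncated product kernel built from the iterates of p (illustrative form):
    K(z, w) ~ prod_{n=0}^{N-1} 1 / (1 - p^n(z) * conj(p^n(w)))."""
    val = 1.0 + 0.0j
    zi, wi = complex(z), complex(w)
    for _ in range(n_terms):
        val /= (1.0 - zi * np.conj(wi))
        zi, wi = p(zi), p(wi)        # advance both arguments along the iteration
    return val

if __name__ == "__main__":
    p = lambda z: z ** 2             # polynomial with an attracting fixed point at 0
    z, w = 0.3 + 0.1j, 0.2 - 0.4j
    print(product_kernel(z, w, p))           # K(z, w)
    print(product_kernel(z, z, p).real)      # K(z, z) is real and at least 1
```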
4 |
Assessing the influence of observations on the generalization performance of the kernel Fisher discriminant classifier. Lamont, Morné Michael Connell.
Thesis (PhD (Statistics and Actuarial Science))—Stellenbosch University, 2008.
Kernel Fisher discriminant analysis (KFDA) is a kernel-based technique that can be used
to classify observations of unknown origin into predefined groups. Basically, KFDA can
be viewed as a non-linear extension of Fisher’s linear discriminant analysis (FLDA). In
this thesis we give a detailed explanation of how FLDA is generalized to obtain KFDA. We
also discuss two methods that are related to KFDA. Our focus is on binary classification.
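For readers unfamiliar with KFDA, a minimal sketch of the standard binary formulation (in the spirit of Mika et al.) is given below; the Gaussian kernel, the regularization constant, and the midpoint threshold are assumptions of this sketch rather than choices made in the thesis.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gram matrix of the Gaussian kernel between the rows of A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def kfd_fit(X, y, gamma=1.0, reg=1e-3):
    """Binary kernel Fisher discriminant (Mika-style formulation).
    Returns the expansion coefficients alpha and a midpoint threshold b."""
    X0, X1 = X[y == 0], X[y == 1]
    K0 = rbf_kernel(X, X0, gamma)                  # n x l0
    K1 = rbf_kernel(X, X1, gamma)                  # n x l1
    m0, m1 = K0.mean(axis=1), K1.mean(axis=1)      # kernelized class means
    N = np.zeros((len(X), len(X)))
    for Kj in (K0, K1):
        lj = Kj.shape[1]
        C = np.eye(lj) - np.ones((lj, lj)) / lj    # centring matrix
        N += Kj @ C @ Kj.T                         # within-class scatter in feature space
    alpha = np.linalg.solve(N + reg * np.eye(len(X)), m1 - m0)
    b = -0.5 * (alpha @ m0 + alpha @ m1)           # midpoint of the projected class means
    return alpha, b

def kfd_predict(X_train, alpha, b, X_new, gamma=1.0):
    """Classify new points by the sign of the projected score."""
    scores = rbf_kernel(X_new, X_train, gamma) @ alpha + b
    return (scores > 0).astype(int)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(3, 1, (30, 2))])
    y = np.array([0] * 30 + [1] * 30)
    alpha, b = kfd_fit(X, y, gamma=0.5)
    print((kfd_predict(X, alpha, b, X, gamma=0.5) == y).mean())   # training accuracy
```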
The influence of atypical cases in discriminant analysis has been investigated by many
researchers. In this thesis we investigate the influence of atypical cases on certain aspects
of KFDA. One important aspect of interest is the generalization performance of the KFD
classifier. Several other aspects are also investigated with the aim of developing criteria
that can be used to identify cases that are detrimental to the KFD generalization
performance. The investigation is done via a Monte Carlo simulation study.
The output of KFDA can also be used to obtain the posterior probabilities of belonging to
the two classes. In this thesis we discuss two approaches to estimate posterior
probabilities in KFDA. Two new KFD classifiers are also derived which use these
probabilities to classify observations, and their performance is compared to that of the
original KFD classifier.
The main objective of this thesis is to develop criteria which can be used to identify cases
that are detrimental to the KFD generalization performance. Nine such criteria are
proposed and their merit investigated in a Monte Carlo simulation study as well as on
real-world data sets.
Evaluating the criteria on a leave-one-out basis poses a computational challenge,
especially for large data sets. In this thesis we also propose using the smallest enclosing
hypersphere as a filter to reduce the amount of computation. The effectiveness of the
filter is tested in a Monte Carlo simulation study as well as on real-world data sets.
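The thesis's exact filtering rule is not reproduced here; the following sketch merely illustrates how a (soft) smallest enclosing hypersphere in the feature space can be fitted numerically and how the cases farthest from its centre can be flagged for closer leave-one-out evaluation. The kernel, the box constraint, and the "three farthest points" rule are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def rbf_gram(X, gamma=1.0):
    """Gram matrix of the Gaussian kernel on the rows of X."""
    d2 = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

def enclosing_hypersphere(K, C=1.0):
    """Solve the (soft) smallest enclosing hypersphere dual in feature space:
    max  sum_i a_i K_ii - a' K a,  subject to 0 <= a_i <= C, sum_i a_i = 1."""
    n = len(K)
    obj = lambda a: -(a @ np.diag(K) - a @ K @ a)
    cons = {"type": "eq", "fun": lambda a: a.sum() - 1.0}
    res = minimize(obj, np.full(n, 1.0 / n), bounds=[(0.0, C)] * n,
                   constraints=[cons], method="SLSQP")
    return res.x

def dist2_to_center(K, alpha):
    """Squared feature-space distance of every training point to the centre."""
    return np.diag(K) - 2 * K @ alpha + alpha @ K @ alpha

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    X = np.vstack([rng.normal(0, 1, (40, 2)), [[6.0, 6.0]]])   # one atypical point
    K = rbf_gram(X, gamma=0.2)
    alpha = enclosing_hypersphere(K, C=1.0)
    d2 = dist2_to_center(K, alpha)
    flagged = np.argsort(d2)[-3:]      # farthest points are candidates for the filter
    print(flagged)
```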
5 |
Nelineární neparametrické modely pro finanční časové řady / Nonlinear nonparametric models for financial time series. Klačanská, Júlia. January 2012.
The thesis studies nonlinear nonparametric models used in time series analysis. It gives a basic introduction to time series and presents several nonlinear nonparametric models, together with their estimators. Special attention is paid to three of them: the CHARN, FAR and AFAR models. Their properties and estimation techniques are presented. We also show techniques that select values of the parameters used further in the estimation methods. The properties of the time series models are investigated in simulation and real-data studies.
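As an illustration of the kind of kernel estimation used for such models, a CHARN specification X_t = m(X_{t-1}) + s(X_{t-1}) * eps_t can be fitted with Nadaraya-Watson smoothers for the conditional mean and variance. The Gaussian kernel, the bandwidth, and the simulated path below are assumptions, not the estimators studied in the thesis.

```python
import numpy as np

def nw(x0, x, y, h):
    """Nadaraya-Watson estimate of E[y | x = x0] with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x0 - x) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

def fit_charn(series, h=0.3):
    """Kernel estimates of the conditional mean m(.) and volatility s(.)
    in the CHARN model X_t = m(X_{t-1}) + s(X_{t-1}) * eps_t."""
    x, y = series[:-1], series[1:]
    m_hat = lambda u: nw(u, x, y, h)
    resid2 = (y - np.array([m_hat(u) for u in x])) ** 2
    s_hat = lambda u: np.sqrt(max(nw(u, x, resid2, h), 0.0))
    return m_hat, s_hat

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    n, X = 500, np.zeros(500)
    for t in range(1, n):                       # simulate a toy CHARN path
        X[t] = 0.6 * np.tanh(X[t - 1]) + 0.2 * rng.standard_normal()
    m_hat, s_hat = fit_charn(X, h=0.2)
    print(m_hat(0.5), s_hat(0.5))               # estimated mean and volatility at x = 0.5
```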
6 |
Bayesian Inference In Forecasting Volcanic Hazards: An Example From Armenia. Weller, Jennifer N. 09 November 2004.
Scientists worldwide are increasingly faced with the need to assess geologic hazards for very infrequent events that have high consequence, for instance in siting nuclear facilities with respect to volcanic hazards. One of the methods currently being developed for such assessments is the Bayesian method. This paper outlines the Bayesian technique by focusing on the volcanic hazard assessment for the Armenia Nuclear Power Plant (ANPP), which is located in a Quaternary volcanic field. The Bayesian method presented in this paper relies on the development of a probabilistic model based on the spatial distribution of past volcanic events and a geologic process model.
To develop the probabilistic model, a bivariate Gaussian kernel function is used to forecast probabilities based on estimates of λt, the temporal recurrence rate, and λs, the spatial density. Shortcomings often cited in such purely probabilistic assessments are that they take into account only known features, and that the event of interest, new volcano formation, is rare, with no opportunity for repeated experiments or uniform observations, the hallmarks of classical probability. One approach to improving such probabilistic models is to incorporate related geological data that reflect controls on vent distribution and would improve the accuracy of subsequent models.
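A minimal sketch of the kernel part of such a model is given below: a bivariate Gaussian kernel turns a catalogue of past vent locations into a spatial density, which is then combined with a recurrence rate to give a rough annual disruption probability. The vent coordinates, bandwidth, recurrence rate, and area of interest are invented numbers for illustration only, not values from the paper.

```python
import numpy as np

def spatial_density(x, y, vents, h):
    """Bivariate Gaussian kernel estimate of the spatial vent density at (x, y),
    from past vent coordinates and a smoothing bandwidth h (same length units)."""
    dx = x - vents[:, 0]
    dy = y - vents[:, 1]
    w = np.exp(-(dx**2 + dy**2) / (2 * h**2))
    return np.sum(w) / (2 * np.pi * h**2 * len(vents))

if __name__ == "__main__":
    vents = np.array([[0.0, 0.0], [2.0, 1.0], [1.5, -0.5]])   # hypothetical vent map (km)
    lam_s = spatial_density(1.0, 0.0, vents, h=1.5)            # vent density near the site (per km^2)
    lam_t = 3 / 40000.0                                        # assumed: 3 events per 40,000 years
    area, years = 10.0, 1.0                                    # assumed area of interest and horizon
    print(lam_s, 1 - np.exp(-lam_t * lam_s * area * years))    # rough annual disruption probability
```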
Geophysical data indicate that volcanism in Armenia is closely linked to crustal movement along major right-lateral strike-slip fault systems that generate transtension across the region. The surface expression of this transtension is pull-apart basins, filled with thick deposits of sediment, and antithetic normal faults. Volcanism in Armenia is concentrated in these deep sedimentary basins, as reflected in regional gravity surveys. This means that low gravity anomalies are likely good indicators of future volcanic activity and therefore would improve probabilistic hazard models. Therefore, gravity data are transformed into a likelihood function and combined with the original probability model in a quantitative fashion using Bayesian statistics. The result is a model that is based on the distribution of past events but modified to include pertinent geologic information. Using the Bayesian approach in this example increases the uncertainty, or range in probability, which reflects how well we actually know our probability estimate. Therefore, we feel it is appropriate to consider a range of probabilities for volcanic disruption of the ANPP, 1-4 x 10^-6 per year (t = 1 yr). We note that these values exceed the current International Atomic Energy Agency standard, 1 x 10^-7 per year, by at least one order of magnitude.
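The Bayesian combination itself can be illustrated on a grid: a prior map (here a random stand-in for the kernel density of past vents) is multiplied cell by cell by a likelihood derived from the gravity data and renormalized. The grid, the random fields, and the exponential transform of the gravity anomaly are assumptions; the paper's actual likelihood construction is not reproduced.

```python
import numpy as np

def normalize(p):
    """Rescale a non-negative field so that it sums to one."""
    return p / p.sum()

if __name__ == "__main__":
    # Hypothetical 50 x 50 grid over the volcanic field.
    rng = np.random.default_rng(4)
    prior = normalize(rng.random((50, 50)))     # stand-in for the kernel density of past vents
    gravity = rng.random((50, 50))              # stand-in for a gridded gravity anomaly
    # Assumed likelihood: low-gravity cells (deep basins) are more favourable to new vents.
    likelihood = normalize(np.exp(-3.0 * (gravity - gravity.min()) / np.ptp(gravity)))
    posterior = normalize(prior * likelihood)   # Bayes' rule, cell by cell
    print(posterior.sum(), posterior.max())
```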
7 |
Nonparametric Inferences for the Hazard Function with Right Truncation. Akcin, Haci Mustafa. 03 May 2013.
Incompleteness is a major feature of time-to-event data. As one type of incompleteness, truncation refers to the unobservability of the time-to-event variable because it is smaller (or greater) than the truncation variable. A truncated sample always involves left and right truncation.
Left truncation has been studied extensively, while right truncation has not received the same level of attention. In one of the earliest studies on right truncation, Lagakos et al. (1988) proposed to transform a right truncated variable to a left truncated variable and then apply existing methods to the transformed variable. The reverse-time hazard function is introduced through this transformation. However, this quantity does not have a natural interpretation. There exist gaps in the inferences for the regular forward-time hazard function with right truncated data. This dissertation discusses variance estimation of the cumulative hazard estimator, a one-sample log-rank test, and the comparison of hazard rate functions among finite independent samples in the context of right truncation.
First, the relation between the reverse- and forward-time cumulative hazard functions is clarified. This relation leads to nonparametric inference for the cumulative hazard function. Jiang (2010) recently conducted research in this direction and proposed two variance estimators of the cumulative hazard estimator. Some revisions to these variance estimators are suggested in this dissertation and evaluated in a Monte Carlo study.
Second, this dissertation studies hypothesis testing for right truncated data. A series of tests is developed with the hazard rate function as the target quantity. A one-sample log-rank test is first discussed, followed by a family of weighted tests for comparisons among K independent samples. Particular weight functions lead to the log-rank, Gehan, and Tarone-Ware tests, and these three tests are evaluated in a Monte Carlo study.
Finally, this dissertation studies nonparametric inference for the hazard rate function with right truncated data. The kernel smoothing technique is utilized in estimating the hazard rate function. A Monte Carlo study investigates the uniform kernel smoothed estimator and its variance estimator. The uniform, Epanechnikov and biweight kernel estimators are implemented in an example of AIDS data from blood-transfusion-infected patients.
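As a generic illustration of the smoothing step (ignoring the right-truncation adjustment that is the subject of the dissertation), the hazard rate can be estimated by convolving the increments of a cumulative hazard estimator with a kernel such as the Epanechnikov kernel. The bandwidth and the toy complete-data example below are assumptions.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)

def smoothed_hazard(t, event_times, increments, b):
    """Kernel-smoothed hazard rate: smooth the jumps of the cumulative
    hazard estimator with a kernel of bandwidth b."""
    u = (t - event_times) / b
    return np.sum(epanechnikov(u) * increments) / b

if __name__ == "__main__":
    # Toy complete data: Nelson-Aalen increments are 1 / (number still at risk).
    rng = np.random.default_rng(5)
    times = np.sort(rng.exponential(1.0, 200))
    at_risk = np.arange(len(times), 0, -1)
    dLambda = 1.0 / at_risk
    grid = np.linspace(0.1, 2.0, 5)
    print([round(smoothed_hazard(t, times, dLambda, b=0.4), 3) for t in grid])  # near 1 for Exp(1)
```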
9 |
The Turkish Catastrophe Insurance Pool Claims Modeling 2000-2008 Data. Saribekir, Gozde. 01 March 2013.
After the 1999 Marmara Earthquake, social, economic and engineering studies on earthquakes became more intensive. The Turkish Catastrophe Insurance Pool (TCIP) was established after the Marmara Earthquake to share the earthquake-related burden on the Government budget. The TCIP has become a data source for researchers, containing variables such as the number of claims, claim amounts and earthquake magnitude. In this thesis, the TCIP earthquake claims collected between 2000 and 2008 are studied. The number of claims and the claim payments (aggregate claim amount) are modeled by using Generalized Linear Models (GLM). Observed sudden jumps in the claim data are represented by using the exponential kernel function. Model parameters are estimated by using Maximum Likelihood Estimation (MLE). The results can serve as a recommendation for computing the expected value of the aggregate claim amounts and the premiums of the TCIP.
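As a hedged sketch of the GLM step only (the covariates, coefficients, and data below are simulated, and the exponential-kernel jump component is not reproduced), claim counts can be modeled with a Poisson GLM whose parameters are obtained by maximum likelihood:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical covariates: earthquake magnitude and insured units per event.
rng = np.random.default_rng(6)
n = 200
magnitude = rng.uniform(4.0, 7.0, n)
insured = rng.integers(100, 5000, n)
true_rate = np.exp(-6.0 + 1.1 * magnitude + 0.0003 * insured)
claims = rng.poisson(true_rate)                                # simulated claim counts

X = sm.add_constant(np.column_stack([magnitude, insured]))
fit = sm.GLM(claims, X, family=sm.families.Poisson()).fit()    # log link by default, fitted by MLE
print(fit.params)                                              # estimated GLM coefficients
print(fit.predict(X[:3]))                                      # expected claim counts for 3 events
```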
10 |
Parameter learning and support vector reduction in support vector regression. Yang, Chih-cheng. 21 July 2006.
The selection and learning of kernel functions is a very important but rarely studied problem in the field of support vector learning, even though the kernel function of a support vector regression model has a great influence on its performance. The kernel function maps the data from the original space into a feature space, so problems that cannot be solved in the low-dimensional original space may become solvable in the higher-dimensional feature space induced by the kernel.
In this paper, there are two main contributions. First, we introduce the gradient descent method for the learning of kernel functions. Using gradient descent, we derive learning rules for the parameters that determine the shape and spread of the kernel functions; we can therefore obtain better kernel functions by training their parameters with respect to the risk minimization principle. Second, in order to reduce the number of support vectors, we use the orthogonal least squares method: by choosing representative support vectors, we may remove the less important support vectors from the support vector regression model.
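The thesis derives analytic learning rules from the risk minimization principle; as a stand-in, the sketch below tunes the width of an RBF kernel for support vector regression by gradient descent on a held-out error, using a finite-difference gradient. The validation split, learning rate, and all constants are assumptions.

```python
import numpy as np
from sklearn.svm import SVR

def val_mse(gamma, X_tr, y_tr, X_val, y_val):
    """Validation error of an RBF-kernel SVR for a given kernel width."""
    model = SVR(kernel="rbf", gamma=gamma, C=10.0).fit(X_tr, y_tr)
    return np.mean((model.predict(X_val) - y_val) ** 2)

def tune_gamma(X_tr, y_tr, X_val, y_val, gamma=1.0, lr=0.05, steps=30, eps=1e-3):
    """Gradient descent on the kernel parameter, with a finite-difference
    gradient of the validation risk (a stand-in for an analytic gradient)."""
    for _ in range(steps):
        g = (val_mse(gamma + eps, X_tr, y_tr, X_val, y_val)
             - val_mse(gamma - eps, X_tr, y_tr, X_val, y_val)) / (2 * eps)
        gamma = max(gamma - lr * g, 1e-4)          # keep the width positive
    return gamma

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    X = rng.uniform(-3, 3, (300, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(300)
    gamma = tune_gamma(X[:200], y[:200], X[200:], y[200:])
    print(round(gamma, 3), round(val_mse(gamma, X[:200], y[:200], X[200:], y[200:]), 4))
```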
The experimental results show that our approach derives better kernel functions than other methods and has better generalization ability. Also, the number of support vectors can be effectively reduced.