1 |
Machine Learning Techniques for Large-Scale System ModelingLv, Jiaqing 31 August 2011 (has links)
This thesis is about some issues in system modeling: The first is a parsimonious
representation of MISO Hammerstein system, which is by projecting the multivariate
linear function into a univariate input function space. This leads to the so-called
semiparamtric Hammerstein model, which overcomes the commonly known “Curse
of dimensionality” for nonparametric estimation on MISO systems. The second issue
discussed in this thesis is orthogonal expansion analysis on a univariate Hammerstein
model and hypothesis testing for the structure of the nonlinear subsystem. The generalization
of this technique can be used to test the validity for parametric assumptions
of the nonlinear function in Hammersteim models. It can also be applied to approximate
a general nonlinear function by a certain class of parametric function in the
Hammerstein models. These techniques can also be extended to other block-oriented
systems, e.g, Wiener systems, with slight modification. The third issue in this thesis is
applying machine learning and system modeling techniques to transient stability studies
in power engineering. The simultaneous variable section and estimation lead to a
substantially reduced complexity and yet possesses a stronger prediction power than
techniques known in the power engineering literature so far.
|
2 |
Machine Learning Techniques for Large-Scale System ModelingLv, Jiaqing 31 August 2011 (has links)
This thesis is about some issues in system modeling: The first is a parsimonious
representation of MISO Hammerstein system, which is by projecting the multivariate
linear function into a univariate input function space. This leads to the so-called
semiparamtric Hammerstein model, which overcomes the commonly known “Curse
of dimensionality” for nonparametric estimation on MISO systems. The second issue
discussed in this thesis is orthogonal expansion analysis on a univariate Hammerstein
model and hypothesis testing for the structure of the nonlinear subsystem. The generalization
of this technique can be used to test the validity for parametric assumptions
of the nonlinear function in Hammersteim models. It can also be applied to approximate
a general nonlinear function by a certain class of parametric function in the
Hammerstein models. These techniques can also be extended to other block-oriented
systems, e.g, Wiener systems, with slight modification. The third issue in this thesis is
applying machine learning and system modeling techniques to transient stability studies
in power engineering. The simultaneous variable section and estimation lead to a
substantially reduced complexity and yet possesses a stronger prediction power than
techniques known in the power engineering literature so far.
|
3 |
Sparse Bayesian Time-Varying Covariance Estimation in Many DimensionsKastner, Gregor 18 September 2016 (has links) (PDF)
Dynamic covariance estimation for multivariate time series suffers from the curse of dimensionality. This renders parsimonious estimation methods essential for conducting reliable statistical inference. In this paper, the issue is addressed by modeling the underlying co-volatility dynamics of a time series vector through a lower dimensional collection of latent time-varying stochastic factors. Furthermore, we apply a Normal-Gamma prior to the elements of the factor loadings matrix. This hierarchical shrinkage prior effectively pulls the factor loadings of unimportant factors towards zero, thereby increasing parsimony even more. We apply the model to simulated data as well as daily log-returns of 300 S&P 500 stocks and demonstrate the effectiveness of the shrinkage prior to obtain sparse loadings matrices and more precise correlation estimates. Moreover, we investigate predictive performance and discuss different choices for the number of latent factors. Additionally to being a stand-alone tool, the algorithm is designed to act as a "plug and play" extension for other MCMC samplers; it is implemented in the R package factorstochvol. (author's abstract) / Series: Research Report Series / Department of Statistics and Mathematics
|
4 |
Advances on Dimension Reduction for Univariate and Multivariate Time SeriesMahappu Kankanamge, Tharindu Priyan De Alwis 01 August 2022 (has links) (PDF)
Advances in modern technologies have led to an abundance of high-dimensional time series data in many fields, including finance, economics, health, engineering, and meteorology, among others. This causes the “curse of dimensionality” problem in both univariate and multivariate time series data. The main objective of time series analysis is to make inferences about the conditional distributions. There are some methods in the literature to estimate the conditional mean and conditional variance functions in time series. However, most of those are inefficient, computationally intensive, or suffer from the overparameterization. We propose some dimension reduction techniques to address the curse of dimensionality in high-dimensional time series dataFor high-dimensional matrix-valued time series data, there are a limited number of methods in the literature that can preserve the matrix structure and reduce the number of parameters significantly (Samadi, 2014, Chen et al., 2021). However, those models cannot distinguish between relevant and irrelevant information and yet suffer from the overparameterization. We propose a novel dimension reduction technique for matrix-variate time series data called the "envelope matrix autoregressive model" (EMAR), which offers substantial dimension reduction and links the mean function and the covariance matrix of the model by using the minimal reducing subspace of the covariance matrix. The proposed model can identify and remove irrelevant information and can achieve substantial efficiency gains by significantly reducing the total number of parameters. We derive the asymptotic properties of the proposed maximum likelihood estimators of the EMAR model. Extensive simulation studies and a real data analysis are conducted to corroborate our theoretical results and to illustrate the finite sample performance of the proposed EMAR model.For univariate time series, we propose sufficient dimension reduction (SDR) methods based on some integral transformation approaches that can preserve sufficient information about the response. In particular, we use the Fourier and Convolution transformation methods (FM and CM) to perform sufficient dimension reduction in univariate time series and estimate the time series central subspace (TS-CS), the time series mean subspace (TS-CMS), and the time series variance subspace (TS-CVS). Using FM and CM procedures and with some distributional assumptions, we derive candidate matrices that can fully recover the TS-CS, TS-CMS, and TS-CVS, and propose an explicit estimate of the candidate matrices. The asymptotic properties of the proposed estimators are established under both normality and non-normality assumptions. Moreover, we develop some data-drive methods to estimate the dimension of the time series central subspaces as well as the lag order. Our simulation results and real data analyses reveal that the proposed methods are not only significantly more efficient and accurate but also offer substantial computational efficiency compared to the existing methods in the literature. Moreover, we develop an R package entitled “sdrt” to easily perform our program code in FM and CM procedures to estimate suffices dimension reduction subspaces in univariate time series.
|
5 |
Recovery and Analysis of Regulatory Networks from Expression Data Using Sums of Separable FunctionsBotts, Ryan T. 22 September 2010 (has links)
No description available.
|
6 |
Classification in High Dimensional Feature Spaces through Random Subspace EnsemblesPathical, Santhosh P. January 2010 (has links)
No description available.
|
7 |
Neue Indexingverfahren für die Ähnlichkeitssuche in metrischen Räumen über großen Datenmengen / New indexing techniques for similarity search in metric spacesGuhlemann, Steffen 06 July 2016 (has links) (PDF)
Ein zunehmend wichtiges Thema in der Informatik ist der Umgang mit Ähnlichkeit in einer großen Anzahl unterschiedlicher Domänen. Derzeit existiert keine universell verwendbare Infrastruktur für die Ähnlichkeitssuche in allgemeinen metrischen Räumen. Ziel der Arbeit ist es, die Grundlage für eine derartige Infrastruktur zu legen, die in klassische Datenbankmanagementsysteme integriert werden könnte.
Im Rahmen einer Analyse des State of the Art wird der M-Baum als am besten geeignete Basisstruktur identifiziert. Dieser wird anschließend zum EM-Baum erweitert, wobei strukturelle Kompatibilität mit dem M-Baum erhalten wird. Die Abfragealgorithmen werden im Hinblick auf eine Minimierung notwendiger Distanzberechnungen optimiert. Aufbauend auf einer mathematischen Analyse der Beziehung zwischen Baumstruktur und Abfrageaufwand werden Freiheitsgrade in Baumänderungsalgorithmen genutzt, um Bäume so zu konstruieren, dass Ähnlichkeitsanfragen mit einer minimalen Anzahl an Anfrageoperationen beantwortet werden können. / A topic of growing importance in computer science is the handling of similarity in multiple heterogenous domains. Currently there is no common infrastructure to support this for the general metric space. The goal of this work is lay the foundation for such an infrastructure, which could be integrated into classical data base management systems.
After some analysis of the state of the art the M-Tree is identified as most suitable base and enhanced in multiple ways to the EM-Tree retaining structural compatibility. The query algorithms are optimized to reduce the number of necessary distance calculations. On the basis of a mathematical analysis of the relation between the tree structure and the query performance degrees of freedom in the tree edit algorithms are used to build trees optimized for answering similarity queries using a minimal number of distance calculations.
|
8 |
Théorie de Perron-Frobenius non linéaire et méthodes numériques max-plus pour la résolution d'équations d'Hamilton-JacobiQu, Zheng 21 October 2013 (has links) (PDF)
Une approche fondamentale pour la résolution de problémes de contrôle optimal est basée sur le principe de programmation dynamique. Ce principe conduit aux équations d'Hamilton-Jacobi, qui peuvent être résolues numériquement par des méthodes classiques comme la méthode des différences finies, les méthodes semi-lagrangiennes, ou les schémas antidiffusifs. À cause de la discrétisation de l'espace d'état, la dimension des problèmes de contrôle pouvant être abordés par ces méthodes classiques est souvent limitée à 3 ou 4. Ce phénomène est appellé malédiction de la dimension. Cette thèse porte sur les méthodes numériques max-plus en contôle optimal deterministe et ses analyses de convergence. Nous étudions et developpons des méthodes numériques destinées à attenuer la malédiction de la dimension, pour lesquelles nous obtenons des estimations théoriques de complexité. Les preuves reposent sur des résultats de théorie de Perron-Frobenius non linéaire. En particulier, nous étudions les propriétés de contraction des opérateurs monotones et non expansifs, pour différentes métriques de Finsler sur un cône (métrique de Thompson, métrique projective d'Hilbert). Nous donnons par ailleurs une généralisation du "coefficient d'ergodicité de Dobrushin" à des opérateurs de Markov sur un cône général. Nous appliquons ces résultats aux systèmes de consensus ainsi qu'aux équations de Riccati généralisées apparaissant en contrôle stochastique.
|
9 |
Classification in high dimensional feature spaces / by H.O. van DykVan Dyk, Hendrik Oostewald January 2009 (has links)
In this dissertation we developed theoretical models to analyse Gaussian and multinomial distributions. The analysis is focused on classification in high dimensional feature spaces and provides a basis for dealing with issues such as data sparsity and feature selection (for Gaussian and multinomial distributions, two frequently used models for high dimensional applications). A Naïve Bayesian philosophy is followed to deal with issues associated with the curse of dimensionality. The core treatment on Gaussian and multinomial models consists of finding analytical expressions for classification error performances. Exact analytical expressions were found for calculating error rates of binary class systems with Gaussian features of arbitrary dimensionality and using any type of quadratic decision boundary (except for degenerate paraboloidal boundaries).
Similarly, computationally inexpensive (and approximate) analytical error rate expressions were derived for classifiers with multinomial models. Additional issues with regards to the curse of dimensionality that are specific to multinomial models (feature sparsity) were dealt with and tested on a text-based language identification problem for all eleven official languages of South Africa. / Thesis (M.Ing. (Computer Engineering))--North-West University, Potchefstroom Campus, 2009.
|
10 |
Classification in high dimensional feature spaces / by H.O. van DykVan Dyk, Hendrik Oostewald January 2009 (has links)
In this dissertation we developed theoretical models to analyse Gaussian and multinomial distributions. The analysis is focused on classification in high dimensional feature spaces and provides a basis for dealing with issues such as data sparsity and feature selection (for Gaussian and multinomial distributions, two frequently used models for high dimensional applications). A Naïve Bayesian philosophy is followed to deal with issues associated with the curse of dimensionality. The core treatment on Gaussian and multinomial models consists of finding analytical expressions for classification error performances. Exact analytical expressions were found for calculating error rates of binary class systems with Gaussian features of arbitrary dimensionality and using any type of quadratic decision boundary (except for degenerate paraboloidal boundaries).
Similarly, computationally inexpensive (and approximate) analytical error rate expressions were derived for classifiers with multinomial models. Additional issues with regards to the curse of dimensionality that are specific to multinomial models (feature sparsity) were dealt with and tested on a text-based language identification problem for all eleven official languages of South Africa. / Thesis (M.Ing. (Computer Engineering))--North-West University, Potchefstroom Campus, 2009.
|
Page generated in 0.0293 seconds