Global ETD Search

1	Robust algorithms for linear regression and locally linear embedding / Algoritmos robustos para regressão linear e locally linear embedding Rettes, Julio Alberto Sibaja January 2017 RETTES, Julio Alberto Sibaja. Robust algorithms for linear regression and locally linear embedding. 2017. 105 leaves. Dissertation (Master's in Computer Science), Universidade Federal do Ceará, Fortaleza, 2017. / Nowadays a very large quantity of data flows through our digital society, and there is growing interest in converting it into valuable and useful information. Machine learning plays an essential role in this transformation of data into knowledge. However, outliers are too likely to occur in the data for the importance of robust algorithms to be ignored. To understand this, various models of outliers are studied. In this work, several robust estimators within the generalized linear model framework for regression are discussed and analyzed: namely, the M-estimator, the S-estimator, the MM-estimator, RANSAC and the Theil-Sen estimator. This choice is motivated by the need to examine algorithms with different working principles. In particular, the M-, S- and MM-estimators are based on a modification of the least squares criterion, whereas RANSAC is based on finding the smallest subset of points that guarantees a predefined model accuracy. 
The Theil-Sen estimator, on the other hand, uses the median of least squares models as its estimate. The performance of the estimators under a wide range of experimental conditions is compared and analyzed. In addition to the linear regression problem, the dimensionality reduction problem is considered. More specifically, locally linear embedding (LLE), principal component analysis (PCA) and some robust variants of them are treated. To give some robustness to the LLE algorithm, the RALLE algorithm is proposed. Its main idea is to use neighborhoods of different sizes to construct the weights of the points; to achieve this, RAPCA is executed on each set of neighbors and the risky points are discarded from the corresponding neighborhood. The performance of LLE, RLLE and RALLE on several datasets is evaluated. Outliers Robust statistics Linear regression Dimensionality reduction Locally Linear Embedding
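As a rough illustration of the Theil-Sen idea named in the abstract above, the sketch below covers only the one-predictor case. There, the least squares fit through two points is simply the line joining them, so the estimate reduces to the median of pairwise slopes. The function name `theil_sen`, the median-of-offsets intercept and the sample data are assumptions made for illustration; none of them is taken from the dissertation.

```python
from itertools import combinations
from statistics import median


def theil_sen(xs, ys):
    """Return (slope, intercept) of a Theil-Sen line for one predictor."""
    points = list(zip(xs, ys))
    # Slope of the line through every pair of points with distinct x values.
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if x2 != x1
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    # One common choice: the intercept is the median of the offsets y - slope * x.
    intercept = median(y - slope * x for x, y in points)
    return slope, intercept


# Hypothetical data on the line y = 2x + 1, with the last point corrupted.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 100]
slope, intercept = theil_sen(xs, ys)
print(slope, intercept)
```

On this toy data the single outlier does not move the fit: six of the ten pairwise slopes equal 2, so the median slope is 2 and the median offset is 1, whereas an ordinary least squares fit would be pulled strongly toward the corrupted point.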
2	Locally linear embedding algorithm: extensions and applications Kayo, O. (Olga) 25 April 2006 Raw data sets taken with various capturing devices are usually multidimensional and need to be preprocessed before applying subsequent operations, such as clustering, classification, outlier detection and noise filtering. One of the steps of data preprocessing is dimensionality reduction. It has been developed with the aim of reducing or eliminating information of secondary importance while retaining or highlighting meaningful information. Since real-world data are often nonlinear, linear dimensionality reduction techniques, such as principal component analysis (PCA), fail to preserve the structure and relationships of a high-dimensional space when data are mapped into a low-dimensional space. Nonlinear dimensionality reduction methods are therefore in demand in this case. Among them is a method called locally linear embedding (LLE), which is the focus of this thesis. Its main attractive characteristics are that it has few free parameters to set and a non-iterative solution that avoids convergence to a local minimum. In this thesis, several extensions to the conventional LLE are proposed, which help to overcome some limitations of the algorithm. The study presents a comparison between LLE and three nonlinear dimensionality reduction techniques (isometric feature mapping (Isomap), self-organizing map (SOM) and fast manifold learning based on Riemannian normal coordinates (S-LogMap)) applied to manifold learning. This comparison is of interest, since all of the listed methods reduce high-dimensional data in different ways, and it is worth knowing in which cases a particular method outperforms the others. A number of applications of dimensionality reduction techniques exist in data mining. One of them is visualization of high-dimensional data sets. 
The main goal of data visualization is to find a one-, two- or three-dimensional descriptive data projection that captures and highlights important knowledge about the data while avoiding information loss. This process helps people to explore and understand the data structure, which facilitates the choice of a proper method for the data analysis, e.g., selecting a simple or a complex classifier. The application of LLE for visualization is described in this research. Dimensionality reduction is also commonly used to obtain a compact data representation before applying a classifier. In this case, the main goal is to obtain a low-dimensional data representation that possesses good class separability. For this purpose, a supervised variant of LLE (SLLE) is proposed in this thesis. classification clustering dimensionality reduction locally linear embedding visualization
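The non-iterative character of conventional LLE mentioned in the abstract above comes partly from its weight-construction step, which is a small constrained least squares problem per point: each point is reconstructed from its neighbors with weights that sum to one. The plain-Python sketch below follows the standard textbook formulation of that step only. The helper names `solve` and `lle_weights`, the `reg` regularization term and the toy points are assumptions for illustration, not the thesis's implementation.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def lle_weights(point, neighbors, reg=1e-3):
    """Weights that best reconstruct `point` from `neighbors`, summing to one."""
    k = len(neighbors)
    diffs = [[p - q for p, q in zip(point, nb)] for nb in neighbors]
    # Local Gram matrix C[j][l] = (point - n_j) . (point - n_l).
    gram = [[sum(a * b for a, b in zip(dj, dl)) for dl in diffs] for dj in diffs]
    # Assumed regularization: keeps the system solvable when the Gram matrix is singular.
    trace = sum(gram[i][i] for i in range(k))
    for i in range(k):
        gram[i][i] += reg * trace / k
    w = solve(gram, [1.0] * k)
    total = sum(w)
    return [w_i / total for w_i in w]


# Hypothetical example: a point lying between two neighbors on a line.
weights = lle_weights([1.0, 0.0], [[0.0, 0.0], [4.0, 0.0]])
print(weights)
```

The nearer neighbor receives the larger weight (roughly three quarters here). In full LLE these per-point weights are then fixed and a low-dimensional embedding is found from an eigenvalue problem, which this sketch does not cover.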
3	Realistic Motion Estimation Using Accelerometers Xie, Liguang 04 August 2009 A challenging goal for both the game industry and the computer graphics research community is the generation of 3D virtual avatars that automatically perform realistic human motions at high speed and low monetary cost. So far, full-body motion estimation at the complexity of human movement remains an important open problem. We propose a realistic motion estimation framework to control the animation of 3D avatars. Instead of relying on a motion capture device as the control signal, we use low-cost and ubiquitously available 3D accelerometer sensors. The framework is developed in a data-driven fashion and includes two phases: model learning from an existing high-quality motion database, and motion synthesis from the control signal. In the model learning phase, we build a high-quality motion model of lower complexity, learned from a large motion capture database. Then, taking the 3D accelerometer sensor signal as input, we synthesize high-quality motion from the learned motion model. In this thesis, we present two different techniques for each of model learning and motion synthesis. Linear and nonlinear dimensionality reduction techniques are applied to search for a proper low-dimensional representation of motion data. Two motion synthesis methods, interpolation and optimization, are compared using noisy 3D acceleration signals. We evaluate the results visually against the real video and quantitatively against the ground-truth motion. The system performs well, which makes it applicable to a wide range of interactive applications, such as character control in 3D virtual environments and occupational training. / Master of Science Locally linear embedding Performance animation Optimization Motion synthesis Accelerometers Interpolation
4	Dimension Reduction Techniques in Morphometrics Kratochvíl, Jakub January 2011 This thesis centers on dimensionality reduction and its use on landmark-type data, which are often used in anthropology and morphometrics. In particular, we focus on non-linear dimensionality reduction methods: locally linear embedding and multidimensional scaling. We introduce a new approach to dimensionality reduction called multipass dimensionality reduction and show that it improves the quality of classification and requires fewer dimensions for successful classification than the traditional single-pass methods.
5	A Contribution To Modern Data Reduction Techniques And Their Applications By Applied Mathematics And Statistical Learning Sakarya, Hatice 01 January 2010 High-dimensional data arise in fields ranging from digital image processing, gene expression microarrays and neuronal population activities to financial time series. Dimensionality reduction, that is, extracting low-dimensional structure from high-dimensional data, is a key problem in many areas such as information processing, machine learning, data mining, information retrieval and pattern recognition, where a number of data reduction techniques are found. In this thesis we give a survey of modern data reduction techniques, representing the state of the art in theory, methods and applications, and introduce the language of mathematics used there. This requires special care concerning questions such as how to understand discrete structures as manifolds, how to identify their structure in preparation for dimension reduction, and how to deal with the complexity of the algorithmic methods. Special emphasis is placed on the Principal Component Analysis, Locally Linear Embedding and Isomap algorithms. These algorithms are studied by a research group from Vilnius, Lithuania, by Zeev Volkovich from the Software Engineering Department, ORT Braude College of Engineering, Karmiel, and by others. The main purpose of this study is to compare the results of the three algorithms. In making the comparison we focus on the results and the running time. QA Mathematics 1-939
6	Multi-label classification on locally-linear data: Application to chemical toxicity prediction Yap, Xiu Huan 16 August 2021 No description available. Computer Science Toxicology Predictive Toxicology Multi-label Classification Locally-linear data Locality-sensitive deep learner attention
7	A New Hands-free Face to Face Video Communication Method : Profile based frontal face video reconstruction LI, Songyu January 2018 This thesis proposes a method to reconstruct a frontal facial video based on encoding done with the facial profile of another video sequence. The reconstructed facial video will have facial expression changes similar to the changes in the profile video. First, the profiles for both the reference video and the test video are captured by edge detection. Then, asymmetrical principal component analysis is used to model the correspondence between the profile and the frontal face. This allows encoding from a profile and decoding of the frontal face of another video. Another solution is to use dynamic time warping to match the profiles and select the best-matching corresponding frontal face frame for reconstruction. With this method, we can reconstruct the test frontal video so that its facial expressions change in a similar way to those in the reference video. To improve the quality of the resulting video, Local Linear Embedding is used to give it a smoother transition between frames. Other Engineering and Technologies not elsewhere specified
8	Linear and Nonlinear Dimensionality-Reduction-Based Surrogate Models for Real-Time Design Space Exploration of Structural Responses Bird, Gregory David 03 August 2020 Design space exploration (DSE) is a tool used to evaluate and compare designs as part of the design selection process. While evaluating every possible design in a design space is infeasible, understanding design behavior and response throughout the design space may be accomplished by evaluating a subset of designs and interpolating between them using surrogate models. Surrogate modeling is a technique that uses low-cost calculations to approximate the outcome of more computationally expensive calculations or analyses, such as finite element analysis (FEA). While surrogates make quick predictions, accuracy is not guaranteed and must be considered. This research addressed the need to improve the accuracy of surrogate predictions in order to improve DSE of structural responses. This was accomplished by performing comparative analyses of linear and nonlinear dimensionality-reduction-based radial basis function (RBF) surrogate models for emulating various FEA nodal results. A total of four dimensionality reduction methods were investigated, namely principal component analysis (PCA), kernel principal component analysis (KPCA), isometric feature mapping (ISOMAP), and locally linear embedding (LLE). These methods were used in conjunction with surrogate modeling to predict nodal stresses and coordinates of a compressor blade. The research showed that using an ISOMAP-based dual-RBF surrogate model for predicting nodal stresses decreased the estimated mean error of the surrogate by 35.7% compared to PCA. Using nonlinear dimensionality-reduction-based surrogates did not reduce surrogate error for predicting nodal coordinates. A new metric, the manifold distance ratio (MDR), was introduced to measure the nonlinearity of the data manifolds. 
When applied to the stress and coordinate data, the stress space was found to be more nonlinear than the coordinate space for this application. The upfront training cost of the nonlinear dimensionality-reduction-based surrogates was larger than that of their linear counterparts but small enough to remain feasible. After training, all the dual-RBF surrogates were capable of making real-time predictions. This same process was repeated for a separate application involving the nodal displacements of mode shapes obtained from a FEA modal analysis. The modal assurance criterion (MAC) calculation was used to compare the predicted mode shapes, as well as their corresponding true mode shapes obtained from FEA, to a set of reference modes. The research showed that two nonlinear techniques, namely LLE and KPCA, resulted in lower surrogate error in the more complex design spaces. Using a RBF kernel, KPCA achieved the largest average reduction in error of 13.57%. The results also showed that surrogate error was greatly affected by mode shape reversal. Four different approaches of identifying reversed mode shapes were explored, all of which resulted in varying amounts of surrogate error. Together, the methods explored in this research were shown to decrease surrogate error when performing DSE of a turbomachine compressor blade. As surrogate accuracy increases, so does the ability to correctly make engineering decisions and judgements throughout the design process. Ultimately, this will help engineers design better turbomachines. design space exploration surrogate modeling dimensionality reduction principal component analysis kernel principal component analysis isometric feature mapping locally linear embedding finite element analysis modal analysis modal assurance criterion turbomachinery compressor blades Engineering
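For readers unfamiliar with the modal assurance criterion (MAC) used in the abstract above to compare mode shapes, the sketch below shows its standard definition for real-valued mode shape vectors: the squared dot product divided by the product of the squared norms. The function name `mac` and the sample vectors are hypothetical, and this is not the author's code.

```python
def mac(phi_a, phi_b):
    """Modal assurance criterion between two real-valued mode shape vectors."""
    dot_ab = sum(a * b for a, b in zip(phi_a, phi_b))
    norm_a = sum(a * a for a in phi_a)
    norm_b = sum(b * b for b in phi_b)
    return dot_ab * dot_ab / (norm_a * norm_b)


# Hypothetical nodal displacements of a mode shape.
mode = [1.0, 2.0, 3.0]
reversed_mode = [-x for x in mode]

print(mac(mode, mode))             # identical shapes
print(mac(mode, reversed_mode))    # sign-reversed copy of the same shape
print(mac([1.0, 0.0], [0.0, 1.0])) # orthogonal shapes
```

Because the dot product is squared, a mode shape and its sign-reversed copy both score 1.0, so MAC on its own does not distinguish a reversed mode shape from the original, while orthogonal shapes score 0.0.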
9	Quantile regression in risk calibration Chao, Shih-Kang 05 June 2015 Quantile regression studies the conditional quantile function $Q_{Y|X}(\tau)$ of $Y$ given $X$ at level $\tau$, which satisfies $F_{Y|X}[Q_{Y|X}(\tau)] = \tau$ for all $\tau \in (0,1)$, where $F_{Y|X}$ is the conditional CDF of $Y$ given $X$. Quantile regression allows for a closer inspection of the conditional distribution beyond the conditional moments. This technique is particularly useful in, for example, the Value-at-Risk (VaR), which the Basel accords (2011) require all banks to report, or the "quantile treatment effect" and "conditional stochastic dominance (CSD)", which are economic concepts for measuring the effectiveness of a government policy or a medical treatment. 
Despite its wide applicability, developing quantile regression techniques is more challenging than mean regression. It is necessary to be adept with general regression problems and M-estimators; additionally, one needs to deal with non-smooth loss functions. In this dissertation, chapter 2 is devoted to empirical risk management during financial crises using quantile regression. Chapters 3 and 4 address the issue of high dimensionality and the nonparametric technique of quantile regression. Quantile regression M-estimators Confidence regions for quantile functions Factorizable multivariate models Ky Fan norm CoVaR Value-at-Risk Locally linear quantile regression Partially linear model 330 Economics 17 Economics QH 234 ddc:330
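The non-smooth loss mentioned in the abstract above is usually the check (pinball) loss, whose minimizer is a quantile rather than a mean. The sketch below illustrates this in the simplest unconditional setting, with no covariates, so it is not the conditional or nonparametric machinery of the dissertation. The names `pinball_loss` and `sample_quantile` and the data are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0]); it has a kink at u = 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def sample_quantile(ys, tau):
    """Return the sample value that minimizes the total check loss at level tau."""
    # The total loss is convex and piecewise linear in q, so a minimizer
    # can always be found among the observed values themselves.
    return min(ys, key=lambda q: sum(pinball_loss(y - q, tau) for y in ys))


# Hypothetical sample with one extreme observation.
ys = [1, 2, 3, 4, 100]
print(sample_quantile(ys, 0.5))   # median
print(sample_quantile(ys, 0.25))  # lower quartile
```

For this sample the minimizers are 3 at level 0.5 and 2 at level 0.25, and neither is affected by how extreme the value 100 is. In quantile regression the constant `q` is replaced by a function of the covariates and the same loss is minimized over that function's parameters.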
10	Categorical structural optimization : methods and applications / Optimisation structurelle catégorique : méthodes et applications Gao, Huanhuan 07 February 2019 The thesis presents methodological research on categorical structural optimization by means of manifold learning. 
The main difficulty in handling categorical optimization problems lies in the description of the categorical variables: they take values from a set of categories and have no natural ordering. The treatment of the design space is therefore a key issue. In this thesis, the non-ordinal categorical variables are treated as multi-dimensional discrete variables, so the dimensionality of the corresponding design space becomes high. In order to reduce the dimensionality, manifold learning techniques are introduced to find the intrinsic dimensionality and map the original design space to a reduced-order space. The mechanisms of both linear and non-linear manifold learning techniques are first studied. Numerical examples are then tested to compare the performance of these manifold learning techniques. It is found that PCA and MDS can only deal with linear or globally approximately linear cases. Isomap preserves geodesic distances on non-linear manifolds; however, it is the most time-consuming. LLE preserves the neighbour weights and can yield good results in a short time. KPCA works like a non-linear classifier, and we prove why it cannot preserve distances or angles in some cases. Based on the reduced-order representation obtained by Isomap, graph-based evolutionary crossover and mutation operators are proposed to deal with categorical structural optimization problems, including the design of dome, six-story rigid frame and dame-like structures. The results show that the proposed graph-based evolutionary approach constructed on the reduced-order space performs more efficiently than traditional methods, including the simplex approach and the evolutionary approach without a reduced-order space. In chapter 5, LLE is applied to reduce the data dimensionality, and a polynomial interpolation helps to construct the response surface from the lower-dimensional representation to the original data. 
Then the continuous search method of moving asymptotes is executed and yields a competitively good but inadmissible solution within only a few iterations. In the second stage, a discrete search strategy is proposed to find better solutions based on a neighbour search. The ten-bar truss and dome structural design problems are tested to show the validity of the method. Finally, this method is compared to the Simulated Annealing algorithm and the Covariance Matrix Adaptation Evolutionary Strategy, and shows better optimization efficiency. In chapter 6, in order to deal with the case in which the categorical design instances are distributed on several manifolds, we propose a k-manifolds learning method based on Weighted Principal Component Analysis. The obtained manifolds are then integrated in the lower-dimensional design space. Then the method introduced in chapter 4 is applied to solve the ten-bar truss, the dome and the dame-like structural design problems. Categorical optimization Structural optimization Manifold learning Dimensionality reduction Polynomial fitting Locally linear embedding Isomap K-manifolds learning Evolutionary methods Kernel functions Truss structure Weighted principal component analysis
