1. Improving Stability and Parameter Selection of Data Processing Programs

Wen-Chuan Lee (8206287), 07 January 2020
Data-processing programs are becoming increasingly important in the Big-data era. However, two notable problems with these programs may cause sub-optimal data-processing results. On one hand, these programs contain a large number of floating-point computations. Due to the limited precision of floating-point representations, errors are introduced, propagated, and accumulated in a series of computations, making the computation results unreliable. We call this problem floating-point instability. On the other hand, these programs are heavily parameterized. Since no universally optimal parameter configuration exists for all possible inputs, program parameters should be carefully chosen and tuned for each input; otherwise, the result will be sub-optimal. Manual tuning is infeasible because the number of parameters and the range of each parameter's values may be large.

We address these two challenges in this dissertation. For the floating-point instability problem, we develop a novel runtime technique to capture different output variations in the presence of instability. Its key idea is to transform every floating-point value into a vector of multiple values: the values added to create the vector are obtained by introducing artificial errors that are upper bounds of the actual errors. The propagation of the artificial errors models the propagation of the actual errors. When values in a vector lead to discrete execution differences (e.g., following different paths), the execution is forked to capture the resulting output variations.

For parameterized data-processing programs, we develop a white-box program tuning framework to tune the program parameter configuration for the optimal data-processing result on each program input. To further reduce the parameter configuration overhead, we propose the first general framework to inject artificial intelligence (AI) into the program, so that the intelligent program can predict the parameter configuration for each incoming input directly. However, as in many other ML/AI applications, the crucial challenge lies in feature selection, i.e., selecting the feature variables for predicting the target parameter specified by the user. We therefore propose a novel approach that combines program analysis and statistical analysis for better selection of program feature variables, which in turn helps predict the target parameter better and improves the result.
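To make the vector-of-values idea concrete, here is a minimal Python sketch (an illustration, not the dissertation's implementation): each ShadowFloat carries the nominal value together with variants perturbed by artificial errors, arithmetic propagates all components in lockstep, and a comparison whose outcome differs across components flags a branch where instability causes discrete divergence. For simplicity the sketch injects random-signed perturbations within an assumed bound EPS and merely warns instead of forking the execution.

```python
import random

random.seed(0)  # reproducible demo
EPS = 1e-7      # assumed artificial relative error bound per operation
NVAR = 4        # number of perturbed variants besides the nominal value

def _perturb(x):
    # inject an artificial error with random sign, bounded by EPS
    return x * (1 + EPS * random.choice((-1, 1)))

class ShadowFloat:
    """One float carried as a vector: nominal value plus perturbed variants."""
    def __init__(self, value, vec=None):
        self.vec = vec if vec is not None else [value] + [
            _perturb(value) for _ in range(NVAR)]

    def _binop(self, other, op):
        res = [op(a, b) for a, b in zip(self.vec, other.vec)]
        # model this operation's round-off by injecting fresh artificial
        # error into the variants; the nominal slot stays the actual result
        return ShadowFloat(None, [res[0]] + [_perturb(r) for r in res[1:]])

    def __add__(self, other): return self._binop(other, lambda a, b: a + b)
    def __sub__(self, other): return self._binop(other, lambda a, b: a - b)
    def __mul__(self, other): return self._binop(other, lambda a, b: a * b)

    def __lt__(self, other):
        outcomes = [a < b for a, b in zip(self.vec, other.vec)]
        if len(set(outcomes)) > 1:
            print("unstable branch: a full system would fork execution here")
        return outcomes[0]

# Catastrophic cancellation: the difference is at the scale of the error,
# so the variants typically disagree about the branch outcome.
a, b = ShadowFloat(1.0), ShadowFloat(1.0 - 1e-9)
print((a - b) < ShadowFloat(1.5e-9))
```
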
2. Regularization properties of the discrepancy principle for Tikhonov regularization in Banach spaces

Anzengruber, Stephan W., Hofmann, Bernd, Mathé, Peter, 11 December 2012
The stable solution of ill-posed non-linear operator equations in Banach space requires regularization. One important approach is based on Tikhonov regularization, in which case a one-parameter family of regularized solutions is obtained, and it is crucial to choose the parameter appropriately. Here, a variant of the discrepancy principle is analyzed. In many cases such a parameter choice exhibits the feature, called the regularization property below, that the chosen parameter tends to zero as the noise level tends to zero, but more slowly than the noise level. We show this regularization property under two natural assumptions: first, exact penalization must be excluded, and second, the discrepancy principle must stop after a finite number of iterations. We conclude this study with a discussion of some consequences for convergence rates obtained by the discrepancy principle under the validity of a variational inequality, a recent tool for the analysis of inverse problems.
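For orientation, the setting behind this abstract can be summarised in standard notation (a generic textbook formulation with assumed constants, not quoted from the paper):

```latex
% Tikhonov regularization for the operator equation F(x) = y, given noisy
% data y^\delta with \|y - y^\delta\| \le \delta, penalty \Omega, and
% regularization parameter \alpha > 0:
\[
  x_\alpha^\delta \in \operatorname*{arg\,min}_{x}\;
    \frac{1}{p}\,\bigl\lVert F(x) - y^\delta \bigr\rVert^{p}
    + \alpha\,\Omega(x), \qquad p \ge 1.
\]
% Discrepancy principle: fix constants 1 \le \tau_1 \le \tau_2 and choose
% \alpha = \alpha(\delta) such that
\[
  \tau_1 \delta \;\le\; \bigl\lVert F(x_\alpha^\delta) - y^\delta \bigr\rVert
    \;\le\; \tau_2 \delta.
\]
% Regularization property (as described above): \alpha(\delta) \to 0 as
% \delta \to 0, while \delta / \alpha(\delta) \to 0, i.e. the parameter
% decays more slowly than the noise level.
```
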
3. Extending covariance structure analysis for multivariate and functional data

Sheppard, Therese, January 2010
For multivariate data, when testing the homogeneity of covariance matrices arising from two or more groups, Bartlett's (1937) modified likelihood ratio test statistic is appropriate, but its null distribution rests on the restrictive assumption of normality. Zhang and Boos (1992) provide a pooled bootstrap approach for when the data cannot be assumed to be normally distributed. We give three alternative bootstrap techniques for testing homogeneity of covariance matrices when the data are not normally distributed and when it is inappropriate to pool the data into one single population, as in the pooled bootstrap procedure. We further show that our alternative bootstrap methodology can be extended to testing Flury's (1988) hierarchy of covariance structure models. Where deviations from normality exist, we show by simulation that the normal-theory log-likelihood ratio test statistic is less viable than our bootstrap methodology.

For functional data, Ramsay and Silverman (2005) and Lee et al. (2002) together provide four computational techniques for functional principal component analysis (PCA) followed by covariance structure estimation. When individual profiles are smoothed using least-squares cubic B-splines or regression splines, we find that the ensuing covariance matrix estimate suffers from a loss of dimensionality. We show that ridge regression can resolve this problem, but only for the discretisation and numerical quadrature approaches to estimation, and that the choice of a suitable ridge parameter is not arbitrary. We further show the unsuitability of regression splines when deciding on the optimal degree of smoothing to apply to individual profiles. To gain insight into smoothing parameter choice for functional data, we compare kernel and spline approaches to smoothing individual profiles in a nonparametric regression context. Our simulation results justify a kernel approach using a new criterion based on predicted squared error. We also show by simulation that, when correlation is taken into account, a kernel approach using a generalized cross-validatory type criterion performs well. These data-based methods for selecting the smoothing parameter are illustrated prior to a functional PCA on a real data set.
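As a concrete illustration of a bootstrap test of covariance homogeneity that avoids pooling, here is a minimal numpy sketch of one plausible scheme (hypothetical, not necessarily any of the thesis's three): each centred group is transformed so that its sample covariance equals the pooled estimate, which imposes the null hypothesis, and resampling is then carried out within each group separately. It assumes each group has more observations than variables, so all sample covariances are nonsingular.

```python
import numpy as np

def bartlett_stat(groups):
    """Bartlett's (1937) modified likelihood-ratio statistic for
    H0: equal covariance matrices across k groups."""
    ns = [g.shape[0] for g in groups]
    covs = [np.cov(g, rowvar=False) for g in groups]
    n, k = sum(ns), len(groups)
    pooled = sum((m - 1) * S for m, S in zip(ns, covs)) / (n - k)
    stat = (n - k) * np.linalg.slogdet(pooled)[1]
    stat -= sum((m - 1) * np.linalg.slogdet(S)[1] for m, S in zip(ns, covs))
    return stat

def _map_to(S_from, S_to):
    """Matrix A with A S_from A' = S_to, via symmetric square roots."""
    def halves(S):
        w, V = np.linalg.eigh(S)
        return V * np.sqrt(w) @ V.T, V / np.sqrt(w) @ V.T  # S^1/2, S^-1/2
    to_half, _ = halves(S_to)
    _, from_invhalf = halves(S_from)
    return to_half @ from_invhalf

def groupwise_bootstrap_pvalue(groups, n_boot=999, seed=0):
    """Impose H0 by mapping each centred group to the pooled covariance,
    then resample *within* each group (no pooling of the samples)."""
    rng = np.random.default_rng(seed)
    obs = bartlett_stat(groups)
    ns = [g.shape[0] for g in groups]
    covs = [np.cov(g, rowvar=False) for g in groups]
    n, k = sum(ns), len(groups)
    pooled = sum((m - 1) * S for m, S in zip(ns, covs)) / (n - k)
    null_groups = [(g - g.mean(0)) @ _map_to(S, pooled).T
                   for g, S in zip(groups, covs)]
    count = 0
    for _ in range(n_boot):
        resampled = [g[rng.integers(0, len(g), len(g))] for g in null_groups]
        count += bartlett_stat(resampled) >= obs
    return (1 + count) / (1 + n_boot)
```
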
4. Optimization framework for large-scale sparse blind source separation

Kervazo, Christophe, 04 October 2019
During the last decades, Blind Source Separation (BSS) has become a key analysis tool for multi-valued data. The objective of this thesis is to study the large-scale setting, in which most classical algorithms suffer degraded performance. The work is organised in four parts, each addressing one aspect of the large-scale sparse BSS problem: i) the introduction of robust, mathematically well-founded sparse BSS algorithms that require only a single run despite a delicate hyper-parameter choice; ii) a method that maintains high separation quality even when a large number of sources must be estimated; iii) the modification of a classical sparse BSS algorithm to handle large-scale datasets; and iv) an extension to the non-linear sparse BSS problem. The proposed methods are extensively tested, on both simulated and realistic data, to demonstrate their quality, and detailed interpretations of the results are given.
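For context, classical sparse BSS algorithms in the GMCA family alternate between a sparsity-promoting update of the sources and a least-squares update of the mixing matrix. The following is a minimal numpy sketch under simplifying assumptions (sources sparse in the direct domain, a linearly decreasing threshold); it illustrates the kind of baseline the thesis scales up, not the thesis's own algorithms.

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_bss(X, n_sources, n_iter=100, seed=0):
    """GMCA-style alternating scheme for X ~ A @ S with sparse S."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((X.shape[0], n_sources))
    A /= np.linalg.norm(A, axis=0)                    # unit columns
    for it in range(n_iter):
        # update S: least squares followed by a sparsity threshold that
        # decreases over iterations (real schedules are subtler)
        S = np.linalg.lstsq(A, X, rcond=None)[0]
        lam = np.percentile(np.abs(S), 99) * (1 - it / n_iter)
        S = soft(S, lam)
        # update A: least squares on the transposed problem, renormalise
        active = np.linalg.norm(S, axis=1) > 0
        if active.any():
            A[:, active] = np.linalg.lstsq(S[active].T, X.T,
                                           rcond=None)[0].T
            A /= np.maximum(np.linalg.norm(A, axis=0), 1e-12)
    return A, S

# toy usage: 4 observations of 2 sparse sources
rng = np.random.default_rng(1)
S_true = rng.standard_normal((2, 1000)) * (rng.random((2, 1000)) < 0.05)
A_true = rng.standard_normal((4, 2))
X = A_true @ S_true + 0.01 * rng.standard_normal((4, 1000))
A_est, S_est = sparse_bss(X, n_sources=2)
```
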
