171

Inverse Problems of Deconvolution Applied in the Fields of Geosciences and Planetology / Problèmes inverses de déconvolution appliqués aux Géosciences et à la Planétologie

Meresescu, Alina-Georgiana 25 September 2018
The field of inverse problems lies at the border between applied mathematics and physics and brings together methods for solving mathematical optimization problems. For 1D deconvolution, the discipline provides a formalism for designing solutions within its two main approaches: regularization-based inverse problems and Bayesian inverse problems. Faced with the data deluge, geosciences and planetary sciences require increasingly complex algorithms to obtain pertinent information. In this thesis, we solve three constrained 1D deconvolution problems using the regularization-based inverse problem methodology: in hydrology, in seismology and in spectroscopy. For each of the three problems, we pose the direct problem and the inverse problem, and we propose a specific algorithm to reach the solution. We define the algorithms as well as the different strategies for determining the hyper-parameters. Furthermore, tests on synthetic data and on real data are presented and discussed from the point of view of the inverse problem formulation and of the application field. Finally, the proposed algorithms aim to make inverse problem methodology accessible to the Geoscience community.
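As a rough illustration of what regularization-based 1D deconvolution means in practice, here is a minimal sketch. It is not one of the thesis's algorithms (the abstract does not describe them): it assumes a known convolution kernel and plain Tikhonov (ridge) regularization, and the names `convolution_matrix`, `tikhonov_deconvolve` and the hyper-parameter `lam` are hypothetical.

```python
import numpy as np


def convolution_matrix(kernel, n):
    """Build H such that H @ x equals np.convolve(kernel, x) for len(x) == n."""
    kernel = np.asarray(kernel, dtype=float)
    m = len(kernel)
    H = np.zeros((n + m - 1, n))
    for j in range(n):
        # Column j holds the kernel shifted down by j samples.
        H[j:j + m, j] = kernel
    return H


def tikhonov_deconvolve(y, kernel, n, lam):
    """Closed-form minimizer of ||H x - y||^2 + lam * ||x||^2 (assumed model)."""
    H = convolution_matrix(kernel, n)
    return np.linalg.solve(H.T @ H + lam * np.eye(n), H.T @ y)
```

The direct model here is `y = H @ x`; the inverse step trades data fidelity against the penalty through `lam`, which is the kind of hyper-parameter the abstract refers to.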
172

Estimation non paramétrique de densités conditionnelles : grande dimension, parcimonie et algorithmes gloutons. / Nonparametric estimation of sparse conditional densities in moderately large dimensions by greedy algorithms.

Nguyen, Minh-Lien Jeanne 08 July 2019
We consider the problem of conditional density estimation in moderately large dimensions. Much more informative than regression functions, conditional densities are of major interest in recent methods, particularly in the Bayesian framework (studying the posterior distribution, finding its modes, etc.). After recalling the issues of estimation in high dimension in the introduction, the two following chapters develop two methods that address the curse of dimensionality by requiring the procedure to be computationally efficient through a greedy iterative scheme, to detect the relevant variables under a sparsity assumption, and to converge at a quasi-optimal minimax rate. More precisely, both methods consider kernel estimators well adapted to conditional density estimation and select a pointwise multivariate bandwidth by revisiting the greedy algorithm RODEO (Regularisation Of Derivative Expectation Operator). The first method has initialization problems and extra logarithmic factors in its convergence rate; the second method solves these problems while adding adaptation to the smoothness. In the penultimate chapter, we discuss the calibration and numerical performance of these two procedures, before giving some comments and perspectives in the last chapter.
173

Safe optimization algorithms for variable selection and hyperparameter tuning / Algorithmes d’optimisation sûrs pour la sélection de variables et le réglage d’hyperparamètre

Ndiaye, Eugene 04 October 2018
Massive and automatic data processing requires techniques able to filter out the most important information. Among these methods, those with sparse structures have been shown to improve the statistical and computational efficiency of estimators in high-dimensional settings. They can often be expressed as the solution of a regularized empirical risk minimization problem and generally lead to non-differentiable optimization problems in the form of a sum of a smooth term, measuring the quality of the fit, and a non-smooth term, penalizing complex solutions. Although it has considerable advantages, this way of including prior information introduces many numerical difficulties, both for solving the underlying optimization problem and for calibrating the level of regularization. Solving these issues has been at the heart of this thesis.
A recently introduced technique, called "Screening Rules", proposes to ignore some variables during the optimization process by exploiting the expected sparsity of the solutions. These elimination rules are said to be safe when the procedure guarantees that no variable is wrongly rejected. In this work, we propose a unified framework for identifying important structures in these convex optimization problems and we introduce the "Gap Safe Screening Rules". They yield significant gains in computation time thanks to the dimensionality reduction they induce. In addition, they can easily be inserted into iterative algorithms and apply to a larger number of problems than previous methods.
To find a good compromise between minimizing risk and introducing a learning bias, (exact) homotopy continuation algorithms offer the possibility of tracking the curve of solutions as a function of the regularization parameter. However, they exhibit numerical instabilities due to several matrix inversions and are often expensive in high dimension. Another weakness is that, in the worst case, their complexity is exponential in the dimension of the model parameter. Allowing approximate solutions makes it possible to circumvent these drawbacks by approximating the curve of solutions. In this thesis, we revisit approximation techniques for regularization paths given a predefined tolerance and we propose an in-depth analysis of their complexity with respect to the regularity of the loss functions involved. We then propose optimal algorithms as well as various strategies for exploring the parameter space. We also provide a calibration method (for the regularization parameter) that enjoys global convergence guarantees for the minimization of the empirical risk on the validation data.
Among sparse regularization methods, the Lasso is one of the most celebrated and studied. Its statistical theory suggests choosing the level of regularization according to the amount of variance in the observations, which is difficult to use in practice because the variance of the model is often an unknown quantity. In such cases, it is possible to jointly optimize the regression parameter and the noise level. These concomitant estimators, which appeared in the literature under the names Scaled Lasso and Square-Root Lasso, provide theoretical results as sharp as those of the Lasso while being independent of the actual noise level of the observations. Although they represent important advances, these methods are numerically unstable and the currently available algorithms are expensive in computation time. We illustrate these difficulties and propose modifications based on smoothing techniques to increase the stability of these estimators, as well as a faster algorithm to compute them.
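The safe screening idea described above can be sketched for the plain Lasso. This is a minimal illustration under stated assumptions, not the thesis's code: it assumes the objective 0.5 * ||y - X beta||^2 + lam * ||beta||_1, builds a dual-feasible point by rescaling the residual, and uses the duality gap to bound the distance to the dual optimum. The function name `gap_safe_screen` is hypothetical.

```python
import numpy as np


def gap_safe_screen(X, y, beta, lam):
    """Return a boolean mask of features that can be discarded safely.

    Assumed objective: 0.5 * ||y - X beta||^2 + lam * ||beta||_1.
    """
    residual = y - X @ beta
    # Rescale the residual so that ||X^T theta||_inf <= 1 (dual feasibility).
    theta = residual / max(lam, np.max(np.abs(X.T @ residual)))
    primal = 0.5 * residual @ residual + lam * np.sum(np.abs(beta))
    dual = 0.5 * y @ y - 0.5 * lam**2 * np.sum((theta - y / lam) ** 2)
    gap = max(primal - dual, 0.0)
    # The dual optimum lies in a ball of this radius around theta.
    radius = np.sqrt(2.0 * gap) / lam
    # Feature j is provably inactive if |x_j^T theta| + radius * ||x_j|| < 1.
    scores = np.abs(X.T @ theta) + radius * np.linalg.norm(X, axis=0)
    return scores < 1.0
```

Because the radius shrinks with the duality gap, calling such a test repeatedly inside an iterative solver discards more features as the iterate approaches the solution.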
174

Mitigation of Data Scarcity Issues for Semantic Classification in a Virtual Patient Dialogue Agent

Stiff, Adam January 2020
No description available.
175

Tackling the Communication Bottlenecks of Distributed Deep Learning Training Workloads

Ho, Chen-Yu 08 1900
Deep Neural Networks (DNNs) find widespread applications across various domains, including computer vision, recommendation systems, and natural language processing. Despite their versatility, training DNNs can be a time-consuming process, and accommodating large models and datasets on a single machine is often impractical. To tackle these challenges, distributed deep learning (DDL) training workloads have gained increasing significance. However, DDL training introduces synchronization requirements among nodes, and the mini-batch stochastic gradient descent algorithm heavily burdens network connections. This dissertation proposes, analyzes, and evaluates three solutions addressing the communication bottleneck in DDL training workloads. The first solution, SwitchML, introduces an in-network aggregation (INA) primitive that accelerates DDL workloads. By aggregating model updates from multiple workers within the network, SwitchML reduces the volume of exchanged data. This approach, which incorporates switch processing, end-host protocols, and Deep Learning frameworks, enhances training speed by up to 5.5 times for real-world benchmark models. The second solution, OmniReduce, is an efficient streaming aggregation system designed for sparse collective communication. It optimizes performance for parallel computing applications, such as distributed training of large-scale recommendation systems and natural language processing models. OmniReduce achieves maximum effective bandwidth utilization by transmitting only nonzero data blocks and leveraging fine-grained parallelization and pipelining. OmniReduce outperforms state-of-the-art TCP/IP and RDMA network solutions by 3.5 to 16 times, delivering significantly better performance for network-bottlenecked DNNs, even at 100 Gbps. The third solution, CoInNetFlow, addresses congestion in shared data centers, where multiple DNN training jobs compete for bandwidth on the same node. 
The study explores the feasibility of coflow scheduling methods in hierarchical and multi-tenant in-network aggregation communication patterns. CoInNetFlow presents an innovative utilization of the Sincronia priority assignment algorithm. Through packet-level DDL job simulation, the research demonstrates that appropriate weighting functions, transport layer priority scheduling, and gradient compression on low-priority tensors can significantly improve the median Job Completion Time Inflation by over 70%. Collectively, this dissertation contributes to mitigating the network communication bottleneck in distributed deep learning. The proposed solutions can enhance the efficiency and speed of distributed deep learning systems, ultimately improving the performance of DNN training across various domains.
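The idea of transmitting only nonzero data blocks and summing updates at an aggregator can be illustrated in a few lines. This is a toy, single-process sketch and not the OmniReduce or SwitchML protocol: the block size, the dictionary message format and the function names are all assumptions made for illustration.

```python
import numpy as np


def nonzero_blocks(grad, block_size):
    """Return {block_index: block} for blocks with at least one nonzero value."""
    blocks = {}
    for start in range(0, len(grad), block_size):
        block = grad[start:start + block_size]
        if np.any(block != 0):
            blocks[start // block_size] = block
    return blocks


def aggregate(worker_messages, length, block_size):
    """Sum the sparse block messages of all workers into one dense vector."""
    total = np.zeros(length)
    for message in worker_messages:
        for index, block in message.items():
            start = index * block_size
            total[start:start + len(block)] += block
    return total
```

With sparse gradients, each worker sends only a fraction of its blocks, which is where the bandwidth saving described in the abstract would come from.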
176

Enhancing Simulated Sonar Images With CycleGAN for Deep Learning in Autonomous Underwater Vehicles / Deep learning, machine learning, sonar, simulation, GAN, cycleGAN, YOLO-v4, sparse data, uncertainty analysis

Norén, Aron January 2021
This thesis addresses the issue of data sparsity in the sonar domain. A data pipeline is set up to generate and enhance sonar data. The possibilities and limitations of using cycleGAN as a tool to enhance simulated sonar images for the purpose of training neural networks for detection and classification are studied. A neural network is trained on the enhanced simulated sonar images and tested on real sonar images to evaluate the quality of these images. The novelty of this work lies in extending previous methods to a more general framework and showing that GAN-enhanced simulations work for complex tasks on field data. Using real sonar images to enhance the simulated images resulted in improved classification compared to a classifier trained solely on simulated images.
177

Image and Video Resolution Enhancement Using Sparsity Constraints and Bilateral Total Variation Filter

Ashouri, Talouki Zahra 10 1900
In this thesis we present new methods for image and video super-resolution and video deinterlacing. For image super-resolution, we introduce a new approach for finding a High Resolution (HR) image from a single Low Resolution (LR) image by employing Compressive Sensing (CS) theory. In the CS framework, images are assumed to be sparse in a transform domain such as wavelets or contourlets. Using this fact, we have developed an approach in which the contourlet domain is taken as the transform domain and a CS algorithm is used to find the high-resolution image. We then extend our image super-resolution scheme to video super-resolution. Our video super-resolution method has two steps. The first step applies our image super-resolution method to each frame separately. A post-processing step is then performed on the estimated outputs to increase the video quality; it consists of deblurring and Bilateral Total Variation (BTV) filtering to increase video consistency. Experimental results show significant improvement over existing image and video super-resolution methods, both objectively and subjectively.
For the video deinterlacing problem, we propose a method that is also a two-step approach. First, six interpolators are applied to each missing line and the interpolator that gives the minimum error is selected. An initial deinterlaced frame is constructed using the selected interpolator. In the next step, this initial deinterlaced frame is fed into a post-processing step, which is a modified version of the 2-D Bilateral Total Variation filter. The proposed deinterlacing technique outperforms many existing deinterlacing algorithms. / Master of Science (MSc)
178

Estimation and Determination of Carrying Capacity in Loblolly Pine

Yang, Sheng-I 27 May 2016
Stand carrying capacity is the maximum population size of a species under given environmental conditions. Site resources limit the maximum volume or biomass that can be sustained in forest stands. This study aimed to estimate and determine the carrying capacity of loblolly pine. The maximum stand basal area (BA) that can be sustained over a long period of time can be regarded as a measure of carrying capacity. To quantify and project stand BA carrying capacity, one approach is to use the estimate from a fitted cumulative BA-age equation; another is to obtain BA estimates implied by maximum size-density relationships (MSDRs), denoted implied maximum stand BA. The efficacy of three diameter-based MSDR measures (Reineke's self-thinning rule, the competition-density rule and Nilson's sparsity index) was evaluated. Estimates from the three MSDR measures were compared with estimates from the Chapman-Richards (C-R) equation fitted to the maximum stand BA observed on plots from spacing trials. The spacing trials, established in two physiographic regions (Piedmont and Coastal Plain) and at two different scales (operational and miniature), were examined and compared, providing a sound empirical basis for evaluating potential carrying capacity. Results showed that stands with high initial planting density approached the stand BA carrying capacity sooner than stands with lower initial planting density. The maximum stand BA associated with planting density developed similarly at the two scales. The potential carrying capacity in the two physiographic regions was significantly different. The values of implied maximum stand BA converted from the three diameter-based MSDR measures were similar to the maximum stand BA curve obtained from the C-R equation. Nilson's sparsity index was the most stable and reliable estimate of stand BA carrying capacity. 
The flexibility of Nilson's sparsity index can illustrate the effect of physiographic regions on stand BA carrying capacity. Because some uncontrollable factors in long-term operational experiments can make estimates of stand BA carrying capacity unreliable for loblolly pine, it is suggested that the stand BA carrying capacity be estimated from stands with high initial planting density over a relatively short period of time, so that the risk of damage and the costs of experiments are reduced. For estimating carrying capacity, another attractive option is a miniature-scale trial (microcosm), because it shortens the experiment time and greatly reduces costs. / Master of Science
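To make the notion of an "implied maximum stand BA" concrete, the sketch below converts a maximum size-density line into basal area and evaluates a Chapman-Richards curve. It is an illustration only: it assumes metric units (trees per hectare, quadratic mean diameter in cm), the classical Reineke slope of -1.605, and hypothetical parameter names; none of the fitted values from the thesis are reproduced here.

```python
import math


def reineke_max_density(dq_cm, intercept, slope=-1.605):
    """Maximum trees per hectare from ln N = intercept + slope * ln Dq."""
    return math.exp(intercept + slope * math.log(dq_cm))


def implied_basal_area(n_per_ha, dq_cm):
    """Stand basal area (m^2/ha) from density and quadratic mean diameter (cm)."""
    return n_per_ha * math.pi * dq_cm**2 / 40000.0


def chapman_richards(age, asymptote, rate, shape):
    """Chapman-Richards growth curve: asymptote * (1 - exp(-rate * age)) ** shape."""
    return asymptote * (1.0 - math.exp(-rate * age)) ** shape
```

Feeding `reineke_max_density(dq, intercept)` into `implied_basal_area` gives the BA implied by the self-thinning line at diameter `dq`, which can then be compared with the asymptote of a fitted Chapman-Richards curve.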
179

Addressing Challenges in Graphical Models: MAP estimation, Evidence, Non-Normality, and Subject-Specific Inference

Sagar K N Ksheera (15295831) 17 April 2023
Graphs are a natural choice for understanding the associations between variables, and assuming a probabilistic embedding for the graph structure leads to a variety of graphical models that enable us to understand these associations even further. In the realm of high-dimensional data, where the number of associations between interacting variables is far greater than the available number of data points, the goal is to infer a sparse graph. In this thesis, we make contributions in the domain of Bayesian graphical models, where our prior belief on the graph structure, encoded via uncertainty on the model parameters, enables the estimation of sparse graphs.
We begin with the Gaussian Graphical Model (GGM) in Chapter 2, one of the simplest and most famous graphical models, where the joint distribution of interacting variables is assumed to be Gaussian. In GGMs, the conditional independence among variables is encoded in the inverse of the covariance matrix, also known as the precision matrix. Under a Bayesian framework, we propose a novel prior-penalty dual, called the 'graphical horseshoe-like' prior and penalty, to estimate the precision matrix. We also establish the posterior convergence of the precision matrix estimate and the frequentist consistency of the maximum a posteriori (MAP) estimator.
In Chapter 3, we develop a general framework based on local linear approximation for MAP estimation of the precision matrix in GGMs. This general framework holds true for any graphical prior where the element-wise priors can be written as a Laplace scale mixture. As an application of the framework, we perform MAP estimation of the precision matrix under the graphical horseshoe penalty.
In Chapter 4, we focus on graphical models where the joint distribution of interacting variables cannot be assumed Gaussian. Motivated by quantile graphical models, where the Gaussian likelihood assumption is relaxed, we draw inspiration from the domain of precision medicine, where personalized inference is crucial to tailor individual-specific treatment plans. With the aim of inferring Directed Acyclic Graphs (DAGs), we propose a novel quantile DAG learning framework, where the DAGs depend on individual-specific covariates, making personalized inference possible. We demonstrate the potential of this framework in the regime of precision medicine by applying it to infer protein-protein interaction networks in lung adenocarcinoma and lung squamous cell carcinoma.
Finally, we conclude this thesis in Chapter 5 by developing a novel framework to compute the marginal likelihood in a GGM, addressing a longstanding open problem. Under this framework, we can compute the marginal likelihood for a broad class of priors on the precision matrix, where the element-wise priors on the diagonal entries can be written as gamma or scale mixtures of gamma random variables and those on the off-diagonal terms can be represented as normal or scale mixtures of normal. This result paves the way for model selection using Bayes factors and tuning of prior hyper-parameters.
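The statement that conditional independence in a GGM is encoded in the precision matrix can be checked numerically. The sketch below is a generic illustration, not the thesis's estimator: it reads the graph off a given precision matrix and converts its entries to partial correlations via rho_ij = -Omega_ij / sqrt(Omega_ii * Omega_jj). The function names and the tolerance `tol` are assumptions.

```python
import numpy as np


def partial_correlations(precision):
    """Partial correlation matrix: rho_ij = -Omega_ij / sqrt(Omega_ii * Omega_jj)."""
    d = np.sqrt(np.diag(precision))
    pc = -precision / np.outer(d, d)
    np.fill_diagonal(pc, 1.0)
    return pc


def graph_edges(precision, tol=1e-8):
    """Edges (i, j), i < j, where the precision entry is nonzero beyond tol."""
    p = precision.shape[0]
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(precision[i, j]) > tol]
```

For a chain-structured precision matrix, the zero in position (0, 2) means variables 0 and 2 are conditionally independent given variable 1, so the inferred graph has no edge between them.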
180

A Signal Processing Approach to Voltage-Sensitive Dye Optical Imaging / Une approche mathématique de l'imagerie optique par colorant potentiométrique

Raguet, Hugo 22 September 2014
Voltage-sensitive dye optical imaging is a promising recording modality for cortical activity, but its practical potential is limited by many artefacts and interferences in the acquisitions. Inspired by existing models in the literature, we propose a generative model of the signal based on an additive mixture of components, each constrained within a union of linear spaces determined by its biophysical origin. Motivated by the resulting component separation problem, which is an underdetermined linear inverse problem, we develop: (1) convex, spatially structured regularizations, enforcing in particular sparsity of the solutions; (2) a new first-order proximal algorithm for efficiently minimizing the resulting functional; (3) statistical methods for automatic parameter selection, based on Stein's unbiased risk estimate. We study these methods in a general framework and discuss their potential applications in various fields of applied mathematics, in particular for large-scale inverse problems or regressions. We subsequently develop software for noisy component separation, in an integrated environment adapted to voltage-sensitive dye optical imaging. Finally, we evaluate this software on different data sets, including synthetic and real data, showing encouraging perspectives for the observation of complex cortical dynamics.
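For readers unfamiliar with first-order proximal methods, the following is a minimal sketch of the standard proximal gradient iteration (ISTA) for an l1-regularized least-squares problem. It is a textbook baseline, not the new algorithm proposed in the thesis; the objective 0.5 * ||A x - y||^2 + lam * ||x||_1, the iteration count and the function names are assumptions.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def ista(A, y, lam, n_iter=500):
    """Proximal gradient descent for 0.5 * ||A x - y||^2 + lam * ||x||_1."""
    # Step size 1 / L, with L the squared spectral norm of A.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)  # gradient of the smooth data-fit term
        x = soft_threshold(x - step * grad, step * lam)
    return x
```

Each iteration alternates a gradient step on the smooth term with the proximal operator of the non-smooth penalty, which is what produces exactly zero coefficients in the solution.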
