  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

Discrete-time Concurrent Learning for System Identification and Applications: Leveraging Memory Usage for Good Learning

Djaneye-Boundjou, Ouboti Seydou Eyanaa January 2017
No description available.
82

Feasibility Study of Implementation of Machine Learning Models on Card Transactions / Genomförbarhetsstudie på Implementering av Maskininlärningsmodeller på Korttransaktioner

Alzghaier, Samhar, Can Kaya, Mervan January 2022
Machine learning has been studied extensively and applied to a wide range of fields. However, a thorough feasibility study of machine learning classifier algorithms in the payment processing industry has not yet been carried out. Here, we construct a rule-based response vector and combine it with a multitude of varying feature vectors across different machine learning classifier algorithms to determine whether individual transactions can be considered profitable from a business point of view. The algorithms are Naive Bayes, AdaBoost, Stochastic Gradient Descent, K-Nearest Neighbors, Decision Trees and Random Forests. Together they helped us build a high-performing model that serves both as a robust confirmation of the benefits of machine learning algorithms in the payment processing industry and as a theoretical guide to implementing them. The results firmly confirm the benefits of data-intensive models, even in industries as complex as the one Swedbank Pay operates in. These implications help boost innovation and revenue, as they offer a better understanding of the current pricing mechanisms.
83

ONLINE STATISTICAL INFERENCE FOR LOW-RANK REINFORCEMENT LEARNING

Qiyu Han (18284758) 01 April 2024
We propose a fully online procedure for statistical inference with adaptively collected data. The low-rank structure of the model parameter and the adaptive nature of the data collection process make this task challenging: standard low-rank estimators are biased and cannot be obtained sequentially, while existing inference approaches for sequential decision-making algorithms fail to account for the low-rankness and are also biased. To tackle these challenges, we first develop an online low-rank estimation process that employs stochastic gradient descent with noisy observations. Then, to facilitate statistical inference with the online low-rank estimator, we introduce a novel online debiasing technique designed to address both sources of bias simultaneously. This method yields an unbiased estimator suitable for parameter inference. Finally, we develop an inferential framework capable of establishing an online estimator for performing inference on the optimal policy value. In theory, we establish the asymptotic normality of the proposed online debiased estimators and prove the validity of the constructed confidence intervals for both inference tasks. Our inference results build on a newly developed low-rank stochastic gradient descent estimator and its non-asymptotic convergence result, which is also of independent interest.
84

Apprentissage de circuits quantiques par descente de gradient classique

Lamarre, Aldo 07 1900
We present a new learning algorithm for quantum circuits based on classical gradient descent. Since this subject unifies two areas of research, we explain each field for people working in the other. Consequently, we begin by introducing quantum computing and quantum circuits to machine learning specialists, followed by an introduction to machine learning algorithms for quantum computing specialists. To give context and motivate our results, we then give a light literature review of quantum machine learning. After this, we present our model, its algorithm and its variants, and discuss the empirical results achieved so far. Finally, we critique our implementation by presenting extensions and possible new approaches. These last two parts contain our main results; they are chapters 4 and 5 of this thesis, respectively. The code for the algorithm and the experiments created for this thesis can be found on GitHub at https://github.com/AldoLamarre/quantumcircuitlearning.
85

Recommender System for Gym Customers

Sundaramurthy, Roshni January 2020
Recommender systems provide new opportunities for retrieving personalized information on the Internet. Due to the availability of big data, the fitness industry is now focusing on building efficient recommender systems for its end-users. This thesis investigates the possibilities of building an efficient recommender system for gym users. BRP Systems AB has provided the gym data for evaluation; it consists of approximately 896,000 customer interactions with 8 features. Four matrix factorization methods based on implicit feedback are applied to the data: latent semantic analysis using singular value decomposition, alternating least squares, Bayesian personalized ranking, and logistic matrix factorization. These methods decompose the implicit data matrix of user-gym group activity interactions into the product of two lower-dimensional matrices. The factors are used to score the similarity between users and activities, and the top-k recommendations are provided based on that score. The methods are evaluated with ranking metrics: Precision@k, Mean average precision (MAP) @k, Area under the curve (AUC) score, and Normalized discounted cumulative gain (NDCG) @k. A qualitative analysis is also performed to evaluate the recommendations. For this specific dataset, the best method is found to be alternating least squares, which achieved around 90% AUC for the overall system and managed to give personalized recommendations to the users.
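As a rough illustration of two of the ranking metrics named in this abstract, the sketch below computes Precision@k and a binary-relevance NDCG@k for a single user. It is not taken from the thesis: the function names, the activity names and the binary-relevance assumption are all hypothetical choices made for the example.

```python
import math

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k

def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(recommended[:k])
              if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical data: a ranked list of gym activities for one user and the
# activities that user actually attended (made up for illustration).
recommended = ["yoga", "spinning", "pilates", "boxing"]
relevant = {"yoga", "boxing"}

p_at_2 = precision_at_k(recommended, relevant, 2)
ndcg_at_4 = ndcg_at_k(recommended, relevant, 4)
```

In an evaluation like the one described, such per-user scores would typically be averaged over all users; that averaging step is omitted here.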
86

Apprentissage basé sur le Qini pour la prédiction de l’effet causal conditionnel

Belbahri, Mouloud-Beallah 08 1900
Uplift models deal with cause-and-effect inference for a specific factor, such as a marketing intervention. In practice, these models are built on individual data from randomized experiments. A treatment group contains individuals who are subject to an action; a control group serves for comparison. Uplift modeling is used to order the individuals with respect to the value of a causal effect, e.g., positive, neutral, or negative. First, we propose a new way to perform model selection in uplift regression models. Our methodology is based on the maximization of the Qini coefficient. Because model selection corresponds to variable selection, the task is daunting and intractable if done in a straightforward manner when the number of variables to consider is large. To realistically search for a good model, we conceived a search method based on an efficient exploration of the regression coefficient space combined with a lasso penalization of the log-likelihood. There is no explicit analytical expression for the Qini surface, so unveiling it is not easy. Our idea is to gradually uncover the Qini surface in a manner inspired by response surface designs. The goal is to find a reasonable local maximum of the Qini by exploring the surface near optimal values of the penalized coefficients. We openly share our code through the R package tools4uplift. Though there are some computational methods available for uplift modeling, most of them exclude statistical regression models. Our package intends to fill this gap. It comprises tools for: i) quantization, ii) visualization, iii) variable selection, iv) parameter estimation and v) model validation. The package allows practitioners to use our methods with ease and to refer to the methodological papers for the details. Uplift is a particular case of causal inference. Causal inference tries to answer questions such as "What would be the result if we gave this patient treatment A instead of treatment B?". The answer to this question is then used as a prediction for a new patient. In the second part of the thesis, we place more emphasis on prediction. Most existing approaches are adaptations of random forests for the uplift case. Several split criteria have been proposed in the literature, all relying on maximizing heterogeneity. However, in practice, these approaches are prone to overfitting. In this work, we bring a new vision to uplift modeling. We propose a new loss function defined by leveraging a connection with the Bayesian interpretation of the relative risk. Our solution is developed for a specific twin neural network architecture that jointly optimizes the marginal probabilities of success for treated and control individuals. We show that this model is a generalization of the uplift logistic interaction model. We modify the stochastic gradient descent algorithm to allow for structured sparse solutions. This helps to a great extent in fitting our uplift models. We openly share our Python code for practitioners wishing to use our algorithms. We had the rare opportunity to collaborate with industry to get access to data from large-scale marketing campaigns favorable to the application of our methods. We show empirically that our methods are competitive with the state of the art on real data and in several simulation scenarios.
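The abstract above does not define the Qini coefficient, so the following is only a minimal sketch under one common convention: individuals are ranked by predicted uplift, the Qini curve tracks treated responders minus control responders rescaled to the treated group size, and the coefficient is the area between that curve and the straight line of a random ranking. The function names and the toy data are hypothetical, and normalization conventions vary between authors; this is not the tools4uplift implementation.

```python
def qini_curve(uplift_scores, treated, outcome):
    """Cumulative incremental responders when targeting by decreasing score."""
    order = sorted(range(len(uplift_scores)), key=lambda i: -uplift_scores[i])
    n_t = n_c = r_t = r_c = 0
    curve = [0.0]
    for i in order:
        if treated[i]:
            n_t += 1
            r_t += outcome[i]
        else:
            n_c += 1
            r_c += outcome[i]
        # Rescale control responders to the number of treated seen so far.
        scaled_control = r_c * n_t / n_c if n_c > 0 else 0.0
        curve.append(r_t - scaled_control)
    return curve

def qini_coefficient(curve):
    """Area between the Qini curve and the random-targeting line."""
    n = len(curve) - 1
    # Trapezoidal area with the x-axis normalized to [0, 1].
    area = sum((curve[k] + curve[k + 1]) / 2.0 for k in range(n)) / n
    random_area = curve[-1] / 2.0  # triangle under the straight line
    return area - random_area

# Hypothetical toy data: four individuals, two treated and two controls.
scores = [0.9, 0.8, 0.3, 0.1]
treated = [1, 0, 1, 0]
outcome = [1, 0, 0, 1]

curve = qini_curve(scores, treated, outcome)
qini = qini_coefficient(curve)
```

A model selection procedure of the kind described would compare candidate models by this kind of score on held-out data; how the thesis does that in detail is not stated in the abstract.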
87

Characterization and Stabilization of Transverse Spatial Modes of Light in Few-Mode Optical Fibers

Pihl, Oscar January 2023
With the growing need for secure and high-capacity communications, innovative solutions are needed to meet the demands of tomorrow. One such innovation is to make use of the still unutilized spatial dimension of light in communications, which has promising applications both in enabling higher data traffic and in the security protocols of future quantum communications. Perhaps the most promising way of realizing this technology is through spatial division multiplexing (SDM) in optical fibers. There are many challenges and open questions in implementing this, such as how perturbations to the signal should be kept under control and which type of optical fiber to use. Consequently, this thesis focuses on the implementation of SDM in few-mode fibers, where the effects of perturbations on the spatial distribution have been investigated. Following this investigation, adaptive spatial mode control has been implemented using a motorized polarization controller. The mode control has been designed to be relevant for quantum technology applications such as quantum key distribution (QKD) and quantum random number generation (QRNG), but also for SDM in general communications. For this reason, two evaluation metrics have been optimized: extinction ratio and equal amplitude. The control algorithm used is an adaptation of the optimization algorithm Stochastic Parallel Gradient Descent (SPGD). Control has been achieved in stabilizing the extinction ratio of LP11a and LP11b over 12 hours with an average extinction ratio of 98 %. Additionally, equal amplitude between LP11a and LP11b has been achieved over 1 hour with an average relative difference of 0.42 % and 0.45 %. Of the perturbation effects investigated, temperature caused large disturbances to the signal, which the implemented algorithm subsequently corrects.
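The abstract says the controller is an adaptation of SPGD but gives no details, so the sketch below shows only the textbook two-sided SPGD update: all control channels are perturbed at once with random signs, the metric is measured at both perturbed settings, and the controls move along the perturbation in proportion to the measured difference. The function names, the gain and perturbation values, and the quadratic `toy_metric` standing in for a measured quantity such as extinction ratio are all assumptions made for illustration.

```python
import random

def spgd_maximize(measure, u, gain=10.0, delta=0.1, steps=300, seed=0):
    """Two-sided stochastic parallel gradient descent (ascent on `measure`)."""
    rng = random.Random(seed)
    u = list(u)
    for _ in range(steps):
        # Perturb every control channel simultaneously with a random sign.
        du = [delta * rng.choice((-1.0, 1.0)) for _ in u]
        j_plus = measure([ui + di for ui, di in zip(u, du)])
        j_minus = measure([ui - di for ui, di in zip(u, du)])
        # Move each channel along its perturbation, scaled by the metric change.
        dj = j_plus - j_minus
        u = [ui + gain * dj * di for ui, di in zip(u, du)]
    return u

# Hypothetical stand-in for a hardware measurement: the metric peaks at 1.0
# when the three control values hit an (unknown to the controller) target.
TARGET = [0.3, -0.7, 0.5]

def toy_metric(u):
    return 1.0 - sum((ui - ti) ** 2 for ui, ti in zip(u, TARGET))

controls = spgd_maximize(toy_metric, [0.0, 0.0, 0.0])
```

The appeal of this scheme for hardware control is that it needs only metric measurements, never an explicit gradient; on a real, drifting system the gain and perturbation size would need tuning, and the loop would run continuously rather than for a fixed number of steps.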
88

Non-convex Bayesian Learning via Stochastic Gradient Markov Chain Monte Carlo

Wei Deng (11804435) 18 December 2021
The rise of artificial intelligence (AI) hinges on the efficient training of modern deep neural networks (DNNs) for non-convex optimization and uncertainty quantification, which boils down to a non-convex Bayesian learning problem. A standard tool for this problem is Langevin Monte Carlo, which approximates the posterior distribution with theoretical guarantees. However, non-convex Bayesian learning in real big data applications can be arbitrarily slow and often fails to capture the uncertainty or informative modes in a limited time. As a result, advanced techniques are still required.
In this thesis, we start with replica exchange Langevin Monte Carlo (also known as parallel tempering), a Markov jump process that proposes appropriate swaps between exploration and exploitation to achieve acceleration. However, the naïve extension of swaps to big data problems leads to a large bias, so bias-corrected swaps are required. Such a mechanism leads to few effective swaps and insignificant acceleration. To alleviate this issue, we first propose a control variates method to reduce the variance of noisy energy estimators and show its potential to accelerate the exponential convergence. We also present population-chain replica exchange and propose a generalized deterministic even-odd scheme to track the non-reversibility and obtain an optimal round trip rate. Further approximations are conducted based on stochastic gradient descent, which makes the method user-friendly for large-scale uncertainty approximation tasks without much tuning cost.
In the second part of the thesis, we study scalable dynamic importance sampling algorithms based on stochastic approximation. Traditional dynamic importance sampling algorithms have been successful in bioinformatics and statistical physics; however, their lack of scalability has greatly limited their extension to big data applications. To handle this scalability issue, we resolve the vanishing gradient problem and propose two dynamic importance sampling algorithms based on stochastic gradient Langevin dynamics. Theoretically, we establish the stability condition for the underlying ordinary differential equation (ODE) system and guarantee the asymptotic convergence of the latent variable to the desired fixed point. Interestingly, this result still holds for non-convex energy landscapes. In addition, we propose a pleasingly parallel version of such algorithms with interacting latent variables. We show that the interacting algorithm can be theoretically more efficient than the single-chain alternative with an equivalent computational budget.
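For readers unfamiliar with stochastic gradient Langevin dynamics, which this abstract builds on, here is a minimal sketch of the basic sampler on a deliberately simple convex problem: the posterior of a Gaussian mean with unit variance and a flat prior. It is not one of the thesis's algorithms (no replica exchange, no importance sampling); the function name, step size, batch size and synthetic data are assumptions chosen for the example.

```python
import math
import random

def sgld_sample(data, step=1e-3, batch_size=20, n_steps=6000,
                burn_in=1000, seed=0):
    """Sample theta from p(theta | data) for x_i ~ N(theta, 1), flat prior."""
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for t in range(n_steps):
        batch = rng.sample(data, batch_size)
        # Unbiased minibatch estimate of grad U(theta), where
        # U(theta) = sum_i (theta - x_i)^2 / 2 is the negative log posterior.
        grad = (n / batch_size) * sum(theta - x for x in batch)
        # Langevin step: gradient descent on U plus injected Gaussian noise.
        noise = rng.gauss(0.0, 1.0)
        theta = theta - step * grad + math.sqrt(2.0 * step) * noise
        if t >= burn_in:
            samples.append(theta)
    return samples

# Hypothetical synthetic data set with true mean 2.0.
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(200)]

samples = sgld_sample(data)
posterior_mean_estimate = sum(samples) / len(samples)
```

With a fixed step size the minibatch noise inflates the sampled variance somewhat, so the spread of `samples` is only an approximation of the true posterior spread; the sample mean, however, should sit close to the data mean.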
89

A deep learning theory for neural networks grounded in physics

Scellier, Benjamin 12 1900
In the last decade, deep learning has become a major component of artificial intelligence, leading to a series of breakthroughs across a wide variety of domains. The workhorse of deep learning is the optimization of loss functions by stochastic gradient descent (SGD). Traditionally in deep learning, neural networks are differentiable mathematical functions, and the loss gradients required for SGD are computed with the backpropagation algorithm. However, the computer architectures on which these neural networks are implemented and trained suffer from speed and energy inefficiency issues, due to the separation of memory and processing in these architectures. To solve these problems, the field of neuromorphic computing aims at implementing neural networks on hardware architectures that merge memory and processing, just like brains do. In this thesis, we argue that building large, fast and efficient neural networks on neuromorphic architectures also requires rethinking the algorithms to implement and train them. We present an alternative mathematical framework, also compatible with SGD, which offers the possibility to design neural networks in substrates that directly exploit the laws of physics. Our framework applies to a very broad class of models, namely those whose state or dynamics are described by variational equations. This includes physical systems whose equilibrium state minimizes an energy function, and physical systems whose trajectory minimizes an action functional (principle of least action). We present a simple procedure to compute the loss gradients in such systems, called equilibrium propagation (EqProp), which requires solely locally available information for each trainable parameter. Since many models in physics and engineering can be described by variational principles, our framework has the potential to be applied to a broad variety of physical systems, whose applications extend to various fields of engineering, beyond neuromorphic computing.
90

Left ventricle functional analysis in 2D+t contrast echocardiography within an atlas-based deformable template model framework

Casero Cañas, Ramón January 2008
This biomedical engineering thesis explores the opportunities and challenges of 2D+t contrast echocardiography for left ventricle functional analysis, both clinically and within a computer vision atlas-based deformable template model framework. A database was created for the experiments in this thesis, with 21 studies of contrast Dobutamine Stress Echo, in all 4 principal planes. The database includes clinical variables, human expert hand-traced myocardial contours and visual scoring. First, the problem is studied from a clinical perspective. Quantification of endocardial global and local function using standard measures shows expected values and agreement with human expert visual scoring, but the results are less reliable for myocardial thickening. Next, the problem of segmenting the endocardium with a computer is posed in a standard landmark and atlas-based deformable template model framework. The underlying assumption is that these models can emulate human experts in terms of integrating previous knowledge about the anatomy and physiology with three sources of information from the image: texture, geometry and kinetics. Probabilistic atlases of contrast echocardiography are computed, while noting from histograms at selected anatomical locations that modelling texture with just mean intensity values may be too naive. Intensity analysis together with the clinical results above suggests that the lack of external boundary definition may make this imaging technique unsuitable for measuring myocardial thickening, while endocardial boundary definition is adequate for evaluating wall motion. Geometry is presented in a Principal Component Analysis (PCA) context, highlighting issues about Gaussianity, the correlation and covariance matrices with respect to physiology, and analysing different measures of dimensionality. A popular extension of deformable models, Active Appearance Models (AAMs), is then studied in depth. Contrary to common wisdom, it is contended that using a PCA texture space instead of a fixed atlas is detrimental to segmentation, and that PCA models are not convenient for texture modelling. To integrate kinetics, a novel spatio-temporal model of cardiac contours is proposed. The new explicit model does not require frame interpolation, and it is compared to previous implicit models in terms of approximation error when the shape vector changes from frame to frame or remains constant throughout the cardiac cycle. Finally, the 2D+t atlas-based deformable model segmentation problem is formulated and solved with a gradient descent approach. Experiments using the similarity transformation suggest that segmentation of the whole cardiac volume outperforms segmentation of individual frames. A relatively new approach, the inverse compositional algorithm, is shown to decrease running times of the classic Lucas-Kanade algorithm by a factor of 20 to 25, to values that are within reach of real-time processing.
