Global ETD Search

481	Towards Privacy and Communication Efficiency in Distributed Representation Learning Sheikh S Azam (12836108) 10 June 2022 <p>Over the past decade, distributed representation learning has emerged as a popular alternative to conventional centralized machine learning training. The increasing interest in distributed representation learning, specifically federated learning, can be attributed to its fundamental property of promoting data privacy and communication savings. While conventional ML encourages aggregating data at a central location (e.g., data centers), distributed representation learning advocates keeping data at the source and instead transmitting model parameters across the network. However, since the advent of deep learning, model sizes have grown increasingly large, often comprising millions to billions of parameters, which leads to the problem of communication latency in the learning process. In this thesis, we propose to tackle the problem of communication latency in two different ways: (i) learning private representations of data to enable its sharing, and (ii) reducing the communication latency by minimizing the corresponding long-range communication requirements.</p> <p><br></p> <p>To tackle the former goal, we start by studying the problem of learning representations that are private yet informative, i.e., providing information about intended "ally" targets while hiding sensitive "adversary" attributes. We propose Exclusion-Inclusion Generative Adversarial Network (EIGAN), a generalized private representation learning (PRL) architecture that accounts for multiple ally and adversary attributes, unlike existing PRL solutions. We then address the practical constraints of distributed datasets by developing Distributed EIGAN (D-EIGAN), the first distributed PRL method that learns a private representation at each node without transmitting the source data. 
We theoretically analyze the behavior of adversaries under the optimal EIGAN and D-EIGAN encoders and the impact of dependencies among ally and adversary tasks on the optimization objective. Our experiments on various datasets demonstrate the advantages of EIGAN in terms of performance, robustness, and scalability. In particular, EIGAN outperforms the previous state-of-the-art by a significant accuracy margin (47% improvement), and D-EIGAN's performance is consistently on par with EIGAN under different network settings.</p> <p><br></p> <p>We next tackle the latter objective - reducing the communication latency - and propose two-timescale hybrid federated learning (TT-HF), a semi-decentralized learning architecture that combines the conventional device-to-server communication paradigm for federated learning with device-to-device (D2D) communications for model training. In TT-HF, during each global aggregation interval, devices (i) perform multiple stochastic gradient descent iterations on their individual datasets, and (ii) aperiodically engage in a consensus procedure on their model parameters through cooperative, distributed D2D communications within local clusters. With a new general definition of gradient diversity, we formally study the convergence behavior of TT-HF, resulting in new convergence bounds for distributed ML. We leverage our convergence bounds to develop an adaptive control algorithm that tunes the step size, D2D communication rounds, and global aggregation period of TT-HF over time to target a sublinear convergence rate of O(1/t) while minimizing network resource utilization. Our subsequent experiments demonstrate that TT-HF significantly outperforms the current state of the art in federated learning in terms of model accuracy and/or network energy consumption in different scenarios where local device datasets exhibit statistical heterogeneity. 
Finally, our numerical evaluations demonstrate robustness against outages caused by fading channels, as well as favorable performance with non-convex loss functions.</p> Knowledge representation and reasoning Pattern recognition Data and information privacy Data engineering and data science Cloud computing Adversarial machine learning Deep learning Neural networks representation learning Federated learning Adversarial learning Deep Learning Framework
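The TT-HF interval described above (local gradient steps, D2D consensus within clusters, then a global aggregation) can be sketched as follows. This is only an illustrative toy under stated assumptions, not the thesis's algorithm: each device holds a scalar model and a quadratic loss, gradients are exact rather than stochastic, consensus runs on a fixed period rather than aperiodically, and the server averages every device. All function and parameter names are hypothetical.

```python
from statistics import fmean


def local_step(w, target, lr):
    """One gradient step on the toy loss 0.5 * (w - target) ** 2."""
    return w - lr * (w - target)


def consensus_round(models, mix=0.5):
    """One D2D round: every device moves a fraction `mix` toward the cluster mean.

    The implied mixing matrix is doubly stochastic, so the cluster mean is preserved.
    """
    avg = fmean(models)
    return [(1 - mix) * w + mix * avg for w in models]


def run_interval(clusters, targets, lr, local_steps, consensus_every, d2d_rounds):
    """Run one global aggregation interval (assumed, simplified schedule).

    clusters: list of clusters, each a list of scalar device models.
    targets:  same shape as clusters; minimiser of each device's toy loss.
    Returns the re-synchronised clusters and the aggregated global model.
    """
    clusters = [list(models) for models in clusters]
    for step in range(1, local_steps + 1):
        for c, models in enumerate(clusters):
            # (i) local gradient step on each device's own data
            models = [local_step(w, t, lr) for w, t in zip(models, targets[c])]
            # (ii) periodic D2D consensus inside the cluster
            if step % consensus_every == 0:
                for _ in range(d2d_rounds):
                    models = consensus_round(models)
            clusters[c] = models
    # Global aggregation: plain average over all devices, then broadcast.
    global_model = fmean([w for models in clusters for w in models])
    return [[global_model] * len(models) for models in clusters], global_model


if __name__ == "__main__":
    clusters = [[0.0, 0.0], [0.0, 0.0]]
    targets = [[0.0, 2.0], [4.0, 6.0]]
    for _ in range(50):
        clusters, g = run_interval(clusters, targets, 0.1, 4, 2, 1)
    print(g)  # approaches the mean of the targets
```

In this toy setting the global model approaches the average of the device targets; the adaptive tuning of step size, D2D rounds, and aggregation period mentioned in the abstract is not modelled here.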
482	DEVELOPMENT AND EVALUATION OF A DIGITAL SYSTEM FOR ASSEMBLY BOLT PATTERN TRACEABILITY AND POKA-YOKE Eric J Kozikowski (10716654) 28 April 2021 <div>The manufacturing industry has begun its transition into a digital age, where data-driven decisions aim to improve product quality, output, and efficiency. Decisions made based on manufacturing data can help identify key problem areas in an assembly line and keep defects from progressing to the next step in the assembly process. But what if a product's as-manufactured data was inaccurate or didn’t exist at all? Decisions based on incorrect data can lead to defective parts being passed as good parts, costing manufacturers millions of dollars in rework or recalls. In mechanically fastened assemblies specifically, products that experience rotation, like an aircraft propeller, or compress to create a seal, like an oil pipe flange, require specific torque pattern sequences to be followed during assembly. When incorrectly torqued, the parts can fail catastrophically, resulting in consumer injury or ecological contamination. This paper outlines the development and feasibility of a system and its components for tracking and error-proofing the assembly of bolted joints in an industrial environment.</div><div>Using machine vision, the system traces the tool location relative to the mechanical fastener and records the order in which the fasteners were torqued. If an error is detected, the system notifies the user and does not allow the assembly process to continue. The system leverages open source machine learning algorithms from TensorFlow 2 and OpenCV that allow efficient object detection model training. The proposed system was put through a series of tests and evaluated using the STEP method. The data collected aims to characterize the system's feasibility and effectiveness in an industrial setting. 
</div><div>The tests aim to understand the effectiveness of the system under standard and variable industrial work conditions. Using the STEP method and other statistical analysis, an evaluation matrix was completed, ranking the system's ability to successfully meet all predetermined benchmarks and successfully record the torque pattern used to assemble a part.</div> Flexible Manufacturing Systems Assembly Traceability Bolt Patterns Fasteners Digital Thread Machine Vision Manufacturing Assembly Guidance Workforce Development Poka-Yoke Digital Enterprise Internet of things (IoT) Connected Tools
483	Community Detection of Anomaly in Large-Scale Network Adefolarin Alaba Bolaji (10723926) 29 April 2021 <p>The detection of anomalies in real-world networks is applicable in different domains; applications include, but are not limited to, credit card fraud detection, malware identification and classification, cancer detection from diagnostic reports, abnormal traffic detection, identification of fake media posts, and the like. Much ongoing research provides tools for analyzing labeled and unlabeled data; however, the challenges of finding anomalies and patterns in large-scale datasets still exist because of rapid changes in the threat landscape. </p><p>In this study, I implemented a novel and robust solution that combines data science and cybersecurity to solve complex network security problems. I used a Long Short-Term Memory (LSTM) model, the Louvain algorithm, and the PageRank algorithm to identify and group anomalies in large-scale real-world networks. The network has billions of packets. The developed model used different visualization techniques to provide further insight into how the anomalies in the network are related. </p><p>Mean absolute error (MAE) and root mean square error (RMSE) were used to validate the anomaly detection models; the results obtained were 5.1813e-04 and 1e-03, respectively. The low loss from the training phase confirmed the low RMSE (loss: 5.1812e-04, mean absolute error: 5.1813e-04, validation loss: 3.9858e-04, validation mean absolute error: 3.9858e-04). The result from the community detection shows an overall modularity value of 0.914, which indicates the existence of very strong communities among the anomalies. The largest sub-community of the anomalies connects 10.42% of the total nodes of the anomalies. </p><p>The broader aim and impact of this study was to provide sophisticated, AI-assisted countermeasures to cyber-threats in large-scale networks. 
To close the existing gaps created by the shortage of skilled and experienced cybersecurity specialists and analysts in the cybersecurity field, solutions based on out-of-the-box thinking are needed; this research was aimed at yielding one such solution. It was built to detect specific and collaborating threat actors in large networks and to help speed up how the activities of anomalies in any given large-scale network can be curtailed in time.</p> Applied Computer Science Computer System Security Computer Communications Networks Anomaly Detection Community Detection Artificial Intelligence Deep Learning Network Traffic Large Scale Networks Big Data Analytics Network Graph Modularity Data Visualization
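The two validation metrics named in the abstract above, MAE and RMSE, follow standard definitions. The sketch below is a minimal illustration of those definitions only; the function names and the toy values are hypothetical and are not taken from the dissertation.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


if __name__ == "__main__":
    y_true = [1.0, 2.0, 3.0]   # toy targets, for illustration only
    y_pred = [1.0, 2.0, 6.0]   # toy predictions, for illustration only
    print(mean_absolute_error(y_true, y_pred))
    print(root_mean_squared_error(y_true, y_pred))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data and reacts more strongly to a few large errors.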
484	Parallel and Decentralized Algorithms for Big-data Optimization over Networks Amir Daneshmand (11153640) 22 July 2021 <p>Recent decades have witnessed a deluge of data generated by heterogeneous sources (e.g., social networks, streaming, and marketing services), which has naturally created a surge of interest in the theory and applications of large-scale convex and non-convex optimization. For example, real-world instances of statistical learning problems such as deep learning and recommendation systems can generate sheer volumes of spatially/temporally diverse data (up to petabytes of data in commercial applications) with millions of decision variables to be optimized. Such problems are often referred to as Big-data problems. Solving these problems with standard optimization methods demands an infeasible amount of centralized storage and computational resources; overcoming this limitation is the foremost purpose of the parallel and decentralized algorithms developed in this thesis.</p><p><br></p><p>This thesis consists of two parts: (I) Distributed Nonconvex Optimization and (II) Distributed Convex Optimization.</p><p><br></p><p>In Part (I), we start by studying a winning paradigm in big-data optimization, the Block Coordinate Descent (BCD) algorithm, which ceases to be effective when problem dimensions grow overwhelmingly large. In particular, we consider a general family of constrained non-convex composite large-scale problems defined on multicore computing machines equipped with shared memory. We design a hybrid deterministic/random parallel algorithm that solves such problems efficiently by synergistically combining Successive Convex Approximation (SCA) with greedy/random dimensionality reduction techniques. We provide theoretical and empirical results showing the efficacy of the proposed scheme in the face of huge-scale problems. 
The next step is to broaden the network setting to general mesh networks modeled as directed graphs, and propose a class of gradient-tracking based algorithms with global convergence guarantees to critical points of the problem. We further explore the geometry of the landscape of the non-convex problems to establish second-order guarantees and strengthen our results from convergence to locally optimal solutions to convergence to globally optimal solutions for a wide range of Machine Learning problems.</p><p><br></p><p>In Part (II), we focus on a family of distributed convex optimization problems defined over meshed networks. Relevant state-of-the-art algorithms often consider limited problem settings with pessimistic communication complexities with respect to the complexity of their centralized variants, which raises an important question: can one achieve the rate of centralized first-order methods over networks, and moreover, can one improve upon their communication costs by using higher-order local solvers? To answer these questions, we propose an algorithm that utilizes surrogate objective functions in local solvers (hence going beyond first-order realms, such as proximal-gradient) coupled with a perturbed (push-sum) consensus mechanism that aims to track locally the gradient of the central objective function. The algorithm is proved to match the convergence rate of its centralized counterparts, up to multiplicative network-dependent factors. In particular, for Empirical Risk Minimization (ERM) problems with statistically homogeneous data across the agents, our algorithm employing high-order surrogates provably achieves faster rates than what is achievable by first-order methods. Such improvements are made without exchanging any Hessian matrices over the network. 
</p><p><br></p><p>Finally, we focus on the ill-conditioning issue that impairs the efficiency of decentralized first-order methods over networks and renders them impractical in terms of both computation and communication cost. A natural solution is to develop distributed second-order methods, but their need for Hessian information incurs substantial communication overheads on the network. To work around such exorbitant communication costs, we propose a “statistically informed” preconditioned cubic regularized Newton method which provably improves upon the rates of first-order methods. The proposed scheme does not require communication of Hessian information in the network, and yet achieves the iteration complexity of centralized second-order methods up to the statistical precision. In addition, the (second-order) approximate nature of the surrogate functions used improves upon the per-iteration computational cost of our earlier proposed scheme in this setting.</p> Distributed Computing Operations Research Optimisation distributed optimization Large-Scale Optimization Distributed Machine Learning decentralized algorithms Parallel Computing convex optimization Nonconvex optimization Parallel algorithms
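The gradient-tracking idea mentioned in the abstract above can be illustrated with a textbook variant. The sketch below is not the thesis's algorithm: it assumes an undirected network with a doubly stochastic mixing matrix `W` (the thesis treats directed graphs with push-sum), scalar variables, and simple quadratic local losses. Every name in it is hypothetical.

```python
def gradient_tracking(grads, W, x0, step, iters):
    """Textbook gradient tracking with a doubly stochastic mixing matrix W.

    grads: list of per-agent gradient functions of a scalar variable.
    Each agent i keeps an iterate x[i] and a tracker y[i] that estimates
    the network-average gradient.
    """
    n = len(grads)
    x = list(x0)
    g = [grads[i](x[i]) for i in range(n)]
    y = list(g)  # trackers start at the local gradients
    for _ in range(iters):
        # Mix iterates with neighbours, then move along the tracked direction.
        x_new = [sum(W[i][j] * x[j] for j in range(n)) - step * y[i]
                 for i in range(n)]
        g_new = [grads[i](x_new[i]) for i in range(n)]
        # Mix trackers and add the change in the local gradient.
        y = [sum(W[i][j] * y[j] for j in range(n)) + g_new[i] - g[i]
             for i in range(n)]
        x, g = x_new, g_new
    return x


if __name__ == "__main__":
    centers = [1.0, 2.0, 6.0]  # toy data: agent i minimises 0.5 * (x - c_i) ** 2
    grads = [lambda x, c=c: x - c for c in centers]
    W = [[0.50, 0.25, 0.25],
         [0.25, 0.50, 0.25],
         [0.25, 0.25, 0.50]]
    print(gradient_tracking(grads, W, [0.0, 0.0, 0.0], 0.1, 300))
```

In this toy problem every agent's iterate approaches the minimiser of the summed loss, which is the mean of the centers, even though each agent only evaluates its own gradient and exchanges scalars with its neighbours.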
485	Efficient and Scalable Subgraph Statistics using Regenerative Markov Chain Monte Carlo Mayank Kakodkar (12463929) 26 April 2022 <p>In recent years there has been a growing interest in data mining and graph machine learning for techniques that can obtain frequencies of <em>k</em>-node Connected Induced Subgraphs (<em>k</em>-CIS) contained in large real-world graphs. While recent work has shown that 5-CISs can be counted exactly, no exact polynomial-time algorithms are known that solve this task for <em>k </em>> 5. In the past, sampling-based algorithms that work well in moderately-sized graphs for <em>k</em> ≤ 8 have been proposed. In this thesis I push this boundary up to <em>k</em> ≤ 16 for graphs containing up to 120M edges, and to <em>k</em> ≤ 25 for smaller graphs containing between one million and 20M edges. I do so by re-imagining two older, but elegant and memory-efficient algorithms -- FANMOD and PSRW -- which have large estimation errors by modern standards. This is because FANMOD produces highly correlated <em>k</em>-CIS samples and sampling the PSRW Markov chain becomes prohibitively expensive for <em>k </em>> 8.</p> <p>In this thesis, I introduce:</p> <p>(a) <strong>RTS:</strong> a novel regenerative Markov chain Monte Carlo (MCMC) sampling procedure on the tree generated on the fly by the FANMOD algorithm. RTS is able to run on multiple cores and multiple machines (embarrassingly parallel) and compute confidence intervals of estimates, all while preserving the memory-efficient nature of FANMOD. RTS is thus able to estimate subgraph statistics for <em>k</em> ≤ 16 for larger graphs containing up to 120M edges, and for <em>k</em> ≤ 25 for smaller graphs containing between one million and 20M edges.</p> <p>(b) <strong>R-PSRW:</strong> which scales the PSRW algorithm to larger CIS sizes using a rejection sampling procedure to efficiently sample transitions from the PSRW Markov chain. 
R-PSRW matches RTS in terms of scaling to larger CIS sizes.</p> <p>(c) <strong>Ripple:</strong> which achieves unprecedented scalability by stratifying the R-PSRW Markov chain state-space into ordered strata via a new technique that I call <em>sequential stratified regeneration</em>. I show that the Ripple estimator is consistent, highly parallelizable, and scales well. Ripple is able to <em>count</em> CISs of size <em>k </em>≤ 12 in real-world graphs containing up to 120M edges.</p> <p>My empirical results show that the proposed methods offer a considerable improvement over the state-of-the-art. Moreover, my methods are able to run at a scale that has been considered unreachable until now, not only by prior MCMC-based methods but also by other sampling approaches. </p> <p><strong>Optimization of Restricted Boltzmann Machines. </strong>In addition, I propose a regenerative transformation of MCMC samplers of Restricted Boltzmann Machines (RBMs). My approach, Markov Chain Las Vegas (MCLV), gives statistical guarantees in exchange for random running times. MCLV uses a stopping set built from the training data and has a maximum Markov chain step count <em>K</em> (referred to as MCLV-<em>K</em>). I present an MCLV-<em>K</em> gradient estimator (LVS-<em>K</em>) for RBMs and explore the correspondence and differences between LVS-<em>K</em> and Contrastive Divergence (CD-<em>K</em>). LVS-<em>K</em> significantly outperforms CD-<em>K</em> in the task of training RBMs over the MNIST dataset, indicating MCLV to be a promising direction in learning generative models.</p> Pattern Recognition and Data Mining Markov Chain Monte Carlo Random Walk Regenerative Sampling Motif Analysis Subgraph Counting Graph Mining Energy Based Models Generative Models Markov Random Fields Restricted Boltzmann Machine Random Walk Tours
486	Automatic taxonomy evaluation Gao, Tianjian 12 1900 This thesis would not have been possible without the generous support of IATA. / Taxonomies are an essential representation of knowledge and play a central role in many knowledge-rich applications. Despite this, constructing them is laborious, whether done manually or automatically, and the quantitative evaluation of taxonomies is a neglected topic. When researchers focus on building a taxonomy from large unstructured corpora, evaluation is often done manually, which introduces bias and often results in limited reproducibility. Companies wishing to improve their taxonomy often lack a benchmark, that is, a well-optimized taxonomy that can serve as a reference. Consequently, specialized knowledge and effort are needed to evaluate a taxonomy. In this work, we argue that evaluating a taxonomy automatically and reproducibly is as important as generating such taxonomies automatically. We propose two new evaluation methods that produce less biased scores: a classification model of the taxonomy extracted from a labelled corpus, and an unsupervised language model that serves as a knowledge source for evaluating hypernymic relations. We find that our evaluation proxies correlate with human judgments and that language models could imitate human experts in knowledge-rich tasks. / Taxonomies are an essential knowledge representation and play an important role in classification and numerous knowledge-rich applications, yet quantitative taxonomy evaluation remains overlooked and leaves much to be desired. 
While studies focus on automatic taxonomy construction (ATC) for extracting meaningful structures and semantics from large corpora, their evaluation is usually manual and subject to bias and low reproducibility. Companies wishing to improve their domain-focused taxonomies also suffer from a lack of ground truths. In fact, manual taxonomy evaluation requires substantial labour and expert knowledge. As a result, we argue in this thesis that automatic taxonomy evaluation (ATE) is just as important as taxonomy construction. We propose two novel taxonomy evaluation methods for automatic taxonomy scoring, leveraging supervised classification for labelled corpora and unsupervised language modelling as a knowledge source for unlabelled data. We show that our evaluation proxies can exert similar effects and correlate well with human judgments and that language models can imitate human experts on knowledge-rich tasks. Taxonomy Ontology Taxonomy learning Ontology evaluation Knowledge representation Knowledge extraction Information retrieval Information extraction Hypernym discovery Language modelling
487	Eine funktionale Methode der Wissensrepräsentation [A Functional Method of Knowledge Representation] Oertel, Wolfgang 01 March 2024 The aim of this work is to develop a knowledge representation model that is particularly suited to describing objects with complex structure. The starting point is a characterization of the problem of knowledge representation. From the presentation of gear design, a domain of discourse typical of computer-aided design, requirements are derived for models that describe complexly structured objects in knowledge bases. The main part of the work is the development of a functional knowledge representation model that meets these requirements. The model enables both an efficient implementation of knowledge-based systems on the basis of the programming language LISP and the establishment of relationships to data models on the one hand and to knowledge representation models, in particular first-order predicate logic, on the other. The structure of knowledge base systems is described with reference to database technology. An essential aspect of the work is to show that, and how, a design engineer's knowledge can be formalized and mapped into a knowledge base.:1. Introduction 2. Knowledge representation in technical systems 3. Example domains of discourse 4. Functional knowledge representation model 5. Relationships between first-order predicate logic and the functional knowledge representation model 6. Structure of knowledge base systems 7. Application of the functional knowledge representation model to the implementation of knowledge-based systems 8. Concluding remarks info:eu-repo/classification/ddc/006.3 ddc:006.3 Artificial intelligence Knowledge processing Knowledge representation Functional programming Knowledge base system
488	Representing and Reasoning on Conceptual Queries Over Image Databases Rigotti, Christophe, Hacid, Mohand-Saïd 20 May 2022 The problem of content management of multimedia data types (e.g., image, video, graphics) is becoming increasingly important with the development of advanced multimedia applications. Traditional database management systems are inadequate for the handling of such data types. They require new techniques for query formulation, retrieval, evaluation, and navigation. In this paper we develop a knowledge-based framework for modeling and retrieving image data by content. To represent the various aspects of an image object's characteristics, we propose a model which consists of three layers: (1) Feature and Content Layer, intended to contain image visual features such as contours, shapes, etc.; (2) Object Layer, which provides the (conceptual) content dimension of images; and (3) Schema Layer, which contains the structured abstractions of images, i.e., a general schema about the classes of objects represented in the object layer. We propose two abstract languages on the basis of description logics: one for describing knowledge of the object and schema layers, and the other, more expressive, for making queries. Queries can refer to the form dimension (i.e., information of the Feature and Content Layer) or to the content dimension (i.e., information of the Object Layer). These languages employ a variable-free notation, and they are well suited for the design, verification and complexity analysis of algorithms. As the amount of information contained in the previous layers may be huge and operations performed at the Feature and Content Layer are time-consuming, using materialized views to process and optimize queries may be extremely useful. For that, we propose a formal framework for testing containment of a query in a view expressed in our query language. 
The algorithm we propose is sound, complete, and relatively efficient. / This is an extended version of the article in: Eleventh International Symposium on Methodologies for Intelligent Systems, Warsaw, Poland, 1999. info:eu-repo/classification/ddc/004 ddc:004
489	Learning From Data Across Domains: Enhancing Human and Machine Understanding of Data From the Wild Sean Michael Kulinski (17593182) 13 December 2023 <p dir="ltr">Data is collected everywhere in our world; however, it is often noisy and incomplete. Different sources of data may have different characteristics, quality levels, or come from dynamic and diverse environments. This poses challenges for both humans who want to gain insights from data and machines which are learning patterns from data. How can we leverage the diversity of data across domains to enhance our understanding and decision-making? In this thesis, we address this question by proposing novel methods and applications that use multiple domains as more holistic sources of information for both human and machine learning tasks. For example, to help human operators understand environmental dynamics, we show the detection and localization of distribution shifts to problematic features, as well as how interpretable distributional mappings can be used to explain the differences between shifted distributions. For robustifying machine learning, we propose a causal-inspired method to find latent factors that are robust to environmental changes and can be used for counterfactual generation or domain-independent training; we propose a domain generalization framework that allows for fast and scalable models that are robust to distribution shift; and we introduce a new dataset based on human matches in StarCraft II that exhibits complex and shifting multi-agent behaviors. We showcase our methods across various domains such as healthcare, natural language processing (NLP), computer vision (CV), etc. 
to demonstrate that learning from data across domains can lead to more faithful representations of data and its generating environments for both humans and machines.</p> Knowledge representation and reasoning Natural language processing Planning and decision making Data engineering and data science Data mining and knowledge discovery Stream and sensor data Human-computer interaction Mixed initiative and human-in-the-loop Machine Learning Distribution Shifts Domain Generalization Artificial Intelligence
490	ONTOLOGY-DRIVEN SEMI-SUPERVISED MODEL FOR CONCEPTUAL ANALYSIS OF DESIGN SPECIFICATIONS Shankar, Arunprasath 29 August 2014 No description available. Computer Engineering Information Systems Computer Science Systems Design
