11

GRAPE : parallel graph query engine

Xu, Jingbo January 2017 (has links)
The need for graph computations is evident in a multitude of use cases. To support computations on large-scale graphs, several parallel systems have been developed. However, existing graph systems require users to recast algorithms into new models, which makes parallel graph computation a privilege of experienced users only. Moreover, real-world applications often require much more complex graph processing workflows than previously evaluated. In response to these challenges, the thesis presents GRAPE, a distributed graph computation system, shipped with various applications for social network analysis, social media marketing and functional dependencies on graphs. Firstly, the thesis presents the foundation of GRAPE. The principled approach of GRAPE is based on partial evaluation and incremental computation. Sequential graph algorithms can be plugged into GRAPE with minor changes, and get parallelized as a whole. Termination and correctness are guaranteed under a monotonic condition. Secondly, as an application on GRAPE, the thesis proposes graph-pattern association rules (GPARs) for social media marketing. GPARs help users discover regularities between entities in social graphs and identify potential customers by exploiting social influence. The thesis studies the problem of discovering top-k diversified GPARs and the problem of identifying potential customers with GPARs. Although both are NP-hard, parallel scalable algorithms on GRAPE are developed, which guarantee a polynomial speedup over sequential algorithms as the number of processors increases. Thirdly, the thesis proposes quantified graph patterns (QGPs), an extension of graph patterns that supports simple counting quantifiers on edges. QGPs naturally express universal and existential quantification, numeric and ratio aggregates, as well as negation. The thesis proves that the matching problem for QGPs remains NP-complete in the absence of negation, and is DP-complete for general QGPs.
In addition, the thesis introduces quantified graph association rules, defined with QGPs, to identify potential customers in social media marketing. Finally, to address the issue of data consistency, the thesis proposes a class of functional dependencies for graphs, referred to as GFDs. GFDs capture both attribute-value dependencies and the topological structures of entities. The satisfiability and implication problems for GFDs are studied and proved to be coNP-complete and NP-complete, respectively. The thesis also proves that the validation problem for GFDs is coNP-complete. The parallel algorithms developed on GRAPE verify that GFDs provide an effective approach to detecting inconsistencies in knowledge and social graphs.
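The partial-evaluation-plus-incremental-computation model described above can be sketched in miniature with single-source shortest paths over two graph fragments. This is an illustrative assumption of how the pieces fit together, not GRAPE's actual API: each worker runs the unchanged sequential algorithm to a local fixpoint, workers exchange values at border nodes, and the monotonic decrease of distances guarantees termination.

```python
# Sketch: partial evaluation (local Bellman-Ford) plus incremental
# recomputation across fragment borders. Fragment layout and the message
# loop are illustrative assumptions.
import math

def partial_eval(fragment, dist):
    """Run the sequential algorithm (edge relaxation) to a local fixpoint."""
    changed = True
    while changed:
        changed = False
        for u, v, w in fragment:
            if dist.get(u, math.inf) + w < dist.get(v, math.inf):
                dist[v] = dist[u] + w
                changed = True
    return dist

def grape_sssp(fragments, border_nodes, source):
    # One distance map per fragment; partial evaluation runs locally.
    dists = [{source: 0.0} for _ in fragments]
    while True:
        for frag, d in zip(fragments, dists):
            partial_eval(frag, d)
        # Exchange border values; workers only re-run if a value shrank.
        updated = False
        for b in border_nodes:
            best = min(d.get(b, math.inf) for d in dists)
            for d in dists:
                if d.get(b, math.inf) > best:
                    d[b] = best
                    updated = True
        if not updated:  # monotonic decrease ensures this terminates
            break
    # Assemble the global result from the per-fragment maps.
    out = {}
    for d in dists:
        for k, v in d.items():
            out[k] = min(out.get(k, math.inf), v)
    return out

f1 = [("s", "a", 1.0), ("a", "b", 2.0)]   # fragment 1 edges
f2 = [("b", "c", 1.0), ("s", "c", 10.0)]  # fragment 2 edges
print(grape_sssp([f1, f2], border_nodes={"b"}, source="s"))
# → {'s': 0.0, 'a': 1.0, 'b': 3.0, 'c': 4.0}
```

Note that `partial_eval` is an ordinary sequential routine; the parallelization lives entirely in the border-exchange loop, which mirrors the "plugged in with minor changes" claim.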
12

A Database Supported Modeling Environment for Pandemic Planning and Course of Action Analysis

Ma, Yifei 24 June 2013 (has links)
Pandemics can significantly impact public health and society; consider, for instance, the 2009 H1N1 and 2003 SARS outbreaks. In addition to analyzing historic epidemic data, computational simulation of epidemic propagation processes and disease control strategies can help us understand the spatio-temporal dynamics of epidemics in the laboratory. Consequently, the public can be better prepared and the government can control future epidemic outbreaks more effectively. Recently, epidemic propagation simulation systems, which use high performance computing technology, have been proposed and developed to understand disease propagation processes. However, run-time infection situation assessment and intervention adjustment, two important steps in modeling disease propagation, are not well supported in these simulation systems. In addition, these simulation systems are computationally efficient in their simulations, but most of them have limited capabilities in terms of modeling interventions in realistic scenarios. In this dissertation, we focus on building a modeling and simulation environment for epidemic propagation and propagation control strategies. The objective of this work is to design a modeling environment that supports the previously missing functions while performing well in terms of expected features such as modeling fidelity, computational efficiency, and modeling capability. Our proposed methodologies for building such a modeling environment are: 1) decoupled and co-evolving models for disease propagation, situation assessment, and propagation control strategy, and 2) assessing situations and simulating control strategies using relational databases.
Our motivation for exploring these methodologies is as follows: 1) a decoupled and co-evolving model allows us to design modules for each function separately and makes the design of this complex modeling system simpler, and 2) simulating propagation control strategies using relational databases improves the modeling capability and the human productivity of using this modeling environment. To evaluate our proposed methodologies, we have designed and built a loosely coupled, database-supported epidemic modeling and simulation environment. With detailed experimental results and realistic case studies, we demonstrate that our modeling environment provides the missing functions and greatly enhances many expected features, such as modeling capability, without significantly sacrificing computational efficiency and scalability. / Ph. D.
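The second methodology, assessing situations via a relational database, can be sketched as a plain SQL query over an infection table: the run-time assessment step becomes a declarative query rather than custom simulation code. The table and column names here are illustrative assumptions, not the dissertation's actual schema.

```python
# Sketch: database-supported situation assessment. Infection events land in
# a relational table; an intervention trigger is an ordinary SQL query.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE infections (person_id INTEGER, region TEXT, day INTEGER)")
conn.executemany(
    "INSERT INTO infections VALUES (?, ?, ?)",
    [(1, "A", 3), (2, "A", 4), (3, "A", 4), (4, "B", 4)],
)

# Assess the situation at day 4: which regions exceed an outbreak threshold?
threshold = 2
rows = conn.execute(
    """SELECT region, COUNT(*) AS cases
       FROM infections WHERE day <= 4
       GROUP BY region HAVING cases >= ?""",
    (threshold,),
).fetchall()
print(rows)  # → [('A', 3)]  -- regions selected for intervention
```

Swapping the query, rather than recompiling the simulator, is what the abstract means by improved modeling capability and human productivity.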
13

Modeling and Computation of Complex Interventions in Large-scale Epidemiological Simulations using SQL and Distributed Database

Kaw, Rushi 30 August 2014 (has links)
Scalability is an important problem in epidemiological applications that simulate complex intervention scenarios over large datasets. Indemics is one such interactive, data-intensive framework for high-performance computing (HPC) based large-scale epidemic simulations. In the Indemics framework, interventions are supplied from an external, standalone database, which proved to be an effective way of implementing interventions. Although this setup performs well for simple interventions and small datasets, the performance and scalability of complex interventions and large datasets remain an issue. In this thesis, we present IndemicsXC, a scalable and massively parallel high-performance data engine for Indemics in a supercomputing environment. IndemicsXC can implement complex interventions over large datasets. Our distributed database solution retains the simplicity of Indemics by using the same SQL query interface for expressing interventions. We show that our solution implements the most complex interventions by intelligently offloading them to the supercomputer nodes and processing them in parallel. We present an extensive performance evaluation of our database engine with the help of various intervention case studies over synthetic population datasets. The evaluation of our parallel and distributed database framework illustrates its scalability over a standalone database. Our results show that the distributed data engine is efficient, as it is a parallel, scalable and cost-efficient means of implementing interventions. The cost model proposed in this thesis can be used to approximate intervention query execution time with decent accuracy. Our distributed database framework could be leveraged by public health officials to make fast, accurate and sensible decisions during an outbreak. Finally, we discuss the considerations for using distributed databases to drive large-scale simulations. / Master of Science
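The offload-and-parallelize idea can be sketched as a partition-and-merge evaluation: the same intervention predicate is pushed down to per-node shards of the population, evaluated concurrently, and the partial results are merged. The shard layout and the sample predicate are illustrative assumptions, not IndemicsXC's actual interface.

```python
# Sketch: an intervention query evaluated in parallel over partitioned
# population data, then merged. Shards and predicate are illustrative.
from concurrent.futures import ThreadPoolExecutor

shards = [  # (person_id, age) rows, partitioned across compute nodes
    [(1, 70), (2, 34)],
    [(3, 68), (4, 12)],
]

def local_query(rows):
    """Each node evaluates the same predicate on its own shard."""
    return [pid for pid, age in rows if age >= 65]  # e.g. vaccinate seniors

with ThreadPoolExecutor(max_workers=len(shards)) as pool:
    partials = list(pool.map(local_query, shards))

targets = sorted(pid for part in partials for pid in part)
print(targets)  # → [1, 3]
```

Because every shard runs the identical query, the SQL interface of the original standalone setup is preserved while the work scales with the number of nodes.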
14

A Distributed System Interface for a Flight Simulator

Zeitoun, Omar 11 1900 (has links)
The importance of flight training has been recognized since the inception of manned flight. This thesis describes the interfacing of hardware cockpit instruments with flight simulation software over a distributed system. A TRC472 Flight Cockpit was linked with Presagis FlightSIM to fully simulate a Cessna 172 Skyhawk aircraft. The TRC472 contains flight input gauges (airspeed indicator, RPM indicator, etc.), pilot control devices (rudder, yoke, etc.) and navigation systems (VOR, ADF, etc.), all connected to the computer through separate USB links and identified as HIDs (Human Interface Devices). These devices require real-time interaction with the FlightSIM software; in total, 21 devices communicate at the same time. The TRC472 Flight Cockpit and the FlightSIM software run on a distributed system of computers and communicate over Ethernet. Serialization is used for data transfer across the connection link so that objects can be reproduced seamlessly on the different computers. Some of the TRC472 devices were straightforward to write to and read from, but others required calibration of raw I/O data and buffers. The project also required writing plugins to override and extend the FlightSIM software so that it communicates with the TRC472 Flight Cockpit. The final product is a full-fledged flight experience with the complete environment and physics of the Cessna 172. / Thesis / Master of Applied Science (MASc)
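The serialization step can be sketched as packing a device reading into a fixed binary layout on one machine and reproducing it on another. The field layout (device id, timestamp, reading) is an illustrative assumption, not the actual TRC472 wire format.

```python
# Sketch: serializing a cockpit device reading for transfer over Ethernet.
# The (id, timestamp, value) layout is a hypothetical example format.
import struct

FMT = "<Hdf"  # device id (uint16), timestamp (double), reading (float32)

def serialize(device_id, timestamp, value):
    return struct.pack(FMT, device_id, timestamp, value)

def deserialize(payload):
    return struct.unpack(FMT, payload)

packet = serialize(7, 1234.5, 118.0)  # e.g. airspeed indicator at 118 knots
device_id, ts, value = deserialize(packet)
print(device_id, ts, value)  # → 7 1234.5 118.0
```

A fixed binary layout keeps per-message overhead small, which matters when 21 devices stream readings concurrently in real time.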
15

Towards the Inference, Understanding, and Reasoning on Edge Devices

Ma, Guoqing 10 May 2023 (has links)
This thesis explores the potential of edge devices in three applications: indoor localization, urban traffic prediction, and multi-modal representation learning. For indoor localization, we propose a reliable data transmission network and a robust data processing framework based on visible light communications and machine learning to enhance the intelligence of smart buildings. For urban traffic prediction, we propose a dynamic spatial and temporal origin-destination feature enhanced deep network with a graph convolutional network that collaboratively learns a low-dimensional representation for each region to predict in-traffic and out-traffic for every city region simultaneously. For multi-modal representation learning, we propose using dynamic contexts to uniformly model visual and linguistic causalities, introducing a novel dynamic-contexts-based similarity metric that considers the correlation of potential causes and effects to measure the relevance among images. To enhance distributed training on edge devices, we introduce a new system called Distributed Artificial Intelligence Over-the-Air (AirDAI), which involves local training on raw data and sending trained outputs, such as model parameters, from local clients back to a central server for aggregation. To aid the development of AirDAI in wireless communication networks, we suggest a general system design and an associated simulator that can be tailored to wireless channels and system-level configurations. We also conduct experiments to confirm the effectiveness and efficiency of the proposed system design and present an analysis of the effects of wireless environments to facilitate future implementations and updates. Finally, this thesis proposes FedForest to address the communication and computation limitations in heterogeneous edge networks; it optimizes the global network by distilling knowledge from aggregated sub-networks.
The sub-network sampling process is differentiable, and the model size is used as an additional constraint to extract a new sub-network for the subsequent local optimization process. FedForest significantly reduces server-to-client communication and local device computation costs compared to conventional algorithms, while maintaining performance on par with the benchmark Top-K sparsification method. FedForest can accelerate the deployment of large-scale deep learning models on edge devices.
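The train-locally, aggregate-centrally loop behind AirDAI can be sketched with classic federated averaging, shown here as a stand-in for the aggregation the thesis describes; the toy model, learning rate, and gradients are illustrative assumptions.

```python
# Sketch: clients take a local step on private data and send only model
# parameters; the server averages them (federated averaging, used here as a
# generic stand-in for the thesis's aggregation step).
def local_update(weights, gradient, lr=0.1):
    """One local gradient step on a client's private data."""
    return [w - lr * g for w, g in zip(weights, gradient)]

def server_aggregate(client_weights):
    """Element-wise average of the clients' model parameters."""
    n = len(client_weights)
    return [sum(ws) / n for ws in zip(*client_weights)]

global_model = [1.0, 2.0]
client_grads = [[0.5, 0.5], [1.5, -0.5]]  # hypothetical per-client gradients
clients = [local_update(global_model, g) for g in client_grads]
global_model = server_aggregate(clients)
print(global_model)
```

Only the parameter vectors cross the network; the raw data never leaves the clients, which is the property both AirDAI and FedForest build on.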
16

Distributed Manufacturing Simulation Environment

Ma, Qingwei 27 November 2002 (has links)
No description available.
17

Clustering with Local Reconfigurations for Dynamic Distributed Systems

Kudireti, Abdurusul 17 June 2011 (has links)
We propose in this work distributed clustering algorithms designed to address the problem of growing networks. After giving a specification for this problem, we provide a first distributed algorithm based on random walks to solve it. This algorithm uses only local information, and uses random walks to build, in parallel, connected sets of nodes called cluster cores, to which adjacent nodes are then added. The size of each core is between 2 and a parameter of the algorithm. The algorithm guarantees that if two clusters are adjacent, at least one of them has a core of maximum size.
A second, mobility-adaptive, algorithm ensures, besides those properties, that the reconfiguration following a topological change is local. This property differentiates our solution from many existing ones: it avoids chain destructions following a topology change. Finally, we present a self-stabilizing clustering algorithm that preserves the properties of the previous algorithms and adds fault tolerance. Thanks to the parallel construction of clusters and the local nature of cluster reconstruction, these algorithms scale well, which is confirmed by the simulations we have conducted.
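The core-building idea can be sketched as follows: a random walk over the local neighbourhood gathers a connected set of nodes into a core until a size bound is reached, and remaining adjacent nodes join as ordinary cluster members. The walk policy, step cap, and size bound are illustrative assumptions, not the algorithm's exact rules.

```python
# Sketch: building a cluster core with a local random walk, then attaching
# adjacent nodes as members. Parameters are illustrative.
import random

def build_core(adj, start, max_core, rng):
    core = {start}
    current = start
    steps = 0
    while len(core) < max_core and steps < 100:
        current = rng.choice(adj[current])  # the walk only uses local links
        core.add(current)                   # cores stay connected by construction
        steps += 1
    return core

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
rng = random.Random(42)
core = build_core(adj, start=0, max_core=3, rng=rng)
members = {v for u in core for v in adj[u]} - core  # adjacent nodes join the cluster
print(sorted(core), sorted(members))
```

Because the walk never leaves the neighbourhood of the growing core, a topology change elsewhere in the network cannot invalidate this cluster, which is the locality property the second algorithm formalizes.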
18

Vehicular Group Membership Resilient to Malicious Attacks

Fischer, Benjamin January 2019 (has links)
There is a range of tools and techniques in the realm of information security that can be used to enhance the security of a distributed network protocol, and some of them introduce new problems. A security analysis of the distributed network protocol SLMP is made and three vulnerabilities are identified: messages can be intercepted and tampered with, nodes can fake their IDs, and leader nodes can do a lot of harm if they are malicious. Three versions of SLMP that aim to remedy these vulnerabilities are implemented, and the results show that while they remedy the vulnerabilities, some of them introduce new problems.
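A standard remedy for the first vulnerability, message interception and tampering, is to authenticate each protocol message with a keyed MAC; the sketch below shows the idea with HMAC-SHA256 under an assumed pre-shared group key (key distribution, a hard problem in its own right, is out of scope here), and is not claimed to be the thesis's implementation.

```python
# Sketch: detecting in-transit tampering with an HMAC over each message.
# The pre-shared key and message format are illustrative assumptions.
import hashlib
import hmac

KEY = b"pre-shared-group-key"

def sign(message: bytes) -> bytes:
    return hmac.new(KEY, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    # compare_digest avoids timing side channels in the tag comparison
    return hmac.compare_digest(sign(message), tag)

msg = b"LEADER_ELECT node=17"
tag = sign(msg)
print(verify(msg, tag))                      # → True: untampered
print(verify(b"LEADER_ELECT node=99", tag))  # → False: message was altered
```

Note that a MAC alone does not address the other two vulnerabilities: a node holding the group key can still fake its ID, and a malicious leader signs its own messages, which is one way hardening one layer shifts the problem elsewhere.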
19

Development of Application Layer Anycast Techniques for Quality of Service Provision in Cloud Computing

Adami, Lucas Junqueira 13 October 2015 (has links)
In recent years, the complexity and variety of services available on the Internet have grown, a fact that has driven the search for efficient techniques for routing a client's request to the best available server, one of which is known as application layer anycast (ALA). The objective of this research is to develop ways of providing application layer anycast that are scalable and that select the closest servers with the shortest possible latency, in the context of cloud computing. To achieve this goal, a new system, GALA (Global Application Layer Anycast), was proposed. It inherits features from an existing system and applies geolocation as its differentiator, in order to improve the algorithm's overall performance.
Simulation results indicated that the new system, compared to its predecessor, maintains the same request efficiency while considerably reducing latency. The proposed system was also deployed in a real environment to strengthen the simulation results. With the obtained data, the modeled system was validated and its effectiveness confirmed.
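The geolocation step can be sketched as choosing, among candidate replica servers, the one at the smallest great-circle distance from the client. The server list and coordinates below are illustrative assumptions, not GALA's actual topology.

```python
# Sketch: geolocation-based server selection via haversine distance.
# Server locations are hypothetical examples.
import math

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

servers = {
    "sao-paulo": (-23.55, -46.63),
    "frankfurt": (50.11, 8.68),
    "tokyo": (35.68, 139.69),
}

client = (-22.91, -43.17)  # e.g. a client in Rio de Janeiro
best = min(servers, key=lambda s: haversine_km(client, servers[s]))
print(best)  # → sao-paulo
```

Geographic distance is only a proxy for network latency, which is presumably why the deployed system still validates its choices against measured request times.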
20

A scalable data store and analytic platform for real-time monitoring of data-intensive scientific infrastructure

Suthakar, Uthayanath January 2017 (has links)
Real-time monitoring of data-intensive scientific infrastructures, covering jobs, data transfers, and hardware failures, is vital for efficient operation. Due to the high volume and velocity of the events produced, traditional methods are no longer optimal. Several techniques, as well as enabling architectures, are available to address the Big Data problem. In this respect, this thesis complements existing survey work by contributing an extensive literature review of both traditional and emerging Big Data architectures. Scalability, low latency, fault tolerance, and intelligence are key challenges for traditional architectures, whereas Big Data technologies and approaches have become increasingly popular for use cases that demand scalable, data-intensive (parallel) processing, fault tolerance (data replication), and support for low-latency computations. In the context of a scalable data store and analytics platform for monitoring data-intensive scientific infrastructure, the Lambda Architecture was adapted and evaluated on the Worldwide LHC Computing Grid, where it proved effective, especially for computationally and data-intensive use cases. In this thesis, an efficient strategy for the collection and storage of large volumes of data for computation is presented. Moving the transformation logic out of the data pipeline and into the analytics layers simplifies the architecture and the overall process: processing time is reduced, untampered raw data are kept at the storage level for fault tolerance, and the required transformations can be performed when needed. An optimised Lambda Architecture (OLA) is presented, which models an efficient way of joining the batch layer and the streaming layer with minimal code duplication in order to support scalability, low latency, and fault tolerance. Several models were evaluated: a pure streaming layer, a pure batch layer, and the combination of both batch and streaming layers.
Experimental results demonstrate that the OLA performed better than both the traditional architecture and the standard Lambda Architecture. The OLA was further enhanced by adding an intelligence layer for predicting data access patterns. The intelligence layer actively adapts and updates the model built by the batch layer, which eliminates re-training time while providing a high level of accuracy using Deep Learning techniques. The fundamental contribution to knowledge is a scalable, low-latency, fault-tolerant, intelligent, and heterogeneous architecture for monitoring a data-intensive scientific infrastructure that can benefit from Big Data technologies and approaches.
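The batch-plus-streaming join that the OLA optimises can be sketched at the serving layer: a query merges a periodically recomputed batch view with a real-time streaming delta, so results are both complete and fresh. The metric names and view contents are illustrative assumptions.

```python
# Sketch: the Lambda serving-layer merge of a batch view (complete, stale)
# with a speed view (partial, fresh). Contents are illustrative.
batch_view = {"job_failures": 120, "transfers": 4500}  # recomputed periodically
speed_view = {"job_failures": 3, "transfers": 17}      # incremented per event

def query(metric):
    """Serving layer: batch result plus the streaming delta since the last batch run."""
    return batch_view.get(metric, 0) + speed_view.get(metric, 0)

print(query("job_failures"))  # → 123
print(query("transfers"))     # → 4517
```

The code-duplication problem the OLA targets is visible even here: in a full Lambda deployment, the logic that produces both views must be written twice, once for the batch path and once for the streaming path, which is what "minimal code duplication" aims to eliminate.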
