351

Improving Performance of Highly-Programmable Concurrent Applications by Leveraging Parallel Nesting and Weaker Isolation Levels

Niles, Duane Francis Jr. 15 July 2015 (has links)
The recent development of multi-core computer architectures has largely affected the creation of everyday applications, requiring the adoption of concurrent programming to make meaningful use of the divided processing power of computers. Applications must be split into sections able to execute in parallel without conflicting with one another, which necessitates some form of synchronization. The most commonly used methodology is lock-based synchronization; however, to obtain the best performance, developers must typically produce complex, low-level implementations for large applications, which easily introduces errors and hindrances. An abstraction from database systems, known as transactions, is a rising concurrency control design aimed at circumventing the programmability, composability, and scalability challenges of lock-based synchronization. Transactions execute their operations speculatively and are capable of being restarted (or rolled back) when conflicts arise between concurrent actions. Because such conflicts can occur late in the lifespan of a transaction, rolling back the entire transaction is costly for performance. One particular method, known as nesting, was created to counter that drawback. Nesting is the act of enclosing transactions within other transactions, essentially dividing the work into pieces called sub-transactions. These sub-transactions can roll back without affecting the entire main transaction, although general nesting models only allow one sub-transaction to perform work at a time. The first main contribution of this thesis is SPCN, an algorithm that parallelizes nested transactions while automatically handling any conflicts that may arise, removing that burden from application developers. Two versions of SPCN exist: Strict, which enforces that the sub-transactions' work becomes visible in a serialized order; and Relaxed, which allows sub-transactions to publish their results immediately as they finish (so invalidation may occur after the fact and must be handled). Despite the additional logic SPCN requires, it outperforms traditional closed nesting by 1.78x to 3.78x in the experiments run.
Another method for altering transactional execution to boost performance is to relax the rules of visibility for parallel operations (known as their isolation). Depending on the application, correctness is not broken even if some transactions see external work that may later be undone by a rollback, or if an object is written while another transaction is using an older instance of its data. With lock-based synchronization, developers would have to explicitly design their application with different numbers of locks, and different lock organizations or hierarchies, to change the strictness of the execution. With transactional systems, the processing performed by the system itself can be configured to use different rules, changing the performance of an application without requiring it to be largely redesigned. This notion leads to the second contribution of this thesis: AsR, or As-Serializable transactions. Serializability is the usual form of isolation, or strictness, for transactions in many applications; in terms of execution, it is equivalent to only one transaction running at a time in a given system. Many transactional systems use their own internal form of locking to create Serializable executions, but this is typically stricter than many applications require. AsR transactions allow the internal processing to be relaxed while additional metadata is maintained outside the system, without requiring any interaction from the developer or any changes to the given application. AsR transactions offer multiple orders of magnitude higher throughput in highly contended scenarios, owing to their capability to outlast traditional levels of isolation. / Master of Science
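The following is a minimal, illustrative Python sketch of the closed-nesting idea described above (not SPCN or the thesis's actual runtime; all class and method names are hypothetical): a sub-transaction buffers its writes and can be rolled back and retried on conflict without discarding the parent transaction's completed work.

```python
class ConflictError(Exception):
    """Raised when a sub-transaction's speculative work conflicts with another."""

class Subtransaction:
    def __init__(self, parent):
        self.parent = parent
        self.writes = {}                         # buffered writes, private until commit

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        self.parent.writes.update(self.writes)   # merge into the parent's write set

    def rollback(self):
        self.writes.clear()                      # discard only this sub-transaction's work

class Transaction:
    def __init__(self):
        self.writes = {}

    def run_nested(self, work, retries=3):
        """Run `work(sub)` as a sub-transaction, retrying it alone on conflict."""
        for _ in range(retries):
            sub = Subtransaction(self)
            try:
                work(sub)
                sub.commit()                     # partial rollback never touches self.writes
                return True
            except ConflictError:
                sub.rollback()
        return False

txn = Transaction()
txn.run_nested(lambda sub: sub.write("x", 1))    # only this piece would retry on conflict
```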
352

Generalized Consensus for Practical Fault-Tolerance

Garg, Mohit 07 September 2018 (has links)
Despite extensive research on Byzantine Fault Tolerant (BFT) systems, the overheads associated with such solutions preclude widespread adoption. Past efforts such as the Cross Fault Tolerance (XFT) model address this problem by making the weaker assumption that a majority of processes are correct and communicate synchronously. Although XPaxos of Liu et al. (which uses the XFT model) achieves performance similar to Paxos, it does not scale with the number of faults, and its reliance on a single leader introduces considerable downtime in case of failures. This thesis presents Elpis, the first multi-leader XFT consensus protocol. By adopting the Generalized Consensus specification from the Crash Fault Tolerance model, we were able to devise a multi-leader protocol that exploits the commutativity inherent in the commands ordered by the system. Elpis maps accessed objects to non-faulty processes during periods of synchrony; these processes then order all commands that access those objects. Experimental evaluation confirms the effectiveness of this approach: Elpis achieves up to 2x speedup over XPaxos and up to 3.5x speedup over state-of-the-art Byzantine Fault-Tolerant consensus protocols. / Master of Science / Online services like Facebook, Twitter, Netflix, and Spotify, as well as cloud services like Google and Amazon, serve millions of users, including individuals and organizations. They use many distributed technologies to deliver a rich experience, and the distributed nature of these technologies has removed geographical barriers to accessing data, services, software, and hardware. An essential aspect of these technologies is the concept of shared state. Distributed databases with multiple replicated data nodes are an example of this shared state. Maintaining replicated data nodes provides several advantages: (1) availability, so that if one node goes down the data can still be accessed from other nodes; (2) quick response times, since placing data nodes closer to the user lets data be obtained quickly; and (3) scalability, since multiple users can access different nodes so that a single node does not become a bottleneck. To maintain this shared state, some mechanism is required to maintain consistency; that is, the copies of the shared state must be identical on all the data nodes. This mechanism is called consensus, and several such mechanisms in practice today use the Crash Fault Tolerance (CFT) model, meaning they provide consistency in the presence of nodes crashing. While the state of the art for security has moved from assuming a trusted environment inside a firewall to a perimeter-less, semi-trusted environment with every service living on the internet, typically only the application layer is secured while the core is built with only crashes in mind. Comprehensive research exists on secure consensus mechanisms that use the Byzantine Fault Tolerance (BFT) model, but the extra costs required to implement them and their comparatively lower performance in a geographically distributed setting have impeded widespread adoption. A recently proposed model, Cross Fault Tolerance (XFT), tries to find a middle ground between these models: achieving security while paying no extra cost.
This thesis presents Elpis, a consensus mechanism that uses precisely this model to secure the shared state at its core, without modifications to existing setups, while delivering high performance and low response times. We perform a comprehensive evaluation on AWS and demonstrate that Elpis achieves a 3.5x speedup over the state of the art while improving response times by as much as 50%.
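The key enabler mentioned above is command commutativity: generalized consensus only needs to agree on a relative order for commands that conflict. Below is a small sketch of such a conflict test under the common read/write-on-shared-objects interpretation (an assumption made here for illustration; Elpis's actual conflict relation is defined in the thesis).

```python
# Two commands must be ordered only if they access a common object and at least
# one of them writes it; otherwise they commute and can commit in any order.

def conflicts(cmd_a, cmd_b):
    shared = cmd_a["objects"] & cmd_b["objects"]
    return bool(shared) and (cmd_a["write"] or cmd_b["write"])

deposit_x = {"objects": {"x"}, "write": True}
deposit_y = {"objects": {"y"}, "write": True}
read_x    = {"objects": {"x"}, "write": False}

print(conflicts(deposit_x, deposit_y))  # False: disjoint objects, no ordering needed
print(conflicts(deposit_x, read_x))     # True: both touch x and one writes it
```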
353

HyFlow: A High Performance Distributed Software Transactional Memory Framework

Saad Ibrahim, Mohamed Mohamed 14 June 2011 (has links)
We present HyFlow, a distributed software transactional memory (D-STM) framework for distributed concurrency control. Lock-based concurrency control suffers from drawbacks including deadlocks, livelocks, and scalability and composability challenges. These problems are exacerbated in distributed systems, where their distributed counterparts (e.g., distributed deadlocks) are more complex to cope with. STM and D-STM are promising alternatives to lock-based and distributed lock-based concurrency control for centralized and distributed systems, respectively, that overcome these difficulties. HyFlow is a Java framework for D-STM, with pluggable support for directory lookup protocols, transactional synchronization and recovery mechanisms, contention management policies, cache coherence protocols, and network communication protocols. HyFlow exports a simple distributed programming model that excludes locks: using (Java 5) annotations, atomic sections are defined as transactions, in which reads and writes to shared, local, and remote objects appear to take effect instantaneously. No changes are needed to the underlying virtual machine or compiler. We describe HyFlow's architecture and implementation, and report on experimental studies comparing HyFlow against competing models including Java remote method invocation (RMI) with mutual exclusion and read/write locks, distributed shared memory (DSM), and directory-based D-STM. / Master of Science
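HyFlow itself is a Java framework and marks atomic sections with Java 5 annotations; the Python sketch below is only a language-neutral illustration of the programming model the abstract describes (an atomic section executed speculatively and transparently retried on conflict), not HyFlow's API. The decorator, exception, and function names are hypothetical.

```python
import functools

class Aborted(Exception):
    """Raised by a (hypothetical) transactional runtime when a conflict forces a retry."""

def atomic(func):
    """Mark a function as an atomic section: re-execute it until it commits."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        while True:
            try:
                return func(*args, **kwargs)   # speculative execution of the section
            except Aborted:
                continue                       # conflicting work is discarded and retried
    return wrapper

@atomic
def transfer(accounts, src, dst, amount):
    # Reads and writes to shared (possibly remote) objects inside the section
    # appear to take effect instantaneously on commit.
    accounts[src] -= amount
    accounts[dst] += amount
```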
354

Optimizing Distributed Tracing Overhead in a Cloud Environment with OpenTelemetry

Elias, Norgren January 2024 (has links)
To gain observability in distributed systems, some form of telemetry generation and gathering must be implemented. This is especially important when systems have layers of dependencies on other microservices. One method for observability is distributed tracing: the act of building causal event chains, called traces, between microservices. With traces it is possible to find bottlenecks and dependencies within each call chain. One framework for implementing distributed tracing is OpenTelemetry. The developer must make several design choices when deploying OpenTelemetry in a Kubernetes cluster. For example, OpenTelemetry provides a collector that gathers spans (the parts of a trace emitted by microservices). These collectors can be deployed one per node (a daemonset) or one per service (sidecars). This study compared the performance impact of the sidecar and daemonset setups to that of having no OpenTelemetry implemented. The resources analyzed were CPU usage, network usage, and RAM usage. Tests were done in permutations of four scenarios: experiments were run on two and four nodes, with both balanced and unbalanced service placements. The experiments were run in a cloud environment using Kubernetes, and the tested system was an emulation of one of Nasdaq's systems based on real data from the company. The study concluded that adding OpenTelemetry increased resource usage in all cases. The daemonset setup, compared to no OpenTelemetry, increased CPU usage by 46.5%, network usage by 18.25%, and memory usage by 47.5% on average. The sidecar setup performed worse than the daemonset setup in most cases and for most resources, especially RAM and CPU usage.
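For context, the sketch below shows how a service instrumented with the OpenTelemetry Python SDK exports spans to a collector over OTLP; from the application's point of view, one practical difference between the two deployments is which address the exporter points at (a per-pod sidecar on localhost versus a node-local daemonset collector). The endpoint, service name, and attribute are placeholders, not the thesis's configuration.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Sidecar setup: the collector runs in the same pod, so spans go to localhost.
# Daemonset setup: the endpoint would instead be the node-local collector address.
exporter = OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True)

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))  # batches spans off the hot path
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("order-service")                 # hypothetical service name

with tracer.start_as_current_span("handle-request") as span:
    span.set_attribute("downstream.calls", 3)              # example attribute on the span
```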
355

Performance Overhead Of OpenTelemetry Sampling Methods In A Cloud Infrastructure

Karkan, Tahir Mert January 2024 (has links)
This thesis explores the overhead of distributed tracing in OpenTelemetry, using different sampling strategies, in a cloud environment. Distributed tracing produces telemetry data that allows developers to analyse causal events in a system with temporal information. This comes at the cost of overhead, in terms of CPU, memory, and network usage, as the telemetry data has to be generated and sent through collectors that handle traces and finally send them to a backend. By sampling with three different strategies (head-based sampling, tail-based sampling, and a mixture of the two), overhead can be reduced at the price of losing some information. To gauge the impact of this information loss, synthetic error messages are introduced into traces and used to measure how many traces with errors each sampling strategy can detect. All three sampling strategies were compared for services that sent more and less data between nodes in Kubernetes, and the experiments were run in both two-node and four-node setups. This thesis was conducted with Nasdaq, as it is in their interest to have high-performing monitoring tools; their systems were analysed and emulated for relevance. The thesis concluded that tail-based sampling had the highest overhead (71.33% CPU, 23.7% memory, and 5.6% network overhead on average compared to head-based sampling), for the benefit of capturing all the errors. Head-based sampling had the least overhead, except on the node that hosted Jaeger as the trace backend, where its higher total sampling rate added on average 12.75% CPU overhead in the four-node setup compared to mixed sampling; mixed sampling, however, captured more errors. When measuring the overall time taken for the experiments, the highest impact was observed when more requests had to be sent between nodes.
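As a point of reference, head-based sampling is a decision made in the SDK when a trace starts, while tail-based sampling is applied later in the collector once whole traces have been buffered (which is where the extra CPU, memory, and network cost comes from). Below is a sketch of the head-based side with the OpenTelemetry Python SDK, using an assumed 10% sampling ratio and a hypothetical service name.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

# Head-based sampling: keep ~10% of traces, decided up front from the trace ID,
# and let child spans follow their parent's decision so traces stay complete.
sampler = ParentBased(root=TraceIdRatioBased(0.10))

trace.set_tracer_provider(TracerProvider(sampler=sampler))
tracer = trace.get_tracer("payment-service")   # hypothetical service name

with tracer.start_as_current_span("charge-card"):
    pass  # only ~1 in 10 of these traces is recorded and exported
```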
356

SUPPORTING MULTIPLE ISOLATION LEVELS IN REPLICATED ENVIRONMENTS

Bernabe Gisbert, Jose Maria 20 March 2014 (has links)
Database replication provides reliability and scalability, but doing so transparently is not a simple task. A replicated database is transparent if it can replace a traditional centralized database without the rest of the system's components having to be adapted. Transparency in replicated databases can be achieved as long as (a) the management of replication is kept completely hidden from those components and (b) the same functionality as a traditional database is offered. To improve overall system performance, current centralized database managers allow transactions to execute concurrently under different isolation levels. For example, the TPC-C benchmark specification allows some transactions to run at weak isolation levels. However, this support is not yet available in replication protocols. In this thesis we show how such protocols can be extended to allow the execution of transactions at different isolation levels. / Bernabe Gisbert, JM. (2014). SUPPORTING MULTIPLE ISOLATION LEVELS IN REPLICATED ENVIRONMENTS [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/36535
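To make the idea concrete, the sketch below shows the client-side behaviour such replication protocols must support: transactions from the same workload running under different isolation levels. PostgreSQL-style SQL via psycopg2 and TPC-C-like table names are assumed here purely for illustration; the thesis targets the replication protocol, not this client code.

```python
import psycopg2

conn = psycopg2.connect("dbname=tpcc")   # hypothetical TPC-C-like database
conn.autocommit = False

# A read-mostly transaction can tolerate a weaker isolation level...
with conn.cursor() as cur:
    cur.execute("SET TRANSACTION ISOLATION LEVEL READ COMMITTED")
    cur.execute("SELECT count(*) FROM orders")
    conn.commit()

# ...while an update that must not observe anomalies requests serializability.
with conn.cursor() as cur:
    cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
    cur.execute("UPDATE stock SET quantity = quantity - 1 WHERE item_id = %s", (42,))
    conn.commit()
```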
357

INTEGRATION OF A BATCH SUBMISSION SYSTEM WITH A CLOUD COMPUTING ENVIRONMENT

DALTRO SIMOES GAMA 20 July 2016 (has links)
Cloud computing appeals to those who need many machines to run their programs, attracted by low maintenance costs and easy configuration. In this work we implemented a new integration for the CSGrid system, from Tecgraf/PUC-Rio, enabling it to submit workloads to the Microsoft Azure public cloud, thus enjoying the benefits of elastic computing resources. For this purpose, we present related works and some performance measures for CSGrid's use of the Microsoft Azure public cloud, with regard to the costs of data transfers and the provisioning of virtual machines. With this integration, we could evaluate the benefits and difficulties involved in using cloud resources in a system designed for the submission of HPC applications to clusters.
358

Using Task Parallelism for Distributed Parallel Skeleton Programming: Implementing a StarPU Back-End to SkePU 2

Henrik, Henriksson January 2024 (has links)
We extended the parallel skeleton programming framework SkePU 2 with a new back-end utilizing StarPU, a task programming framework for hybrid and distributed architectures. The aim was to allow SkePU to run on distributed clusters, using MPI through StarPU. The implemented back-end distributes data and work across participating ranks. While we did not implement the full SkePU API, the Map and Reduce1D skeletons were successfully implemented. During the implementation, we discovered some differences in API design between SkePU and StarPU: we combine the type-safe templates used in the SkePU API with the C-style, void*-heavy API of StarPU, which requires the implementation to use more complex templates than normally desired. While we could preserve most of the SkePU 2 API when moving to a distributed memory setting, some parts had to change; in particular, we needed to change the semantics of SkePU 2 containers with regard to iterators and random access. We benchmarked the performance of the implemented back-end against an MPI+OpenMP reference implementation on two problems, n-body and a simple reduction. While the n-body problem demonstrates promising scaling properties, reductions do not scale well to larger numbers of ranks. A performance comparison against the MPI+OpenMP reference implementation reveals that, aside from the higher communication overhead, there may also be some overhead in the work performed between communications, potentially performing at below 60-70% of the reference. In most cases, the new back-end to SkePU exhibits significantly lower performance than the reference. Extending the implemented solution to cover the full API and improving performance could provide a high-level interface to distributed programming for application programmers. Indeed, subsequent developments of SkePU 3 extend and improve our StarPU back-end.
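The execution pattern the back-end has to realize for Map and Reduce1D is illustrated below with mpi4py, purely for illustration (SkePU and StarPU do this in C++ with their own containers and task scheduler): each rank owns a slice of the container, applies the user function locally, reduces its slice, and the partial results are combined across ranks.

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n = 1_000_000
lo, hi = rank * n // size, (rank + 1) * n // size   # this rank's slice of the data

local = sum(x * x for x in range(lo, hi))           # map (square) + local reduce
total = comm.allreduce(local, op=MPI.SUM)           # combine partial results over MPI

if rank == 0:
    print("sum of squares:", total)
```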
359

Extending Model Checking Using Inductive Proofs in Distributed Digital Currency Protocols

Storey, Kyle R. 26 June 2023 (has links) (PDF)
Model checking is an effective method to verify both safety and liveness properties in distributed systems. However, the complexity of model checking grows exponentially with the number of entities, which makes it suitable only for small systems. Interactive theorem provers allow for machine-checked proofs, and these proofs can include inductive reasoning, which allows them to reason about an arbitrarily large number of entities; however, proving safety and liveness properties this way can be difficult. This work explores how combining model checking and inductive proofs can be an effective method for formally verifying complex distributed protocols. This is demonstrated on a part of MyCHIPs, a novel digital currency based on the value of personal credit. It was selected as a case study because it requires certain properties to hold for a non-trivial distributed algorithm.
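As a point of reference for the first half of that combination, the sketch below is a toy explicit-state model checker: it enumerates every reachable state of a fixed, small system and checks an invariant, which is exactly the part whose cost grows with the number of entities. The token-passing protocol is illustrative only, not MyCHIPs.

```python
def model_check(initial, transitions, invariant):
    """Explore all reachable states; return (True, None) or (False, bad_state)."""
    seen, frontier = set(), [initial]
    while frontier:
        state = frontier.pop()
        if state in seen:
            continue
        seen.add(state)
        if not invariant(state):
            return False, state
        frontier.extend(transitions(state))
    return True, None

# Toy protocol: two parties pass a single token back and forth.
def transitions(state):
    return [(state[1], state[0])]            # swap who holds the token

holds_one_token = lambda s: sum(s) == 1      # safety invariant: exactly one token exists

print(model_check((1, 0), transitions, holds_one_token))   # (True, None)
```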
360

Cooperative Automated Vehicle Movement Optimization at Uncontrolled Intersections using Distributed Multi-Agent System Modeling

Mahmoud, Abdallah Abdelrahman Hassan 28 February 2017 (has links)
Optimizing connected automated vehicle movements through roadway intersections is a challenging problem. Traditional traffic control strategies, such as traffic signals, are not optimal, especially for heavy traffic. Alternatively, centralized automated vehicle control strategies are costly and not scalable, given that the ability of a central controller to track and schedule the movement of hundreds of vehicles in real time is highly questionable. In this research, a series of fully distributed heuristic algorithms is proposed in which vehicles in the vicinity of an intersection continuously cooperate with each other to develop a schedule that allows them to safely proceed through the intersection while incurring minimum delays. An algorithm is first proposed for the case of an isolated intersection; a number of algorithms are then proposed for a network of intersections, where neighboring intersections communicate directly or indirectly to help the distributed control at each intersection make a better estimate of traffic in the whole network. An algorithm based on the Godunov scheme outperformed optimized signalized control, and the simulated experiments show significant reductions in average delay. The base algorithm was successfully added to the INTEGRATION micro-simulation model, and the results demonstrate improvements in delay, fuel consumption, and emissions when compared to roundabout, signalized, and stop-sign-controlled intersections. The study also shows the capability of the proposed technique to favor emergency vehicles, producing significant increases in mobility with minimum delays to the other vehicles in the network. / Ph. D. / Intelligent self-driving cars are much closer to reality than fiction. Technological advances make it feasible to produce such vehicles at low, affordable cost. This type of vehicle also promises to significantly reduce car accidents, saving lives and preserving health. Moreover, the congested roads in cities and metropolitan areas, especially at rush hour, can benefit from this technology to avoid, or at least reduce, the delays experienced by car passengers during their trips. One major challenge facing the operation of an intelligent self-driving car is how to pass through an intersection as fast as possible without any collision with cars approaching from other directions. Current traffic lights and stop signs are not the best way to exploit the capabilities of future cars. The aim of this dissertation is to study and propose ways to make sure future intersections are ready for such self-driving intelligent cars. Assuming that an intersection has no traditional controls such as traffic lights or stop signs, this research shows how vehicles can pass safely with minimum waiting. The proposed techniques focus on providing low-cost solutions that do not require the installation of expensive devices at intersections, which would make approval by authorities difficult. The proposed techniques can be applied to intersections of various sizes. The algorithms in this dissertation carefully design a way for vehicles in a network of intersections to communicate and cooperate while passing through an intersection. The algorithms are extensively compared to the use of traffic lights, stop signs, and roundabouts, and results show significant improvements in delay reduction and fuel consumption when the proposed techniques are used.
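Below is a toy sketch of the scheduling idea (not the dissertation's algorithm, which is fully distributed and cooperative): each approaching vehicle is granted the earliest crossing interval that does not overlap any already-granted interval of a conflicting movement. The movement labels and the rule that perpendicular movements conflict are simplified assumptions for illustration.

```python
def schedule(vehicles):
    """vehicles: iterable of (vehicle_id, movement, earliest_arrival, crossing_time).
    Perpendicular movements (e.g. 'NS' vs 'EW') are assumed to conflict."""
    booked, plan = [], {}
    for vid, movement, arrival, duration in sorted(vehicles, key=lambda v: v[2]):
        start = arrival
        changed = True
        while changed:                        # push the entry time past every conflict
            changed = False
            for s, e, m in booked:
                if m != movement and start < e and start + duration > s:
                    start, changed = e, True
        booked.append((start, start + duration, movement))
        plan[vid] = start
    return plan

# Three vehicles contend for the intersection: the EW vehicle waits for the first
# NS vehicle to clear, and the second NS vehicle then waits for the EW vehicle.
print(schedule([("v1", "NS", 0.0, 2.0), ("v2", "EW", 0.5, 2.0), ("v3", "NS", 1.0, 2.0)]))
```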
