  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
361

External Streaming State Abstractions and Benchmarking

Sree Kumar, Sruthi January 2021 (has links)
Distributed data stream processing is a popular research area and one of the promising paradigms for faster and more efficient data management. Application state is a first-class citizen in nearly every stream processing system, and stream processing is nowadays, by definition, stateful: for a stream processing application, state backs operations such as aggregations, joins, and windows. Apache Flink is one of the most widely adopted stream processing systems in industry. One of the main reasons engineers choose Apache Flink to write and deploy continuous applications is its unique combination of flexibility and scalability for stateful programmability, together with the firm guarantees the system provides: Flink keeps its state correct and consistent even when nodes fail or when the number of tasks changes. Flink state can scale up to its compute node's hard disk boundaries by using embedded databases to store and retrieve data. Nevertheless, in all state backends officially supported by Flink, state is always local to the compute tasks. Even though this makes deployment more convenient, it creates other challenges, such as non-trivial state reconfiguration and failure recovery, and it tightly couples compute and state. This strategy also leads to over-provisioning and is counterintuitive for workloads that are only state-intensive or only compute-intensive. This thesis investigates an alternative state backend architecture, FlinkNDB, which tackles these challenges. FlinkNDB decouples state and compute by using a distributed database to store the state. The thesis covers the shortcomings of existing state backends, the design choices behind the new state backend, and its implementation. We have evaluated the implementation of FlinkNDB against the existing state backends offered by Apache Flink.
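The architectural idea — decoupling state from compute by keeping operator state in a shared external store — can be pictured with a minimal sketch. Everything below (the store, the backend interface, the counting operator) is a hypothetical simplification for illustration, not FlinkNDB's actual API:

```python
# Minimal sketch of a decoupled state backend: operator state lives in an
# external store, so a restarted or rescaled task can resume by pointing at
# the same store instead of rebuilding local state. ExternalStore is a
# hypothetical in-memory stand-in for a distributed database.

class ExternalStore:
    """Stand-in for a distributed database shared by all compute tasks."""
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class DecoupledStateBackend:
    """State backend that reads and writes through the external store."""
    def __init__(self, store, operator_id):
        self.store = store
        self.operator_id = operator_id

    def value_state(self, key):
        return self.store.get((self.operator_id, key))

    def update_state(self, key, value):
        self.store.put((self.operator_id, key), value)


# A stateful counting operator: its state survives task restarts because
# it never lived on the task's local disk in the first place.
def count_operator(backend, record_key):
    count = (backend.value_state(record_key) or 0) + 1
    backend.update_state(record_key, count)
    return count


store = ExternalStore()
task = DecoupledStateBackend(store, operator_id="word-count")
for word in ["flink", "ndb", "flink"]:
    print(word, count_operator(task, word))

# Simulate failure recovery: a brand-new task instance sees the same state.
recovered = DecoupledStateBackend(store, operator_id="word-count")
print("flink after recovery:", recovered.value_state("flink"))  # -> 2
```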
362

The state of WebAssembly in distributed systems : With a focus on Rust and Arc-Lang

Moise, Theodor-Andrei January 2023 (has links)
With the current developments in modern web browsers, WebAssembly has been a rising trend over the last four years. Aimed at replacing parts of JavaScript's functionality, it brings extra features to achieve portability and sandboxing through virtualisation. Since the release of the WebAssembly System Interface, more and more projects have been using it outside web pages and browsers, in scenarios such as embedded, serverless, or distributed computing. It is thus relevant not only to the web and its clients but also to applications in distributed systems. Given the novelty of the topic, there is currently very little related scientific literature, and constant changes in development, proposals, and goals have left a large gap in relevant research. We aim to help bridge this gap by focusing on Rust and Arc-Lang, a domain-specific language for data analytics, in order to provide an overview of how far the technology has progressed, which runtimes exist, and how they work. We investigate what use cases WebAssembly could have in the context of distributed systems, and how it can benefit data processing pipelines. Even though the technology still looks immature at first glance, it is worth checking which of its proposals have been implemented and how its performance, compared to that of native Rust, affects data processing in a pipeline. We show this by benchmarking a filter program as part of a distributed container environment, while looking at different WebAssembly compilers such as Cranelift and LLVM. We then compare the resulting statistics to native Rust and present a synopsis of the state of WebAssembly in a distributed context.
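A benchmarking setup of this kind can be pictured as a small timing harness. The commands, file names, and flags below are illustrative assumptions rather than the thesis's actual configuration:

```python
# Sketch of a harness comparing a native Rust filter with the same program
# compiled to WebAssembly under different runtimes/compilers. Binary paths
# and CLI flags are illustrative assumptions, not the thesis's setup.

import statistics
import subprocess
import time

RUNS = 10
CANDIDATES = {
    # hypothetical commands; adjust to the actual artifacts under test
    "native rust": ["./filter-native"],
    "wasmtime (cranelift)": ["wasmtime", "run", "filter.wasm"],
    "wasmer (llvm)": ["wasmer", "run", "--llvm", "filter.wasm"],
}

def bench(cmd, input_path="events.json"):
    """Run one command RUNS times and return per-run wall-clock seconds."""
    timings = []
    for _ in range(RUNS):
        with open(input_path, "rb") as stdin:
            start = time.perf_counter()
            subprocess.run(cmd, stdin=stdin, stdout=subprocess.DEVNULL,
                           check=True)
            timings.append(time.perf_counter() - start)
    return timings

for name, cmd in CANDIDATES.items():
    t = bench(cmd)
    print(f"{name:22s} median {statistics.median(t)*1e3:8.2f} ms  "
          f"stdev {statistics.stdev(t)*1e3:6.2f} ms")
```

Wall-clock timing over whole process runs deliberately includes runtime startup and compilation cost, which is part of what distinguishes a JIT-style engine from a precompiled native binary in a container pipeline.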
363

Error isolation in distributed systems

Behrens, Diogo 25 May 2016 (has links) (PDF)
In distributed systems, if a hardware fault corrupts the state of a process, this error might propagate as a corrupt message and contaminate other processes in the system, causing severe outages. Recently, state corruptions of this nature have been observed surprisingly often in large computer populations, e.g., in large-scale data centers. Moreover, since the resilience of processors is expected to decline in the near future, the likelihood of state corruptions will increase even further. In this work, we argue that preventing the propagation of state corruption should be a first-class requirement for large-scale fault-tolerant distributed systems. In particular, we propose that developers target error isolation, the property in which each correct process ignores any corrupt message it receives. Typically, a process cannot decide whether a received message is corrupt or not. Therefore, we introduce hardening as a class of principled approaches to implement error isolation in distributed systems. Hardening techniques are (semi-)automatic transformations that enforce that each process appends evidence of good behavior, in the form of error codes, to all messages it sends. The techniques “virtualize” state corruptions into more benign failures such as crashes and message omissions: if a faulty process fails to detect its state corruption and abort, then hardening guarantees that any corrupt message the process sends has invalid error codes. Correct processes can then inspect received messages and drop them in case they are corrupt. With this dissertation, we contribute theoretically and practically to the state of the art in fault-tolerant distributed systems. To show that hardening is possible, we design, formalize, and prove correct different hardening techniques that enable existing crash-tolerant designs to handle state corruption with minimal developer intervention. To show that hardening is practical, we implement and evaluate these techniques, analyzing their effect on system performance and their ability to detect state corruptions in practice.
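A receive-side sketch of error isolation: each outgoing message carries an error code over its payload, and receivers silently drop messages whose code does not verify. Plain CRC-32 stands in here for the principled encodings a real hardening transformation would derive, so this illustrates the property rather than the techniques themselves:

```python
# Simplified sketch of error isolation: every outgoing message carries an
# error code computed over its payload, and receivers drop any message
# whose code does not verify. A real hardening transformation derives the
# codes from the sender's computation; CRC-32 is used here only to make
# the receive-side check concrete.

import json
import zlib

def harden(payload: dict) -> bytes:
    """Append an error code as evidence that the payload is uncorrupted."""
    body = json.dumps(payload, sort_keys=True).encode()
    code = zlib.crc32(body)
    return body + b"|" + str(code).encode()

def receive(message: bytes):
    """Return the payload if the error code verifies, else None (drop)."""
    body, _, code = message.rpartition(b"|")
    if str(zlib.crc32(body)).encode() != code:
        return None  # corrupt message: isolate the error, do not process
    return json.loads(body)

msg = harden({"op": "put", "key": "x", "value": 42})
print(receive(msg))                    # code verifies -> payload delivered

corrupted = msg.replace(b"42", b"43")  # a state corruption leaks out
print(receive(corrupted))              # check fails -> None (dropped)
```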
364

CHECKPOINTING AND RECOVERY IN DISTRIBUTED AND DATABASE SYSTEMS

Wu, Jiang 01 January 2011 (has links)
A transaction-consistent global checkpoint of a database records a state of the database which reflects the effect of only completed transactions and not the results of any partially executed transactions. This thesis establishes the necessary and sufficient conditions for a checkpoint of a data item (or the checkpoints of a set of data items) to be part of a transaction-consistent global checkpoint of the database. This result is useful for constructing transaction-consistent global checkpoints incrementally from the checkpoints of each individual data item of a database: by applying this condition, we can start from any useful checkpoint of any data item and then incrementally add checkpoints of other data items until we get a transaction-consistent global checkpoint of the database. This result can also help in designing non-intrusive checkpointing protocols for database systems. Based on the intuition gained from the development of the necessary and sufficient conditions, we also developed a non-intrusive low-overhead checkpointing protocol for distributed database systems. Checkpointing and rollback recovery are also established techniques for achieving fault-tolerance in distributed systems. Communication-induced checkpointing algorithms allow processes involved in a distributed computation to take checkpoints independently while at the same time forcing them to take additional checkpoints so that each checkpoint is part of a consistent global checkpoint. This thesis develops a low-overhead communication-induced checkpointing protocol and presents a performance evaluation of the protocol.
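The forced-checkpoint rule at the heart of communication-induced checkpointing fits in a few lines. The sketch below shows the classic index-based rule (piggyback a checkpoint sequence number on each message; checkpoint before delivering a message carrying a higher number) as an illustration — not necessarily the protocol developed in the thesis:

```python
# Minimal sketch of communication-induced checkpointing: each process
# piggybacks its checkpoint sequence number on every message, and a
# receiver takes a *forced* checkpoint before delivering a message that
# carries a higher number than its own, so every checkpoint can belong
# to some consistent global checkpoint.

class Process:
    def __init__(self, name):
        self.name = name
        self.sn = 0            # checkpoint sequence number
        self.state = []        # application state (delivered messages)

    def take_checkpoint(self, forced=False):
        self.sn += 1
        kind = "forced" if forced else "basic"
        print(f"{self.name}: {kind} checkpoint #{self.sn}, state={self.state}")

    def send(self, payload):
        return (self.sn, payload)  # piggyback own sequence number

    def deliver(self, message):
        sender_sn, payload = message
        if sender_sn > self.sn:
            # Delivering now would leave the last checkpoint unusable in
            # any consistent global checkpoint, so checkpoint first.
            self.take_checkpoint(forced=True)
        self.state.append(payload)


p, q = Process("P"), Process("Q")
p.take_checkpoint()          # P's basic (independent) checkpoint -> sn=1
msg = p.send("m1")           # carries sn=1
q.deliver(msg)               # Q has sn=0 < 1 -> forced checkpoint first
```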
365

Self-Management for Large-Scale Distributed Systems

Al-Shishtawy, Ahmad January 2012 (has links)
Autonomic computing aims at making computing systems self-managing by using autonomic managers in order to reduce obstacles caused by management complexity. This thesis presents results of research on self-management for large-scale distributed systems. This research was motivated by the increasing complexity of computing systems and their management. In the first part, we present our platform, called Niche, for programming self-managing component-based distributed applications. In our work on Niche, we have faced and addressed the following four challenges in achieving self-management in a dynamic environment characterized by volatile resources and high churn: resource discovery, robust and efficient sensing and actuation, management bottleneck, and scale. We present results of our research on addressing the above challenges. Niche implements the autonomic computing architecture, proposed by IBM, in a fully decentralized way. Niche supports a network-transparent view of the system architecture, simplifying the design of distributed self-management, and provides a concise and expressive API for self-management. The implementation of the platform relies on the scalability and robustness of structured overlay networks. We proceed by presenting a methodology for designing the management part of a distributed self-managing application. We define design steps that include partitioning of management functions and orchestration of multiple autonomic managers. In the second part, we discuss robustness of management and data consistency, which are necessary in a distributed system. Dealing with the effect of churn on management increases the complexity of the management logic and thus makes its development time-consuming and error-prone. We propose the abstraction of Robust Management Elements, which are able to heal themselves under continuous churn. Our approach is based on replicating a management element using finite state machine replication with a reconfigurable replica set. Our algorithm automates the reconfiguration (migration) of the replica set in order to tolerate continuous churn. For data consistency, we propose a majority-based distributed key-value store supporting multiple consistency levels that is based on a peer-to-peer network. The store enables the tradeoff between high availability and data consistency. Using majority avoids potential drawbacks of master-based consistency control, namely a single point of failure and a potential performance bottleneck. In the third part, we investigate self-management for Cloud-based storage systems with a focus on elasticity control using elements of control theory and machine learning. We have conducted research on a number of different designs of an elasticity controller, including a state-space feedback controller and a controller that combines feedback and feedforward control. We describe our experience in designing an elasticity controller for a Cloud-based key-value store using a state-space model that enables trading off performance for cost, and we describe the steps in designing such a controller. We continue by presenting the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores that combines feedforward and feedback control.
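As a rough illustration of the feedback side of elasticity control, the sketch below shows a simple proportional controller that resizes a cluster to hold a latency service-level objective. The gains, bounds, and metric are invented for the example, and the feedforward component that ElastMan adds is omitted:

```python
# Simplified sketch of the feedback half of an elasticity controller: a
# proportional controller that resizes a storage cluster to keep read
# latency near a service-level objective. ElastMan additionally uses a
# feedforward model to react to workload spikes; that part is omitted.

class ProportionalElasticityController:
    def __init__(self, slo_ms, gain, min_nodes=3, max_nodes=50):
        self.slo_ms = slo_ms      # target latency (e.g., 99th percentile)
        self.gain = gain          # nodes added per ms of SLO violation
        self.min_nodes = min_nodes
        self.max_nodes = max_nodes

    def decide(self, current_nodes, measured_ms):
        error = measured_ms - self.slo_ms       # positive -> SLO violated
        adjustment = round(self.gain * error)   # scale out or in
        target = current_nodes + adjustment
        return max(self.min_nodes, min(self.max_nodes, target))


controller = ProportionalElasticityController(slo_ms=10.0, gain=0.5)
nodes = 5
for latency in [9.0, 14.0, 22.0, 11.0, 8.0]:  # measured each control period
    nodes = controller.decide(nodes, latency)
    print(f"latency={latency:5.1f} ms -> scale to {nodes} nodes")
```

Even this toy loop exhibits the trade-off the thesis studies: a higher gain tracks load spikes faster but over-provisions (and oscillates) more, which is precisely what a state-space model or a feedforward term is meant to tame.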
366

A basis for intrusion detection in distributed systems using kernel-level data tainting.

Hauser, Christophe 19 June 2013 (has links) (PDF)
Modern organisations rely intensively on information and communication technology infrastructures. Such infrastructures offer a range of services from simple mail transport agents or blogs to complex e-commerce platforms, banking systems or service hosting, and all of these depend on distributed systems. The security of these systems, with their increasing complexity, is a challenge. Cloud services are replacing traditional infrastructures by providing lower cost alternatives for storage and computational power, but at the risk of relying on third party companies. This risk becomes particularly critical when such services are used to host privileged company information and applications, or customers' private information. Even in the case where companies host their own information and applications, the advent of BYOD (Bring Your Own Device) leads to new security related issues. In response, our research investigated the characterization and detection of malicious activities at the operating system level and in distributed systems composed of multiple hosts and services. We have shown that intrusions in an operating system spawn abnormal information flows, and we developed a model of dynamic information flow tracking, based on taint marking techniques, in order to detect such abnormal behavior. We track information flows between objects of the operating system (such as files, sockets, shared memory, processes, etc.) and network packets flowing between hosts. This approach follows the anomaly detection paradigm. We specify the legal behavior of the system with respect to an information flow policy, by stating how users and programs from groups of hosts are allowed to access or alter each other's information. Illegal information flows are considered as intrusion symptoms. We have implemented this model in the Linux kernel (the source code is available at http://www.blare-ids.org), as a Linux Security Module (LSM), and we used it as the basis for practical demonstrations. The experimental results validated the feasibility of our new intrusion detection principles.
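The detection principle lends itself to a compact illustration. The toy model below tags operating system objects with taint marks, propagates the marks on each flow, and raises an alert when an object ends up containing information its policy entry does not allow; it is a simplification of the idea, not the Blare implementation:

```python
# Toy model of policy-based information flow tracking: every OS-level
# object carries a set of taint marks naming the information it
# (transitively) contains, and an information flow policy lists which
# containers may legally hold which information.

class FlowTracker:
    def __init__(self, policy):
        # policy: container name -> set of information tags allowed in it
        self.policy = policy
        self.taints = {}  # object name -> tags it currently carries

    def init_object(self, obj, tags):
        self.taints[obj] = set(tags)

    def flow(self, src, dst):
        """Propagate taints on a flow (read/write/send) from src to dst."""
        self.taints.setdefault(dst, set())
        self.taints[dst] |= self.taints.get(src, set())
        illegal = self.taints[dst] - self.policy.get(dst, set())
        if illegal:
            print(f"ALERT: illegal flow {src} -> {dst}: tags {illegal}")


policy = {
    "/etc/shadow": {"secret"},
    "payroll.db": {"secret", "public"},
    "socket:web": {"public"},        # the web socket may not carry secrets
}
t = FlowTracker(policy)
t.init_object("/etc/shadow", {"secret"})
t.init_object("index.html", {"public"})

t.flow("index.html", "socket:web")   # legal: public data to the web
t.flow("/etc/shadow", "socket:web")  # intrusion symptom: secret leaks out
```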
367

Reliable peer to peer grid middleware

Leslie, Matthew John January 2011 (has links)
Grid computing systems are suffering from reliability and scalability problems caused by their reliance on centralised middleware. In this thesis, we argue that peer to peer middleware could help alleviate these problems. We show that peer to peer techniques can be used to provide reliable storage systems, which can be used as the basis for peer to peer grid middleware. We examine and develop new methods of providing reliable peer to peer storage, giving a new algorithm for this purpose, and assessing its performance through a combination of analysis and simulation. We then give an architecture for a peer to peer grid information system based on this work. Performance evaluation of this information system shows that it improves scalability when compared to the original centralised system, and that it withstands the failure of participant nodes without a significant reduction in quality of service. New contributions include dynamic replication, a new method for maintaining reliable storage in a Distributed Hash Table, which we show allows for the creation of more reliable, higher performance systems with lower bandwidth usage than current techniques. A new analysis of the reliability of distributed storage systems is also presented, which shows for the first time that replica placement has a significant effect on reliability. A simulation of the performance of distributed storage systems provides for the first time a quantitative performance comparison between different placement patterns. Finally, we show how these reliable storage techniques can be applied to grid computing systems, giving a new architecture for a peer to peer grid information service for the SAM-Grid system. We present a thorough performance evaluation of a prototype implementation of this architecture. Many of these contributions have been published at peer reviewed conferences.
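For context, the baseline that dynamic replication refines is the standard successor-list placement used in Distributed Hash Tables. The sketch below shows that baseline on a consistent-hashing ring; the node names and replication degree are illustrative:

```python
# Sketch of successor-list placement in a consistent-hashing ring: a key's
# replicas live on the r nodes that follow its hash clockwise. The thesis's
# dynamic replication refines when and where replicas move; this shows
# only the baseline placement such a scheme starts from.

import bisect
import hashlib

def ring_hash(name: str) -> int:
    """Map a name to a point on the ring (first 4 bytes of SHA-1)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest()[:4], "big")

class Ring:
    def __init__(self, nodes):
        self.points = sorted((ring_hash(n), n) for n in nodes)

    def replicas(self, key: str, r: int = 3):
        """The r nodes clockwise from the key's position on the ring."""
        ids = [p for p, _ in self.points]
        i = bisect.bisect_right(ids, ring_hash(key))
        return [self.points[(i + k) % len(self.points)][1] for k in range(r)]


ring = Ring([f"node{i}" for i in range(8)])
print("replicas for 'job-42':", ring.replicas("job-42"))

# When a node fails, recomputing placement without it shows which node
# must receive a new copy to restore the replication degree.
survivors = [f"node{i}" for i in range(8) if i != 3]
print("after node3 fails:   ", Ring(survivors).replicas("job-42"))
```

The finding that placement affects reliability is visible even here: successor placement concentrates a key's replicas on neighbouring ring positions, so correlated churn in one region of the ring threatens all copies at once, whereas more spread-out patterns trade bandwidth for independence of failures.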
368

Enabling Machine Science through Distributed Human Computing

Wagy, Mark David 01 January 2016 (has links)
Distributed human computing techniques have been shown to be effective ways of accessing the problem-solving capabilities of a large group of anonymous individuals over the World Wide Web. They have been successfully applied to such diverse domains as computer security, biology and astronomy. The success of distributed human computing in various domains suggests that it can be utilized for complex collaborative problem solving. Thus it could be used for "machine science": utilizing machines to facilitate the vetting of disparate human hypotheses for solving scientific and engineering problems. In this thesis, we show that machine science is possible through distributed human computing methods for some tasks. By enabling anonymous individuals to collaborate in a way that parallels the scientific method -- suggesting hypotheses, testing and then communicating them for vetting by other participants -- we demonstrate that a crowd can together define robot control strategies, design robot morphologies capable of fast-forward locomotion and contribute features to machine learning models for residential electric energy usage. We also introduce a new methodology for empowering a fully automated robot design system by seeding it with intuitions distilled from the crowd. Our findings suggest that increasingly large, diverse and complex collaborations that combine people and machines in the right way may enable problem solving in a wide range of fields.
369

Scheduling in Distributed Systems / Rozvrhování v distribuovaných systémech

Vyšohlíd, Jan January 2011 (has links)
The present work studies methods of scheduling in heterogeneous distributed systems. It first introduces the theoretical foundations, covering not only scheduling theory itself but also graph theory and computational complexity theory. It then presents compile-time scheduling methods and some well-known algorithms for the problem, followed by the basics of real-time scheduling and a classification of the methods used. The main part of the work proposes algorithms that respect additional constraints. These algorithms are tested with the enclosed application and compared both to each other and to other algorithms, most of which do not respect the additional constraints. The application and its documentation are also part of this work.
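As a point of reference for the compile-time methods discussed, the sketch below implements the classic list-scheduling heuristic for heterogeneous processors (map each task, in precedence order, to the processor giving the earliest finish time). The task graph and costs are invented, communication costs are ignored, and the additional constraints studied in the work are not modelled:

```python
# List scheduling on heterogeneous processors: walk the task DAG in a
# precedence-respecting order and map each task to the processor that
# yields the earliest finish time. Inter-processor communication costs
# are ignored for brevity.

# cost[task][proc]: execution time of task on each processor (heterogeneous)
cost = {"A": [3, 5], "B": [4, 2], "C": [6, 3], "D": [2, 2]}
deps = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}  # DAG edges

proc_free = [0.0, 0.0]   # time at which each processor becomes free
finish = {}              # task -> (finish time, processor)

for task in ["A", "B", "C", "D"]:        # a precedence-respecting order
    ready = max((finish[d][0] for d in deps[task]), default=0.0)
    # pick the processor with the earliest finish time for this task
    best = min(range(len(proc_free)),
               key=lambda p: max(ready, proc_free[p]) + cost[task][p])
    start = max(ready, proc_free[best])
    finish[task] = (start + cost[task][best], best)
    proc_free[best] = finish[task][0]

for task, (end, p) in finish.items():
    print(f"{task}: ends at t={end} on processor {p}")
print("makespan:", max(end for end, _ in finish.values()))
```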
370

A basis for intrusion detection in distributed systems using kernel-level data tainting.

Hauser, Christophe 19 June 2013 (has links)
Today's information systems, whether corporate networks, online services, or government organisations, very often rely on distributed systems involving a set of machines providing internal or external services. The security of such information systems is built at several levels (defence in depth): when these systems are set up, access control, authentication, and filtering policies (firewalls, etc.) are put in place to guarantee the security of the information. However, these systems are often complex and evolve constantly. It then becomes difficult to maintain a flawless security policy across the entire system (even supposing that were possible) and to resist the attacks to which these services are exposed daily. Intrusion detection systems have thus become necessary and belong to the set of security tools indispensable to every administrator of systems permanently exposed to potential attacks.

Intrusion detection systems fall into two broad families that differ in their analysis method: the scenario-based approach and the behavioural approach. The scenario-based approach is the most common and is used by well-known intrusion detection systems such as Snort, Prelude, and others. It consists of recognising signatures of known attacks in network traffic (for network IDSs) and in sequences of system calls (for host IDSs), i.e., detecting abnormal system behaviour linked to the presence of attacks. Although a large number of attacks can be detected this way, this approach cannot detect new attacks for which no signature is known. Moreover, modern malware often employs so-called binary morphism techniques to evade signature-based detection. The behavioural approach, by contrast, relies on a model of the system's normal operation. It can therefore detect new attacks as well as older ones, without recourse to any knowledge base of existing attacks. Several kinds of behavioural approaches exist: some models are statistical, while others are based on a security policy.

In this thesis, we address intrusion detection in distributed systems by adopting a behavioural approach based on a security policy, expressed in the form of an information flow policy. Information flows are tracked via a taint marking technique applied to operating system objects directly at the kernel level. Comparable approaches also exist at the language level (for example by instrumenting the Java virtual machine or by modifying application code) or at the architecture level (by emulating the microprocessor to trace information flows between registers, memory pages, etc.), allowing a fine-grained analysis of information flows. We chose, however, to work at the operating system level in order to satisfy the following objectives:
• Detect intrusions at all levels of the system, not specifically within one or more applications.
• Deploy our system in the presence of native applications whose source code is not necessarily available (which makes instrumenting them very difficult or even impossible).
• Use standard off-the-shelf hardware: physically modifying microprocessors is very difficult, and emulating them has a very significant impact on system performance.
