Global ETD Search

1	Escalonamento adaptativo para sistemas de processamento contínuo de eventos. / Adaptive scheduling for continuous event processing systems. SOUSA, Rodrigo Duarte. 13 April 2018. Made available in DSpace on 2018-04-13. Previous issue date: 2014-08-04. Event stream processing systems are increasingly used in applications that require near real-time processing. That need, combined with the large volume of data these applications process, places strong performance and fault-tolerance requirements on such systems. To meet these requirements, schedulers usually rely on the resource utilization of the system's machines (such as CPU, RAM, disk, and network bandwidth) in an attempt to react to potential overloads that could further increase utilization and degrade application performance. However, because application profiles and components differ, it is hard to decide, in a flexible and generic way, which resources should be monitored and what makes one resource more important than another at a given moment, and this difficulty can lead the scheduler to make poor choices. In this work, we propose a scheduling algorithm that, through a reactive approach, adapts to different application and load profiles, making decisions based on the latency variation of its operators. Periodically, the scheduler evaluates which operators show signs of being overloaded and then tries to migrate those operators to less loaded nodes. Experiments with a prototype of the algorithm showed an improvement in system performance, with lower processing latency and a sustained number of processed events. In scenarios with a bursty workload, the operators' average processing latency was reduced by more than 84%, while the number of processed events decreased by only 1.18%.
Computer Science; Distributed Systems; Scheduling; Event stream processing systems
2	External Streaming State Abstractions and Benchmarking / Extern strömmande statliga abstraktioner och benchmarking. Sree Kumar, Sruthi. January 2021. Distributed data stream processing is a popular research area and one of the promising paradigms for faster and more efficient data management. Application state is a first-class citizen in nearly every stream processing system; today, stream processing is, by definition, stateful. In a stream processing application, state backs operations such as aggregations, joins, and windows. Apache Flink is one of the most widely used stream processing systems in industry. One of the main reasons engineers choose Apache Flink to write and deploy continuous applications is its combination of flexibility and scalability for stateful programming, together with the firm guarantees the system provides. These guarantees keep Flink's state correct and consistent even when nodes fail or when the number of tasks changes. Flink state can scale up to the hard disk capacity of its compute node by using embedded databases to store and retrieve data. Nevertheless, in all existing state backends officially supported by Flink, the state is always local to the compute tasks. Although this makes deployment more convenient, it creates other challenges, such as non-trivial state reconfiguration and failure recovery. It also means that compute and state are tightly coupled, which leads to over-provisioning and is counterintuitive for workloads that are only state-intensive or only compute-intensive. This thesis investigates an alternative state backend architecture, FlinkNDB, which can tackle these challenges. FlinkNDB decouples state from compute by using a distributed database to store the state. The thesis covers the challenges of existing state backends, the design choices, and the implementation of the new state backend. We have evaluated the implementation of FlinkNDB against the existing state backends offered by Apache Flink.
Apache Flink; Distributed Systems; NDB; FlinkNDB; State; State Backends; External State; Stream Processing Systems; Benchmarking; Caching; Computer and Information Sciences
3	Enhancing availability in large scale storage systems and services: architectures and techniques. Seshadri, Sangeetha. 04 May 2009. Enterprises today are dealing with extremely large amounts of critical digital information that continues to grow at an astonishing rate. At the same time, storage software (firmware, middleware) and systems are becoming much more complex, and existing failure recovery mechanisms are insufficient to handle the scale of these systems while meeting high availability and service quality expectations. In addition, the concurrent development and quality assurance processes, the large number of test scenarios, and the large scale of these systems and services imply that failures will be the norm rather than the exception. Achieving high availability and reliability in storage systems therefore remains a major concern and an open research challenge. Most existing work in the domain of storage system availability addresses failures of the storage media (such as disks) and recoverability from these failures. However, failures at the firmware and middleware layers remain largely unaddressed. This dissertation research addresses these challenges in depth across different storage architectures. Concretely, we make the following contributions. First, we develop a recovery-conscious framework for multi-core architectures and a suite of techniques for performing efficient fine-grained recovery (micro-recovery) in storage controller firmware that can be retrofitted into legacy code. The framework includes a task-level recovery mechanism, the Log(Lock) architecture that allows system state restoration during micro-recovery, and recovery-conscious scheduling algorithms that are designed to reduce the ripple effect of failure and improve recovery efficiency and system availability. Our second technical contribution addresses storage middleware availability. We develop the notion of hierarchical middleware architectures by organizing critical cluster management services into a hierarchical overlay network, which separates persistent application state from global system control state, and demonstrate significant improvement in the availability and reliability of enterprise-scale storage systems. In addition, we develop the notion of operator reuse and a suite of reuse techniques to improve data availability. The key idea of operator reuse is to utilize system resources efficiently by exploiting reuse opportunities in both the operators and the persistent state of computing nodes. We demonstrate our design through STREAMREUSE, a reuse-conscious store-forward network of storage nodes, which offers distributed stream query processing services.
Query optimization; Software recovery; Middleware; Firmware; Stream processing systems; Storage systems; High-availability architectures; Data libraries; Data warehousing; Computer firmware; Data recovery (Computer science)
4	Real-Time Failure Event Streaming of Continuous Integration Builds / Realtidsströmning av Felhändelser i Kontinuerlig Integration. Seifert, Felix. January 2022. An application build compiles and links the source code of a developed application into libraries and executables. A Continuous Integration (CI) build executes such a build after the source code has been changed and tries to integrate the changes into the existing application. CI builds are executed automatically and include automated software tests, which assure the developer that the changes are technically correct. When too much time passes between the discovery of a test failure and the notification of the developer, the development process is impacted negatively and the beneficial effects of CI decrease. Although several companies already have CI systems that display all events of a single CI build on a terminal during runtime, bigger applications often involve several CI builds in a single CI pipeline to integrate code changes. Observing the events of these CI builds during runtime might require concurrent monitoring of several different terminals. This thesis overcomes this issue by developing a Proof of Concept (PoC) that streams the test failures of a whole CI pipeline to the developer in real time. To show the feasibility of real-time failure event streaming of CI builds, the PoC is implemented within Spotify's CI for client-facing applications. The issues highlighted by this initial PoC will help to refine the whole CI practice. Furthermore, the faster feedback cycles realised by this PoC will increase the productivity, efficiency, and happiness of the developers involved and, eventually, the quality of the developed software.
Continuous Integration; Build; Streaming; Stream Processing Systems; Realtime systems; Developer Productivity Engineering; Computer and Information Sciences
5	Benchmarking and Scheduling Strategies for Distributed Stream Processing. Shukla, Anshu. January 2017. The velocity dimension of Big Data refers to the need to rapidly process data that arrives continuously as streams of messages or events. Distributed Stream Processing Systems (DSPS) are distributed programming and runtime platforms that allow users to define a composition of dataflow logic that is executed on distributed resources over streams of incoming messages. A DSPS uses commodity clusters and Cloud Virtual Machines (VMs) for its execution. To meet the performance required by these applications, the DSPS needs to schedule these dataflows efficiently over the resources. Despite their growing use, resource scheduling for DSPSs tends to be done in an ad hoc manner, favoring empirical and reactive approaches over a model-driven and analytical approach. Such empirical strategies may arrive at an approximate schedule for the dataflow that needs further tuning to meet the quality of service. We propose a model-based scheduling approach that makes use of performance profiles and benchmarks developed for tasks in the dataflow to plan both the resource allocation and the resource mapping that together form the schedule planning process. We propose the Model Based Allocation (MBA) and the Slot Aware Mapping (SAM) approaches, which effectively utilize knowledge of the performance model of logic tasks to provide efficient and predictable scheduling behavior. We implemented and validated these algorithms using the popular open source Apache Storm DSPS for several micro and application dataflows. The results show that our model-driven approach is able to reduce the amount of required resources (VMs) by 30% to 50% relative to existing techniques. We also see that our strategies offer predictable behavior, ensuring that the expected and actual supported rates and resources used match closely. This can enable deterministic schedule planning even under dynamic conditions. Besides this static scheduling, we also examine the ability to dynamically consolidate tasks onto fewer VMs when the load on the dataflow decreases or the VMs get fragmented. We propose reliable task migration models for Apache Storm dataflows that are able to rapidly move the task assignment in the cluster and resume the dataflow execution without any message loss.
Distributed Stream Processing; Distributed Programming; Apache Storm; Dataflows; Stream Processing; Benchmark; IoT Applications; Streaming Dataflows; Cloud Virtual Machines (VMs); Model Based Allocation (MBA); Slot Aware Mapping (SAM); Computer Science
