Global ETD Search

71	Distribuované aplikace s využitím frameworku Windows Communication Foundation / Distributed applications using Windows Communication Foundation framework Kišac, Matej January 2016 This thesis deals with distributed applications and the WCF framework. The first part presents the theoretical background of distributed systems and their models. The next part describes the WCF framework and the key elements of a WCF application. The following chapter introduces prime factorization. The knowledge from the previous parts is then used to create examples of service-oriented applications. In conclusion, we discuss the main steps in designing a distributed application that solves the factorization problem. Finally, the distributed application is compared with a dedicated one.
72	Vyvažování dat a dotazů založených na klíčových slovech v distribuovaných úložných systémech / Balancing Keyword-Based Data and Queries in Distributed Storage Systems Wirth, Martin January 2020 Research on load balancing in distributed systems has not yet produced an optimal load balancing technique. Existing approaches work primarily with replication and sharding. This thesis surveys existing knowledge in this area with a focus on sharding, and provides an experiment comparing a state-of-the-art load balancing technique called Weighed-Move with a random baseline and an existing domain-specific balancing implementation. As a significant part of the project, we engineered a generic and scalable load balancer that may be used in any distributed system and deployed it into an existing ad system called Sklik. The major challenges were tackling various problems related to data consistency, performance and synchronization, together with solving compatibility issues with the rest of the still-evolving ad system. Our experiment shows that the domain-specific load balancing implementation produces a data distribution that enables better performance, but Weighed-Move proved to have great potential, and its results are expected to improve with further work on our implementation.
73	JOB SCHEDULING FOR STREAMING APPLICATIONS IN HETEROGENEOUS DISTRIBUTED PROCESSING SYSTEMS Al-Sinayyid, Ali 01 December 2020 The colossal amounts of data generated daily are increasing exponentially at a never-before-seen pace. A variety of applications, including stock trading, banking systems, health care, the Internet of Things (IoT), and social media networks, have created an unprecedented volume of real-time stream data estimated to reach billions of terabytes in the near future. As a result, we are currently living in the so-called Big Data era and witnessing a transition to the so-called IoT era. Enterprises and organizations are tackling the challenge of interpreting the enormous amount of raw data streams to achieve an improved understanding of data, and thus make efficient and well-informed decisions (i.e., data-driven decisions). Researchers have designed distributed data stream processing systems that can directly process data in near real-time. To extract valuable information from raw data streams, analysts need to create and implement data stream processing applications structured as directed acyclic graphs (DAGs). The infrastructure of distributed data stream processing systems, as well as the various requirements of stream applications, imposes new challenges. Cluster heterogeneity in a distributed environment results in different cluster resources for task execution and data transmission, which makes optimal scheduling an NP-complete problem. Scheduling streaming applications plays a key role in optimizing system performance, particularly in maximizing the frame rate, that is, how many instances of data sets can be processed per unit of time. The scheduling algorithm must consider data locality, resource heterogeneity, and communication and computation latencies. 
The latencies associated with the bottleneck from computation or transmission need to be minimized when the application is mapped to the heterogeneous and distributed cluster resources. Recent work on task scheduling for distributed data stream processing systems has a number of limitations. Most current schedulers are not designed to manage heterogeneous clusters. They also lack the ability to consider both task and machine characteristics in scheduling decisions. Furthermore, current default schedulers do not allow the user to control data locality aspects in application deployment. In this thesis, we investigate the problem of scheduling streaming applications on a heterogeneous cluster environment and develop the maximum throughput scheduler algorithm (MT-Scheduler) for streaming applications. The proposed algorithm uses a dynamic programming technique to efficiently map the application topology onto a heterogeneous distributed system based on computing and data transfer requirements, while also taking into account the capacity of underlying cluster resources. The proposed approach maximizes the system throughput by identifying and minimizing the time incurred at the computing/transfer bottleneck. The MT-Scheduler supports scheduling applications that are structured as a DAG, such as Amazon Timestream, Google Millwheel, and Twitter Heron. We conducted experiments using three Storm microbenchmark topologies in both simulated and real Apache Storm environments. To evaluate performance, we compared the proposed MT-Scheduler with the simulated round-robin and the default Storm scheduler algorithms. The results indicated that the MT-Scheduler outperforms the default round-robin approach in terms of both average system latency and throughput. Apache Storm Big data Cloud computing Data stream Scheduling Distributed system Heterogeneous System
74	Leveraging virtualization technologies for resource partitioning in mixed criticality systems Li, Ye 28 November 2015 Multi- and many-core processors are becoming increasingly popular in embedded systems. Many of these processors now feature hardware virtualization capabilities, such as the ARM Cortex A15, and x86 processors with Intel VT-x or AMD-V support. Hardware virtualization offers opportunities to partition physical resources, including processor cores, memory and I/O devices, amongst guest virtual machines. Mixed criticality systems and services can then co-exist on the same platform in separate virtual machines. However, traditional virtual machine systems are too expensive because of the costs of trapping into hypervisors to multiplex and manage machine physical resources on behalf of separate guests. For example, hypervisors are needed to schedule separate VMs on physical processor cores. Additionally, traditional hypervisors have memory footprints that are often too large for many embedded computing systems. This dissertation presents the design of the Quest-V separation kernel, which partitions services of different criticality levels across separate virtual machines, or sandboxes. Each sandbox encapsulates a subset of machine physical resources that it manages without requiring intervention of a hypervisor. In Quest-V, a hypervisor is not needed for normal operation, except to bootstrap the system and establish communication channels between sandboxes. This approach not only reduces the memory footprint of the most privileged protection domain, but also removes that domain from the control path during normal system operation, thereby heightening security. Computer science Chip-level distributed system Mixed criticality system Separation kernel Virtualization
75	Disciplines basées sur la taille pour la planification des jobs dans data-intensif scalable computing systems / Size-based disciplines for job scheduling in data-intensive scalable computing systems Pastorelli, Mario 18 July 2014 The past decade has seen the rise of data-intensive scalable computing (DISC) systems, such as Hadoop, and the consequent demand for scheduling policies to manage their resources, so that they can provide quick response times as well as fairness. Schedulers for DISC systems usually focus on fairness, without optimizing response times. The best practices to overcome this problem include manual, ad hoc control of the scheduling policy, which is error-prone and difficult to adapt to changes. In this thesis we focus on size-based scheduling for DISC systems. The main contribution of this work is the Hadoop Fair Sojourn Protocol (HFSP) scheduler, a size-based preemptive scheduler with aging; it provides fairness and achieves reduced response times thanks to its size-based nature. In DISC systems, job sizes are not known a priori; therefore, HFSP includes a job size estimation module, which computes approximate job sizes and refines these estimates as jobs progress. We show that the impact of estimation errors on size-based policies is not significant, under conditions that are verified in a system such as Hadoop. Because of this, and by virtue of being designed around the idea of working with estimated sizes, HFSP is largely tolerant of job size estimation errors. 
Our experimental results show that, in a real Hadoop deployment and with realistic workloads, HFSP performs better than the built-in scheduling policies, achieving both fairness and a small mean response time. Moreover, HFSP maintains its good performance even when the cluster is heavily loaded, by focusing resources on a few selected jobs with the smallest size. HFSP is a preemptive policy, and preemption in a DISC system can be implemented with different techniques. Approaches currently available in Hadoop have shortcomings that affect system performance. Therefore, we have implemented a new preemption technique, called suspension, that exploits operating system primitives to implement preemption in a way that guarantees low latency without penalizing low-priority jobs. Distributed system Size-based job scheduling MapReduce
76	Accéler la préparation des données pour l'analyse du big data / Accelerating data preparation for big data analytics Tian, Yongchao 07 April 2017 We are living in a big data world, where data is being generated in high volume, at high velocity and in high variety. Big data brings enormous value and benefits, so data analytics has become a critically important driver of business success across all sectors. However, if the data is not analyzed fast enough, the benefits of big data will be limited or even lost. Despite the existence of many modern large-scale data analysis systems, data preparation, which is the most time-consuming process in data analytics, has not yet received sufficient attention. In this thesis, we study the problem of how to accelerate data preparation for big data analytics. In particular, we focus on two major data preparation steps, data loading and data cleaning. As the first contribution of this thesis, we design DiNoDB, a SQL-on-Hadoop system which achieves interactive-speed query execution without requiring data loading. 
Modern applications involve heavy batch processing jobs over large volumes of data and at the same time require efficient ad hoc interactive analytics on temporary data generated in batch processing jobs. Existing solutions largely ignore the synergy between these two aspects, requiring the entire temporary dataset to be loaded before interactive queries can run. In contrast, DiNoDB avoids the expensive data loading and transformation phase. The key innovation of DiNoDB is to piggyback the creation of metadata on the batch processing phase; DiNoDB then exploits this metadata to expedite interactive queries. The second contribution is a distributed stream data cleaning system, called Bleach. Existing scalable data cleaning approaches rely on batch processing to improve data quality, which is inherently time-consuming. We target stream data cleaning, in which data is cleaned incrementally in real time. Bleach is the first qualitative stream data cleaning system, which achieves both real-time violation detection and data repair on a dirty data stream. It relies on efficient, compact and distributed data structures to maintain the necessary state to clean data, and also supports rule dynamics. We demonstrate that the two resulting systems, DiNoDB and Bleach, both achieve excellent performance compared to state-of-the-art approaches in our experimental evaluations, and can help data scientists significantly reduce the time they spend on data preparation. Big data Database Distributed system Data cleaning
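77	Adaptation of a group to various environments through local interactions between individuals based on estimated global information / 個体の大域的情報推定に基づいた局所相互作用による集団の環境適応 Hayakawa, Tomohiro 23 September 2020 Appended degree program: グローバル生存学大学院連携プログラム (Inter-Graduate School Program for Sustainable Development and Survivable Societies) / Kyoto University / 0048 / New-system course doctorate / Doctor of Engineering / Kō No. 22771 / Doctor of Engineering No. 4770 / 新制||工||1746 (University Library) / Department of Mechanical Engineering and Science, Graduate School of Engineering, Kyoto University / (Chief examiner) Professor Fumitoshi Matsuno, Professor Tetsuo Sawaragi, Professor Kei Senda / Pursuant to Article 4, Paragraph 1 of the Degree Regulations / Doctor of Philosophy (Engineering) / Kyoto University / DFAM Autonomous distributed system Global information estimation Task allocation Multi-legged robot Reconfigurable modular robot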
78	A model-based approach for automatic recovery from memory leaks in enterprise applications Wang, Zimin 06 August 2011 Large-scale distributed computing systems such as data centers are hosted on heterogeneous and networked servers that execute in a dynamic and uncertain operating environment, caused by factors such as time-varying user workload and various failures. Therefore, achieving stringent quality-of-service goals is a challenging task, requiring a comprehensive approach to performance control, fault diagnosis, and failure recovery. This work presents a model-based approach for fault management, which integrates limited lookahead control (LLC), diagnosis, and fault-tolerance concepts to (1) enable systems to adapt to environment variations, (2) maintain the availability and reliability of the system, and (3) facilitate system recovery from failures. We focus on memory leak errors in this thesis. A characterization function is designed to detect memory leaks. Then, LLC is applied to enable the computing system to adapt efficiently to variations in the workload, and to enable the system to recover from memory leaks and maintain functionality. memory leak limited lookahead control distributed system web server fault management
79	Distributed Design on User Connectivity Maximization in UAV Based Communication Network Tripathi, Saugat 21 July 2023 No description available. Computer Engineering Electrical Engineering UCN distributed system MA-DQL correlated learning user connectivity
80	Exploring Transaction Anomalies under Weak Isolation Levels for General Database Applications Gan, Yifan January 2021 No description available. Computer Science Computer Engineering Database Transaction Isolation Concurrency Control Distributed System
