Global ETD Search

1	Theories for Session-based Governance for Large-scale Distributed Systems Chen, Tsu-Chun January 2013 (has links) Large-scale distributed systems and distributed computing are the pillars of IT infrastructure and society nowadays. Robust theoretical principles for designing, building, managing and understanding the interactive behaviours of such systems need to be explored. A promising approach for establishing such principles is to view the session as the key unit for design, execution and verification. Governance is a general term for verifying whether activities meet the specified requirements and for enforcing safe behaviours among processes. This thesis, based on the asynchronous -calculus and the theory of session types, provides a monitoring framework and a theory for validating specifications, verifying mutual behaviours during runtime, and taking actions when noncompliant behaviours are detected. We explore properties and principles for governing large-scale distributed systems, in which autonomous and heterogeneous system components interact with each other in the network to accomplish application goals. This thesis, incorporating lessons from my participation in a substantial practical project, the Ocean Observatories Initiative (OOI), proposes an asynchronous monitoring framework and the process calculus for dynamically governing the asynchronous interactions among distributed multiple applications. We prove that this monitoring model guarantees the satisfaction of global assertions, and state and prove theorems of local and global safety, transparency, and session fidelity. We also study and introduce the semantic mechanisms for runtime session-based governance and the principles of validation of stateful specifications through capturing the runtime asynchronous interactions. 621.39
2	On the Use of Double Auctions in Resource Allocation Problems in Large-scale Distributed Systems Feng, Yuan 24 August 2011 (has links) In this thesis, we explore the use of double auction markets as a general approach to tackle resource allocation problems in large-scale distributed systems, which are traditionally solved using optimization techniques. Prevalently adopted in real-world markets, double auctions have the power of arbitrating mappings between participating players and trading commodities in a decentralized fashion, with every player trying to maximize her own utility selfishly. Through the design of prefetching strategies in peer-assisted video-on-demand systems, we show how the problem of minimizing server bandwidth costs by reallocating media contents can be solved by double auction markets gracefully. However, not every resource allocation problem satisfies requirements of double auctions. We illustrate the limitation of double auctions with an example of virtual machine migration in container-based datacenters, which is then modeled into a Nash bargaining game and solved by a Nash bargaining solution. double auction resource allocation large-scale distributed systems 0984
3	On the Use of Double Auctions in Resource Allocation Problems in Large-scale Distributed Systems Feng, Yuan 24 August 2011 (has links) In this thesis, we explore the use of double auction markets as a general approach to tackle resource allocation problems in large-scale distributed systems, which are traditionally solved using optimization techniques. Prevalently adopted in real-world markets, double auctions have the power of arbitrating mappings between participating players and trading commodities in a decentralized fashion, with every player trying to maximize her own utility selfishly. Through the design of prefetching strategies in peer-assisted video-on-demand systems, we show how the problem of minimizing server bandwidth costs by reallocating media contents can be solved by double auction markets gracefully. However, not every resource allocation problem satisfies requirements of double auctions. We illustrate the limitation of double auctions with an example of virtual machine migration in container-based datacenters, which is then modeled into a Nash bargaining game and solved by a Nash bargaining solution. double auction resource allocation large-scale distributed systems 0984
4	Efficient data and metadata processing in large-scale distributed systems Shi, Rong, Shi January 2018 (has links) No description available. Computer Science Computer Engineering
5	Planning And Scheduling For Large-scaledistributed Systems Yu, Han 01 January 2005 (has links) Many applications require computing resources well beyond those available on any single system. Simulations of atomic and subatomic systems with application to material science, computations related to study of natural sciences, and computer-aided design are examples of applications that can benefit from the resource-rich environment provided by a large collection of autonomous systems interconnected by high-speed networks. To transform such a collection of systems into a user's virtual machine, we have to develop new algorithms for coordination, planning, scheduling, resource discovery, and other functions that can be automated. Then we can develop societal services based upon these algorithms, which hide the complexity of the computing system for users. In this dissertation, we address the problem of planning and scheduling for large-scale distributed systems. We discuss a model of the system, analyze the need for planning, scheduling, and plan switching to cope with a dynamically changing environment, present algorithms for the three functions, report the simulation results to study the performance of the algorithms, and introduce an architecture for an intelligent large-scale distributed system. Large-scale Distributed Systems Planning Scheduling Genetic Algorithms Computer Sciences Engineering
6	Scheduling And Resource Management For Complex Systems: From Large-scale Distributed Systems To Very Large Sensor Networks Yu, Chen 01 January 2009 (has links) In this dissertation, we focus on multiple levels of optimized resource management techniques. We first consider a classic resource management problem, namely the scheduling of data-intensive applications. We define the Divisible Load Scheduling (DLS) problem, outline the system model based on the assumption that data staging and all communication with the sites can be done in parallel, and introduce a set of optimal divisible load scheduling algorithms and the related fault-tolerant coordination algorithm. The DLS algorithms introduced in this dissertation exploit parallel communication, consider realistic scenarios regarding the time when heterogeneous computing systems are available, and generate optimal schedules. Performance studies show that these algorithms perform better than divisible load scheduling algorithms based upon sequential communication. We have developed a self-organization model for resource management in distributed systems consisting of a very large number of sites with excess computing capacity. This self-organization model is inspired by biological metaphors and uses the concept of varying energy levels to express activity and goal satisfaction. The model is applied to Pleiades, a service-oriented architecture based on resource virtualization. The self-organization model for complex computing and communication systems is applied to Very Large Sensor Networks (VLSNs). An algorithm for self-organization of anonymous sensor nodes called SFSN (Scale-free Sensor Networks) and an algorithm utilizing the Small-worlds principle called SWAS (Small-worlds of Anonymous Sensors) are introduced. The SFSN algorithm is designed for VLSNs consisting of a fairly large number of inexpensive sensors with limited resources. An important feature of the algorithm is the ability to interconnect sensors without an identity, or physical address used by traditional communication and coordination protocols. During the self-organization phase, the collision-free communication channels allowing a sensor to synchronously forward information to the members of its proximity set are established and the communication pattern is followed during the activity phases. Simulation study shows that the SFSN ensures the scalability, limits the amount of communication and the complexity of coordination. The SWAS algorithm is further improved from SFSN by applying the Small-worlds principle. It is unique in its ability to create a sensor network with a topology approximating small-world networks. Rather than creating shortcuts between pairs of diametrically positioned nodes in a logical ring, we end up with something resembling a double-stranded DNA. By exploiting Small-worlds principle we combine two desirable features of networks, namely high clustering and small path length. Large-Scale Distributed Systems VLSNs Divisible Load Scheduling Self-organization Computer Sciences Engineering
7	Projeto de um serviço configurável de detecção de defeitos / Design of a configurable failure detection service Balbinot, Jeysonn Isaac January 2007 (has links) A detecção de defeitos pode ser usada como base no projeto de algoritmos e aplicações distribuídas que dependem, de alguma forma, de informações de estado sobre processos distribuídos. O problema de acordo entre processos (consenso), que é um dos problemas fundamentais da computação distribuída, bem como difusão atômica (atomic broadcast), eleição de líder (leader election) e gerenciamento de grupos (membership) necessitam de informações de estado dos processos envolvidos, portanto, do resultado da atividade dos detectores. Esses protocolos, geralmente, são usados como blocos básicos para a construção de outros algoritmos, serviços ou aplicações distribuídas tolerantes a falhas. Os detectores de defeitos, de forma prática, têm sido desenvolvidos com base em parâmetros funcionais de redes locais e não operam bem no contexto de sistemas distribuídos de larga escala e de redes de longa distância (WANs). Sistemas conectados por WANs, geralmente, oferecem um ambiente mais hostil do que as LANs e clusters, devido aos atrasos longos e variáveis e à maior probabilidade de ocorrência de defeitos de temporização (flutuações na latência de comunicação) e omissão (perdas de mensagens), impondo um desafio na concepção de mecanismos que detectem defeitos de forma completa, precisa e que atendam a requisitos de dependabilidade exigidos pelas aplicações. A detecção de defeitos, também, pode ser oferecida na forma de um serviço, podendo ser este serviço utilizado por diferentes aplicações, sem que estas necessitem agregar a implementação do detector em seus projetos. Neste trabalho, foram pesquisadas estratégias aplicáveis à organização e à comunicação entre módulos de detecção de defeitos, focando sistemas de larga escala que operem sobre WANs. Está sendo proposto um modelo de serviço configurável que opera sob demanda das aplicações, e utiliza uma organização hierárquica dos módulos detectores de defeitos. Com base nesse modelo, foi implementado e testado um protótipo, utilizando o framework de simulação Neko. Os testes avaliaram a utilização da estratégia hierárquica com base no tipo e número de mensagens trocadas pelo serviço durante sua operação. Os resultados mostraram que adotar a hierarquia em dois níveis (LAN e WAN) resulta em poucas mensagens adicionais de controle e significativa redução do número de mensagens trafegando entre redes locais. O serviço tirou proveito do conhecimento da topologia da rede e escalou bem, quando um número maior de máquinas foi utilizado. Adicionalmente, para ajustar dinamicamente a detecção aos atrasos impostos pelas WANs, foi utilizado o pacote de predição de timeout do AFDService. / The failure detection may be used as basis for the design of algorithms and distributed applications that need information about the state of distributed processes. The agreement problem among processes (consensus) is one of the fundamental problems in distributed computing as well as other protocols such as atomic broadcast, leader election and membership that also need information about involved processes and consequently need also the results from the failure detector activity. These protocols are generally used as basic blocks to design other algorithms, services or fault-tolerant distributed applications. The failure detectors, in practice, have been developed based on local network parameters; consequently they are not tuned for the context of large-scale distributed systems nor wide area networks (WANs). Systems interconnected by WANs generally are environments more adverse than LAN and traditional clusters, due to variable and long delays and more prone to timing and omission failures. A natural consequence is that it is challenging to develop mechanisms that can accurately detect failures and give the needed support for dependability requirements of the applications. The failure detection may also be offered as a service for the different applications, which do not need to include their own detectors in their design. In this work are investigated strategies previously defined and applied on the communication of failure detector modules, focusing the analysis on large scale systems on WANs. From this, we propose a configurable failure detection service model that works on demand of applications and adopts the hierarchical organization of failure detection modules. Based on this model, a prototype implementation has been developed and tested using Neko simulation framework. The tests evaluate the utilization of hierarchical strategy based on the type and number of messages exchanged by the service during its operation. The experiments show that the two-level (LAN and WAN) hierarchical structure adopted results in a few additional control messages and a significant reduction on the message traffic between local networks. The service uses the knowledge of the topology and scales well when many machines are used. Additionally, to dynamically adjust the delay imposed by WANs on time detection, the timeout prediction package of AFDService has been used. Redes : Computadores Gerencia : Redes : Computadores Tolerancia : Falhas Fault tolerance Failure detection service WAN Large-scale distributed systems
8	Projeto de um serviço configurável de detecção de defeitos / Design of a configurable failure detection service Balbinot, Jeysonn Isaac January 2007 (has links) A detecção de defeitos pode ser usada como base no projeto de algoritmos e aplicações distribuídas que dependem, de alguma forma, de informações de estado sobre processos distribuídos. O problema de acordo entre processos (consenso), que é um dos problemas fundamentais da computação distribuída, bem como difusão atômica (atomic broadcast), eleição de líder (leader election) e gerenciamento de grupos (membership) necessitam de informações de estado dos processos envolvidos, portanto, do resultado da atividade dos detectores. Esses protocolos, geralmente, são usados como blocos básicos para a construção de outros algoritmos, serviços ou aplicações distribuídas tolerantes a falhas. Os detectores de defeitos, de forma prática, têm sido desenvolvidos com base em parâmetros funcionais de redes locais e não operam bem no contexto de sistemas distribuídos de larga escala e de redes de longa distância (WANs). Sistemas conectados por WANs, geralmente, oferecem um ambiente mais hostil do que as LANs e clusters, devido aos atrasos longos e variáveis e à maior probabilidade de ocorrência de defeitos de temporização (flutuações na latência de comunicação) e omissão (perdas de mensagens), impondo um desafio na concepção de mecanismos que detectem defeitos de forma completa, precisa e que atendam a requisitos de dependabilidade exigidos pelas aplicações. A detecção de defeitos, também, pode ser oferecida na forma de um serviço, podendo ser este serviço utilizado por diferentes aplicações, sem que estas necessitem agregar a implementação do detector em seus projetos. Neste trabalho, foram pesquisadas estratégias aplicáveis à organização e à comunicação entre módulos de detecção de defeitos, focando sistemas de larga escala que operem sobre WANs. Está sendo proposto um modelo de serviço configurável que opera sob demanda das aplicações, e utiliza uma organização hierárquica dos módulos detectores de defeitos. Com base nesse modelo, foi implementado e testado um protótipo, utilizando o framework de simulação Neko. Os testes avaliaram a utilização da estratégia hierárquica com base no tipo e número de mensagens trocadas pelo serviço durante sua operação. Os resultados mostraram que adotar a hierarquia em dois níveis (LAN e WAN) resulta em poucas mensagens adicionais de controle e significativa redução do número de mensagens trafegando entre redes locais. O serviço tirou proveito do conhecimento da topologia da rede e escalou bem, quando um número maior de máquinas foi utilizado. Adicionalmente, para ajustar dinamicamente a detecção aos atrasos impostos pelas WANs, foi utilizado o pacote de predição de timeout do AFDService. / The failure detection may be used as basis for the design of algorithms and distributed applications that need information about the state of distributed processes. The agreement problem among processes (consensus) is one of the fundamental problems in distributed computing as well as other protocols such as atomic broadcast, leader election and membership that also need information about involved processes and consequently need also the results from the failure detector activity. These protocols are generally used as basic blocks to design other algorithms, services or fault-tolerant distributed applications. The failure detectors, in practice, have been developed based on local network parameters; consequently they are not tuned for the context of large-scale distributed systems nor wide area networks (WANs). Systems interconnected by WANs generally are environments more adverse than LAN and traditional clusters, due to variable and long delays and more prone to timing and omission failures. A natural consequence is that it is challenging to develop mechanisms that can accurately detect failures and give the needed support for dependability requirements of the applications. The failure detection may also be offered as a service for the different applications, which do not need to include their own detectors in their design. In this work are investigated strategies previously defined and applied on the communication of failure detector modules, focusing the analysis on large scale systems on WANs. From this, we propose a configurable failure detection service model that works on demand of applications and adopts the hierarchical organization of failure detection modules. Based on this model, a prototype implementation has been developed and tested using Neko simulation framework. The tests evaluate the utilization of hierarchical strategy based on the type and number of messages exchanged by the service during its operation. The experiments show that the two-level (LAN and WAN) hierarchical structure adopted results in a few additional control messages and a significant reduction on the message traffic between local networks. The service uses the knowledge of the topology and scales well when many machines are used. Additionally, to dynamically adjust the delay imposed by WANs on time detection, the timeout prediction package of AFDService has been used. Redes : Computadores Gerencia : Redes : Computadores Tolerancia : Falhas Fault tolerance Failure detection service WAN Large-scale distributed systems
9	Projeto de um serviço configurável de detecção de defeitos / Design of a configurable failure detection service Balbinot, Jeysonn Isaac January 2007 (has links) A detecção de defeitos pode ser usada como base no projeto de algoritmos e aplicações distribuídas que dependem, de alguma forma, de informações de estado sobre processos distribuídos. O problema de acordo entre processos (consenso), que é um dos problemas fundamentais da computação distribuída, bem como difusão atômica (atomic broadcast), eleição de líder (leader election) e gerenciamento de grupos (membership) necessitam de informações de estado dos processos envolvidos, portanto, do resultado da atividade dos detectores. Esses protocolos, geralmente, são usados como blocos básicos para a construção de outros algoritmos, serviços ou aplicações distribuídas tolerantes a falhas. Os detectores de defeitos, de forma prática, têm sido desenvolvidos com base em parâmetros funcionais de redes locais e não operam bem no contexto de sistemas distribuídos de larga escala e de redes de longa distância (WANs). Sistemas conectados por WANs, geralmente, oferecem um ambiente mais hostil do que as LANs e clusters, devido aos atrasos longos e variáveis e à maior probabilidade de ocorrência de defeitos de temporização (flutuações na latência de comunicação) e omissão (perdas de mensagens), impondo um desafio na concepção de mecanismos que detectem defeitos de forma completa, precisa e que atendam a requisitos de dependabilidade exigidos pelas aplicações. A detecção de defeitos, também, pode ser oferecida na forma de um serviço, podendo ser este serviço utilizado por diferentes aplicações, sem que estas necessitem agregar a implementação do detector em seus projetos. Neste trabalho, foram pesquisadas estratégias aplicáveis à organização e à comunicação entre módulos de detecção de defeitos, focando sistemas de larga escala que operem sobre WANs. Está sendo proposto um modelo de serviço configurável que opera sob demanda das aplicações, e utiliza uma organização hierárquica dos módulos detectores de defeitos. Com base nesse modelo, foi implementado e testado um protótipo, utilizando o framework de simulação Neko. Os testes avaliaram a utilização da estratégia hierárquica com base no tipo e número de mensagens trocadas pelo serviço durante sua operação. Os resultados mostraram que adotar a hierarquia em dois níveis (LAN e WAN) resulta em poucas mensagens adicionais de controle e significativa redução do número de mensagens trafegando entre redes locais. O serviço tirou proveito do conhecimento da topologia da rede e escalou bem, quando um número maior de máquinas foi utilizado. Adicionalmente, para ajustar dinamicamente a detecção aos atrasos impostos pelas WANs, foi utilizado o pacote de predição de timeout do AFDService. / The failure detection may be used as basis for the design of algorithms and distributed applications that need information about the state of distributed processes. The agreement problem among processes (consensus) is one of the fundamental problems in distributed computing as well as other protocols such as atomic broadcast, leader election and membership that also need information about involved processes and consequently need also the results from the failure detector activity. These protocols are generally used as basic blocks to design other algorithms, services or fault-tolerant distributed applications. The failure detectors, in practice, have been developed based on local network parameters; consequently they are not tuned for the context of large-scale distributed systems nor wide area networks (WANs). Systems interconnected by WANs generally are environments more adverse than LAN and traditional clusters, due to variable and long delays and more prone to timing and omission failures. A natural consequence is that it is challenging to develop mechanisms that can accurately detect failures and give the needed support for dependability requirements of the applications. The failure detection may also be offered as a service for the different applications, which do not need to include their own detectors in their design. In this work are investigated strategies previously defined and applied on the communication of failure detector modules, focusing the analysis on large scale systems on WANs. From this, we propose a configurable failure detection service model that works on demand of applications and adopts the hierarchical organization of failure detection modules. Based on this model, a prototype implementation has been developed and tested using Neko simulation framework. The tests evaluate the utilization of hierarchical strategy based on the type and number of messages exchanged by the service during its operation. The experiments show that the two-level (LAN and WAN) hierarchical structure adopted results in a few additional control messages and a significant reduction on the message traffic between local networks. The service uses the knowledge of the topology and scales well when many machines are used. Additionally, to dynamically adjust the delay imposed by WANs on time detection, the timeout prediction package of AFDService has been used. Redes : Computadores Gerencia : Redes : Computadores Tolerancia : Falhas Fault tolerance Failure detection service WAN Large-scale distributed systems

Search results