Global ETD Search

81	Escalonamento de tarefas com localidade de dados em grids / Task scheduling with data locality in grids Póvoa, Marcelo Galvão, 1990- 02 April 2015 Advisor: Eduardo Candido Xavier / Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Computação / Made available in DSpace on 2018-08-27. Previous issue date: 2015 / Abstract: Computational systems known as Data Grids provide a flexible, distributed computing infrastructure for processing and storage and have many applications in large-scale computing. Due to the use of great amounts of data, not only efficient task scheduling but also thorough file replication are crucial for achieving the best performance. Both of these problems have already been studied independently in the literature, but we are interested in a combined formulation as a static problem, in order to optimize a single objective function. First, we show that this problem does not admit an approximation algorithm. However, considering a restricted version of the problem, we provide an original constant-ratio approximation algorithm. We also conduct a study of approximation algorithms for related problems available in the literature. On a more practical side, we introduce two novel heuristics for the problem. The first is based on grouping neighboring machines into clusters, while the second tries to identify groups of files frequently accessed together. We compare these algorithms with two approaches adapted from other works in the literature by running computational simulations on an extensive set of instances based on real grids. 
We show that our first heuristic often obtains better solutions than the others with good time efficiency, while the second is even faster and still provides competitive solutions / Master's / Computer Science / Master in Computer Science Approximation algorithms Heuristic algorithms Computational grids (Computer systems)
82	PPerfGrid: A Grid Services-Based Tool for the Exchange of Heterogeneous Parallel Performance Data Hoffman, John Jared 01 January 2004 This thesis details the approach taken in developing PPerfGrid. Section 2 discusses other research related to this project. Section 3 provides general background on the technologies utilized in PPerfGrid, focusing on the components that make up the Grid services architecture. Section 4 provides a description of the architecture of PPerfGrid. Section 5 details the implementation of PPerfGrid. Section 6 presents tests designed to measure the overhead and scalability of the PPerfGrid application. Section 7 suggests future work, and Section 8 concludes the thesis. High performance computing Computational grids (Computer systems) Parallel computers Computer Engineering Computer Sciences
83	Gridfields: Model-Driven Data Transformation in the Physical Sciences Howe, Bill 01 December 2006 Scientists' ability to generate and store simulation results is outpacing their ability to analyze them via ad hoc programs. We observe that these programs exhibit an algebraic structure that can be used to facilitate reasoning and improve performance. In this dissertation, we present a formal data model that exposes this algebraic structure, then implement the model, evaluate it, and use it to express, optimize, and reason about data transformations in a variety of scientific domains. Simulation results are defined over a logical grid structure that allows a continuous domain to be represented discretely in the computer. Existing approaches for manipulating these gridded datasets are incomplete. The performance of SQL queries that manipulate large numeric datasets is not competitive with that of specialized tools, and the up-front effort required to deploy a relational database makes such databases unpopular for dynamic scientific applications. Tools for processing multidimensional arrays can only capture regular, rectilinear grids. Visualization libraries accommodate arbitrary grids, but no algebra has been developed to simplify their use and afford optimization. Further, these libraries are data dependent: physical changes to data characteristics break user programs. We adopt the grid as a first-class citizen, separating topology from geometry and separating structure from data. Our model is agnostic with respect to dimension, uniformly capturing, for example, particle trajectories (1-D), sea-surface temperatures (2-D), and blood flow in the heart (3-D). Equipped with data, a grid becomes a gridfield. We provide operators for constructing, transforming, and aggregating gridfields that admit algebraic laws useful for optimization. We implement the model by analyzing several candidate data structures and incorporating their best features. 
We then show how to deploy gridfields in practice by injecting the model as middleware between heterogeneous, ad hoc file formats and a popular visualization library. In this dissertation, we define, develop, implement, evaluate and deploy a model of gridded datasets that accommodates a variety of complex grid structures and a variety of complex data products. We evaluate the applicability and performance of the model using datasets from oceanography, seismology, and medicine and conclude that our model-driven approach offers significant advantages over the status quo. Computational grids (Computer systems) Multigrid methods (Numerical analysis) Algebra -- Data processing Computer Engineering Computer Sciences
84	Economic scheduling in Grid computing using Tender models Bsoul, Mohammad January 2007 Economic scheduling needs to be considered for Grid computing environments, because it gives resource providers an incentive to supply their resources. Moreover, it enforces efficient use of resources, because users have to pay for their use. Tendering is a suitable model for Grid scheduling because users start the negotiations for finding suitable resources for executing their jobs. Furthermore, users specify their job requirements with their requests, and the resources reply with bids that are based on the cost of taking on the job and the availability of their processors. In this thesis, a framework for economic Grid scheduling using tendering is proposed. The framework entities, such as users, brokers and resources, employ the tender/contract-net model to negotiate prices and deadlines. The brokers' role is to act on behalf of users. During the negotiations, the entities aim to maximise their performance, which is measured by a number of metrics. In order to evaluate the entities' performance under different scenarios, a Java-based simulator called MICOSim, which supports event-driven simulation of economic Grid scheduling, is presented. MICOSim can perform a simulation of more than one hundred entities faster than real time. It is concluded from the evaluation that users who are interested in increasing the job success rate and paying less for executing their jobs have to consider received prices to select the most appropriate bids, while users who are interested in improving the average job satisfaction rate have to consider either received completion time or both price and completion time to select the most suitable bids when the submission of jobs is static. The best broker strategy is the one that doesn't take into account meeting the job deadlines in the bids it sends to job owners. 
Finally, the resource strategy that considers the price to determine whether to reply to a request is superior to other resource strategies. The only exception is employing this strategy with a price that is too low. However, there is only a tiny difference between the performances of different user strategies in dynamic submission. It is also concluded from the evaluation that broker strategies have the best performance when the revenue they target from the users is reasonable. Thus, the broker's aim has to be receiving reasonable revenue (neither too low nor too high) from acting on behalf of users. It is observed from the results that strategy performance is influenced by the behaviour of other entities, such as the submission time of user jobs. Finally, it is observed that the characteristics of entities have an effect on the performance of strategies. For example, the two user strategies that consider the received completion time and both price and completion time to determine whether to accept a broker bid have similar performance, because there are resources with various prices, from cheap to expensive, and resources which don't care about the price paid for the execution. So, the price threshold doesn't have a large effect on the performance. 004.36
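The bid-selection strategies described in the entry above (choosing by received price versus by received completion time) can be illustrated with a minimal sketch. This is not MICOSim code; the `Bid` class, the `select_bid` function, and the budget and deadline parameters are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    """A hypothetical bid returned by a resource (or broker) for one job."""
    resource: str
    price: float            # cost quoted for executing the job
    completion_time: float  # promised completion time


def select_bid(bids, budget, deadline, strategy="price"):
    """Return the preferred feasible bid, or None if no bid is acceptable.

    strategy="price" prefers the cheapest bid; strategy="time" prefers the
    earliest completion time. Ties fall back to the other criterion.
    """
    feasible = [b for b in bids
                if b.price <= budget and b.completion_time <= deadline]
    if not feasible:
        return None
    if strategy == "price":
        return min(feasible, key=lambda b: (b.price, b.completion_time))
    if strategy == "time":
        return min(feasible, key=lambda b: (b.completion_time, b.price))
    raise ValueError(f"unknown strategy: {strategy}")


# Illustrative data only; not taken from the thesis.
bids = [Bid("r1", 10.0, 50.0), Bid("r2", 6.0, 90.0), Bid("r3", 8.0, 40.0)]
cheapest = select_bid(bids, budget=12.0, deadline=100.0, strategy="price")
fastest = select_bid(bids, budget=12.0, deadline=100.0, strategy="time")
```

With these sample bids, the price-driven user picks `r2` and the time-driven user picks `r3`, which is the kind of divergence between user strategies that the abstract evaluates.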
85	Armazenamento distribuído de dados e checkpointing de aplicações paralelas em grades oportunistas / Distributed data storage and checkpointing of parallel applications in opportunistic grids Camargo, Raphael Yokoingawa de 04 May 2007 Opportunistic computational grids use idle resources from shared machines to execute applications that need large amounts of computational power and/or deal with large amounts of data. But executing computationally intensive parallel applications in dynamic and heterogeneous environments, such as opportunistic grids, is a daunting task. Machines may fail, become inaccessible, or change from idle to occupied unexpectedly, compromising the application execution. A fault tolerance mechanism that supports heterogeneous architectures is an important requirement for such systems. In this work, we analyze, implement and evaluate a checkpointing-based fault tolerance mechanism for parallel applications running on opportunistic grids. The mechanism monitors application execution and allows the migration of applications between heterogeneous nodes of the grid. But besides application execution, it is necessary to manage and store the data generated and used by those applications. We want a low-cost data storage infrastructure that utilizes the unused disk space of shared grid machines. The system should use the machines to store and recover data only during their idle periods, which requires the system to be redundant and fault-tolerant. To solve the data storage problem in opportunistic grids, we designed, implemented and evaluated the OppStore middleware. 
This middleware provides reliable distributed storage for application data, which can be accessed from any machine in the grid. The machines are organized in clusters, connected by a self-organizing and fault-tolerant peer-to-peer network. Before storage, data is encoded into redundant fragments, allowing the reconstruction of the original file using only a subset of those fragments. Finally, to deal with resource heterogeneity, we developed an extension to the Pastry peer-to-peer routing substrate that adds load balancing and support for machine heterogeneity to message routing. BSP checkpointing computational grids distributed data storage fault-tolerance grid computing peer-to-peer
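The idea of encoding data into redundant fragments so that a file can be rebuilt from a subset of them can be sketched with the simplest possible scheme: k data fragments plus one XOR parity fragment, which tolerates the loss of any single fragment. This is a deliberately simple stand-in chosen for illustration and is not claimed to be the coding scheme used by OppStore; `encode` and `decode` are hypothetical names.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int):
    """Split data into k equal fragments and append one XOR parity fragment.

    Returns (fragments, original_length); the length is needed to strip
    the zero padding again when decoding.
    """
    size = -(-len(data) // k)                # ceiling division
    padded = data.ljust(size * k, b"\0")     # pad so fragments are equal
    frags = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, frags)
    return frags + [parity], len(data)


def decode(fragments, length):
    """Rebuild the data from k+1 fragments, at most one of which is None."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity can recover only one lost fragment")
    fragments = list(fragments)
    if missing:
        present = [f for f in fragments if f is not None]
        # XOR of all surviving fragments equals the lost one.
        fragments[missing[0]] = reduce(xor_bytes, present)
    return b"".join(fragments[:-1])[:length]


fragments, length = encode(b"opportunistic grid", 4)
damaged = list(fragments)
damaged[2] = None                            # simulate a failed machine
restored = decode(damaged, length)
```

Schemes that survive the loss of several fragments need stronger erasure codes than a single parity block; the sketch only shows the reconstruct-from-a-subset principle.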
86	Patterns for web services standards Unknown Date Web services intend to provide an application integration technology that can be successfully used over the Internet in a secure, interoperable and trusted manner. Policies are high-level guidelines defining the way an institution conducts its activities. The WS-Policy standard describes how to apply policies of security definition, enforcement of access control, authentication and logging. WS-Trust defines a security token service and a trust engine which are used by web services to authenticate other web services. Using the functions defined in WS-Trust, applications can engage in secure communication after establishing trust. BPEL is a language for web service composition that intends to provide convenient and effective means for application integration over the Internet. We address security considerations in BPEL and how to enforce them, as well as its interactions with other web services standards such as WS-Security and WS-Policy. / by Ola Ajaj. / Thesis (M.S.C.S.)--Florida Atlantic University, 2010. / Includes bibliography. / Electronic reproduction. Boca Raton, Fla., 2010. Mode of access: World Wide Web. Computational grids (Computer systems) Computer systems--Verification Expert systems (Computer science) Computer network architectures Web servers--Management Electronic commerce--Computer programs
87	ALiCE: A Java-based Grid Computing System Teo, Yong Meng 01 1900 A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities. This talk is divided into three parts. Firstly, we give an overview of the main issues in grid computing. Next, we introduce ALiCE (Adaptive and Scalable Internet-based Computing Engine), a platform-independent and lightweight grid. ALiCE exploits object-level parallelism using our Object Network Transport Architecture (ONTA). Grid applications are written using the ALiCE Object Programming Template, which hides the complexities of the underlying grid fabric. Lastly, we present some performance results of ALiCE applications, including the geo-rectification of satellite images and the progressive multiple sequence alignment problem. / Singapore-MIT Alliance (SMA) computational grids ALiCE object-level parallelism Object Network Transport Architecture ONTA geo-rectification of satellite images
88	Performance Modeling Based Scheduling And Rescheduling Of Parallel Applications On Computational Grids Sanjay, H A 10 1900 As computational grids have become popular and ubiquitous, users have access to a large number of geographically distributed grid resources of different types. Many computational grid frameworks are composed of multiple distributed sites, with each site consisting of one or more dedicated or non-dedicated clusters. Jobs submitted to a grid are handled by a metascheduler, which interacts with the local schedulers of the clusters to schedule jobs to the individual clusters. Computational grids have been found to be powerful research beds for the execution of various kinds of parallel applications. When a parallel application is submitted to a grid, the metascheduler has to choose a set of resources from a cluster for application execution. To select the best set of resources for application execution, it is important to determine the performance of the application. Accurate performance estimates of an application are essential in assisting a grid metascheduler to schedule user jobs efficiently. Thus, models that predict execution times of parallel applications on a set of resources, and a search procedure (scheduling strategy) that selects the best set of machines within a cluster for application execution, are important for enabling parallel applications on grids. For efficient execution of large scientific parallel applications consisting of multiple phases, performance models of the individual phases should be obtained. Efficient rescheduling strategies that can use the per-phase models to adapt the parallel applications to application and resource dynamics are necessary for maintaining high performance of the applications on grids. 
A practical and robust grid computing infrastructure that integrates components related to application and resource monitoring, performance modeling, scheduling and rescheduling techniques, is highly essential for large-scale deployment and high performance of scientific applications on grid systems and hence for fostering high performance computing. This thesis focuses on developing performance models for predicting execution times of parallel problems/subproblems on dedicated and non-dedicated grid resources. The thesis also constructs robust scheduling and rescheduling strategies in a grid metascheduler that can use the performance models for efficient execution of large scientific parallel applications on dynamic grids. Finally, the thesis builds a practical and robust grid middleware infrastructure which integrates components related to performance modeling, scheduling and rescheduling, monitoring and migration frameworks for large-scale deployment and use of high performance applications on grids. The thesis consists of four main components. In the first part of the thesis, we have developed a comprehensive set of performance modeling strategies to predict the execution times of tightly-coupled parallel applications on a set of resources in a dedicated or non-dedicated cluster. The main purpose of our prediction strategies is to aid grid metaschedulers in making scheduling decisions. Our performance modeling strategies, based on linear regression, can deal with non-dedicated systems where the loads can change during application executions. Our models do not require detailed knowledge and instrumentation of the applications and can be constructed without the involvement of application developers. The strategies are intended for rapid and large scale deployment of parallel applications on non-dedicated grid systems. We have evaluated our strategies on 8, 16, 24 and 32-node clusters with random loads and load traces from a grid system. 
Our performance modeling strategies gave average percentage prediction errors of less than 30% in all cases, which is reasonable for non-dedicated systems. We also found that scheduling based on the predictions of our strategies results in perfect scheduling in many cases. For modeling large-scale scientific applications, we use execution profiles, automatic program analysis, and manual analysis of significant portions of the application's code to identify the different phases of applications. We then adopt our performance modeling strategies to predict execution times for the different phases of the tightly-coupled parallel applications on a set of resources in a dedicated or non-dedicated cluster. Our experiments show that using combinations of performance models of the phases gives 18% to 70% more accurate predictions than using single performance models for the applications. In the second part of the thesis, we have devised, evaluated and compared algorithms for scheduling tightly-coupled parallel applications on multi-cluster grids. Our algorithms use performance models that predict the execution times of parallel applications to evaluate candidate schedules. In this work, we propose a novel algorithm called Box Elimination (BE) that searches a space of performance model parameters to determine efficient schedules. By eliminating large search space regions containing poorer solutions at each step and searching for high-quality solutions, our algorithm is able to generate efficient schedules within a few seconds, even for clusters of 512 processors. By means of a large number of real and simulation experiments, we compared our algorithm with popular optimization techniques. We show that our algorithm generates up to 80% more efficient schedules than other algorithms and that the resulting execution times are more robust against performance modeling errors. 
The third part of the thesis deals with policies for rescheduling long-running multi-phase parallel applications in response to application and resource dynamics. In this work, we use our performance modeling and scheduling strategies to derive rescheduling plans for executing multi-phase parallel applications on grids. A rescheduling plan consists of potential points in application execution for rescheduling and schedules of resources for application execution between two consecutive rescheduling points. We have developed three algorithms, namely an incremental algorithm, a divide-and-conquer algorithm and a genetic algorithm, for deriving a rescheduling plan for a parallel application execution. We have also developed an algorithm that uses rescheduling plans derived on different clusters to form a single coherent rescheduling plan for application execution on a grid consisting of multiple clusters. The rescheduling plans generated by our algorithms are highly efficient, leading to application execution times that exceed those of the brute-force method by less than 10%. We also find that rescheduling in response to changing application and resource dynamics, using the rescheduling plans for multi-cluster grids generated by our algorithms, gives much lower execution times than executing the applications on a single schedule throughout application execution. In the final part of the thesis, we have developed a practical grid middleware framework called MerITA (Middleware for Performance Improvement of Tightly Coupled Parallel Applications on Grids), a system for effective execution of tightly-coupled parallel applications on multi-cluster grids consisting of dedicated or non-dedicated, interactive or batch systems. 
The framework brings together performance modeling for automatically determining the characteristics of parallel applications, scheduling strategies that use the performance models for efficient mapping of applications to resources, rescheduling policies for determining the points in application execution when executing applications can be rescheduled to different sets of resources to obtain performance improvement and a check-pointing library for enabling rescheduling. Computational Grids Performance Modeling Scheduling Rescheduling Grid Computing Scheduling Algorithms Rescheduling Algorithms Grid Scheduling Grids Tightly-Coupled Parallel Applications Computer Science
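The entry above mentions execution-time models based on linear regression and reports accuracy as an average percentage prediction error. A minimal sketch of both ideas follows, assuming a single hypothetical feature (for example, problem size divided by processor count); the thesis's actual model variables are not given in the abstract, and `fit_line`, `predict` and `avg_pct_error` are names introduced here for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(model, x):
    """Predicted execution time for feature value x."""
    intercept, slope = model
    return intercept + slope * x


def avg_pct_error(predicted, actual):
    """Average percentage prediction error over a set of runs."""
    errors = [abs(p - a) / a for p, a in zip(predicted, actual)]
    return 100.0 * sum(errors) / len(errors)


# Illustrative measurements only; not data from the thesis.
work_per_proc = [1.0, 2.0, 3.0, 4.0]   # hypothetical feature values
exec_times = [5.0, 7.0, 9.0, 11.0]     # observed execution times (s)
model = fit_line(work_per_proc, exec_times)
estimate = predict(model, 5.0)
error = avg_pct_error([estimate], [10.0])
```

A metascheduler could evaluate such a model for each candidate resource set and prefer the set with the lowest predicted time; handling load changes on non-dedicated machines would need additional inputs that this sketch omits.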
89	Three essays on the interface of computer science, economics and information systems Hidvégi, Zoltán Tibor, 1970- 28 August 2008 This thesis looks at three aspects related to the design of e-commerce systems, online auctions and distributed grid computing systems. We show how formal verification techniques from computer science can be applied to ensure correctness of system design and implementation at the code level. Through an e-ticket sales example, we demonstrate that model checking can locate subtle but critical flaws that traditional control and auditing methods (e.g., penetration testing, analytical procedures) would most likely miss. Auditors should understand formal verification methods, require engineering teams to use them to create designs with less chance of failure, and even practice formal verification themselves in order to offer credible control and assistance for critical e-systems. Next, we study why many online auctions offer fixed buy prices, in order to understand why sellers and auctioneers voluntarily limit the surplus they can get from an auction. We show that when either the seller or the bidders are risk-averse, a properly chosen fixed permanent buy price can increase the social surplus and does not decrease the expected utility of the sellers and bidders, and we characterize the unique equilibrium strategies of uniformly risk-averse buyers in a buy-price auction. In the final chapter we look at the design of a distributed grid-computing system. We show how code instrumentation can be used to generate a witness of program execution, and show how this witness can be used to audit the work of self-interested grid agents. Using a trusted intermediary between grid providers and customers, the audit allows payment to be contingent on successful audit results, and it creates a verified reputation history of grid providers. 
We show that enabling the free trade of reputations provides economic incentives for agents to perform the computations assigned, and that it induces increasing effort levels as the agents' reputation increases. We show that in such a reputation market only high-type agents would have an incentive to purchase a high reputation, and only low-type agents would use low reputations; thus the market works as a natural signaling mechanism about the agents' type. Electronic commerce Internet auctions--Mathematical models Prices--Mathematical models Computer software--Verification Computer programs--Verification Coding theory Computational grids (Computer systems)
90	An integrated methodology for creating composed Web/grid services Tan, Koon Leai Larry January 2009 This thesis presents an approach to design, specify, validate, verify, implement, and evaluate composed web/grid services. Web and grid services can be composed to create new services with complex behaviours. The BPEL (Business Process Execution Language) standard was created to enable the orchestration of web services, but there have also been investigations of its use for grid services. BPEL specifies the implementation of service composition but has no formal semantics; implementations are in practice checked by testing. Formal methods are used in general to define an abstract model of system behaviour that allows simulation and reasoning about properties. The approach can detect and reduce potentially costly errors at design time. CRESS (Communication Representation Employing Systematic Specification) is a domain-independent, graphical, abstract notation and integrated toolset for developing composite web services. The original version of CRESS had automated support for formal specification in LOTOS (Language Of Temporal Ordering Specification), for executing formal validation with MUSTARD (Multiple-Use Scenario Testing and Refusal Description), and for implementation in BPEL4WS, an early version of the BPEL standard. This thesis work has extended CRESS and its integrated tools to design, specify, validate, verify, implement, and evaluate composed web/grid services. The work has extended the CRESS notation to support a wider range of service compositions, and has applied it to grid services as a new domain. The thesis presents two new tools, CLOVE (CRESS Language-Oriented Verification Environment) and MINT (MUSTARD Interpreter), to support formal verification and implementation testing, respectively. New work has also extended CRESS to automate implementation of composed services using the more recent BPEL standard WS-BPEL 2.0. 006.7
