31

Partner-based scheduling and routing for grid workflows

Ashraf, Jawad January 2012 (has links)
The Grid has enabled the scientific community to make faster progress. Scientific experiments and data analyses once spanning several years can now be completed in a matter of hours. With the advancement of technology, the execution of scientific experiments, often represented as workflows, has become more demanding. Thus, there is a vital need for improvements in the scheduling of scientific workflows. Efficient execution of scientific workflows can be achieved by the timely allocation of resources. Advance reservation can ensure the future availability of heterogeneous resources and help a scheduler to produce better schedules. We propose a novel resource mapping technique for jobs of a Grid workflow in an advance reservation environment. Using a dynamic-critical-path-based job selection method, our proposed technique considers the conditional mapping of parent and child jobs to the same resource, trying to minimise the communication duration between jobs and thus optimising the workflow completion time. The proposed method is analysed in both static and dynamic environments, and the simulation results show encouraging performance, especially for workflows where the communication costs are higher than the computation costs. We also propose a hybrid of multiple scheduling heuristics for the aforementioned problem, which chooses the best among multiple schedules computed by different algorithms. Simulation results show a significant improvement over well-known scheduling heuristics in terms of workflow completion time. Considering the advance reservation environment, a better schedule for the earliest completion of a workflow can be achieved if better paths can be found for the transfer of data files between jobs executed on different resources. We propose a K-shortest-path-based routing algorithm for finding good paths in the advance reservation environment. The results show that our proposed algorithm performs very well in terms of the earliest arrival time of the data. Finally, we also study a modified partner-based scheduling heuristic for non-advance reservation environments. The results demonstrate that our proposed algorithm is a promising candidate for adoption in such Grid environments.
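The abstract does not give the thesis's ranking formula, but critical-path-based schedulers of this kind typically order jobs by an "upward rank": a job's computation cost plus the costliest remaining path (computation plus communication) to the exit job. Below is a minimal sketch of that core idea under invented job names and costs; the thesis's method is dynamic (ranks re-evaluated as scheduling proceeds) and adds the parent/child co-mapping, which this static sketch only hints at.

```python
# A minimal sketch of critical-path-based job selection for a workflow DAG.
# A job's upward rank is its computation cost plus the costliest path
# (communication + computation) to the exit job, so the highest-ranked
# unscheduled job lies on the current critical path. Costs are invented.
from functools import lru_cache

comp = {"A": 4, "B": 3, "C": 5, "D": 2}             # job -> computation cost
comm = {("A", "B"): 6, ("A", "C"): 1, ("B", "D"): 2, ("C", "D"): 7}
children = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}

@lru_cache(maxsize=None)
def upward_rank(job):
    # Exit jobs have a rank equal to their own computation cost.
    tail = max((comm[(job, c)] + upward_rank(c) for c in children[job]),
               default=0)
    return comp[job] + tail

# Select jobs in decreasing rank order; a mapper can then try placing a
# parent and its heaviest child on the same resource, removing the
# communication edge between them (the conditional mapping described above).
print(sorted(comp, key=upward_rank, reverse=True))   # ['A', 'C', 'B', 'D']
```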
32

Transparent componentisation : a hybrid approach to support the development of contemporary distributed systems

Lin, Shen January 2010 (has links)
Distributed computing systems are increasingly pervading all aspects of daily life. This rapid growth is characterised by the growing complexity of these systems, which unfolds in three dimensions. First, contemporary distributed systems must often cater for computation nodes with heterogeneous computing and networking capacities; second, they must deal with dynamic changes such as network churns and mobile nodes; and finally, they are often large scale and must be able to grow elastically to meet evolving expectations. This thesis investigates how the above complexity dimensions can be made easier to control by using novel software development approaches and frameworks. In particular, the proposed work seeks to develop approaches that promote three key properties in contemporary distributed systems: 1) configurability to construct customised systems that target heterogeneous operating environments; 2) dynamic adaptability to adapt to dynamic changes; and 3) understandability and simplicity to facilitate software reuse and to hide low-level programming details. To address these issues, this thesis proposes a hybrid software development approach that combines the advantages of component frameworks with those of high-level protocol specification languages. This hybrid approach, termed Transparent Componentisation, automatically maps a high-level protocol specification onto an underlying component framework. It thus allows developers to focus on the programmatic description of a distributed system's behaviour in simple and high-level terms. Meanwhile, it transparently retains the benefits of a component architecture such as component reuse, configurability, and runtime adaptability. As a proof of concept, this thesis presents the WHISPERS/GOSSIPKIT framework for gossip-based distributed systems, a representative subclass of contemporary distributed systems. WHISPERS/GOSSIPKIT is evaluated to demonstrate that it successfully retains the simplicity and understandability of a high-level protocol specification language while encouraging component reuse and supporting transparent (re)configuration thanks to its component underpinnings.
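As a rough illustration of the kind of behaviour a high-level protocol specification captures for gossip-based systems, the sketch below runs push-gossip dissemination rounds in a few lines, while a component framework would supply peer sampling, transport, and (re)configuration underneath. This is a hypothetical example of the protocol class, not GossipKit's actual API or specification language; the fanout and node count are invented.

```python
# A minimal push-gossip dissemination loop: each round, every informed
# node pushes the rumour to FANOUT randomly chosen peers. Illustrative
# only; a real deployment has asynchrony, failures, and churn.
import random

FANOUT = 3

def gossip_round(nodes, informed):
    """One synchronous round: every informed node pushes to FANOUT peers."""
    newly_informed = set()
    for node in informed:
        peers = random.sample([n for n in nodes if n != node],
                              k=min(FANOUT, len(nodes) - 1))
        newly_informed.update(peers)
    return informed | newly_informed

nodes = list(range(100))
informed = {0}                      # node 0 holds the rumour initially
rounds = 0
while len(informed) < len(nodes):
    informed = gossip_round(nodes, informed)
    rounds += 1
print(f"all {len(nodes)} nodes informed after {rounds} rounds")
```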
33

Numerical methods for the efficient and scalable discovery of semantically-described resources on the internet

Hau, Jeffrey January 2007 (has links)
With the advance of the Semantic Web, both the Web and Grid communities have embraced the concept of enriching distributed resources with machine-understandable semantic metadata. Semantic resource discovery is one of the emerging research areas that leverage resource metadata to reason about compatibility and functionality. Resource compatibility can be derived by reasoning about resources' types and relations. The OWL language semantics provides a formal model for description logic reasoning. However, under many usage scenarios the logical inference approach is often too restrictive: many similar resources that are potentially useful are eliminated in the matching process due to their logical non-equivalence. In this thesis we propose two efficient numerical methods for calculating the similarity of OWL-described resources. By viewing OWL descriptions as RDF graphs, we base our first similarity measure on the graph edit distance technique developed for inexact graph matching; the similarity of two graphs is derived from the total cost of the edit operations. The second method transforms semantic descriptions into distance constraints based on their relational structures. By deriving resource coordinates from the distance constraints, the distances between resources become a natural similarity measure. A numerical similarity measure provides a useful, lightweight method to exploit the available semantic metadata. As an increasing number of resources become publicly available on the Internet, the computationally intensive process of logical reasoning often cannot achieve a satisfactory result within a reasonable timeframe. We demonstrate the use of the similarity measures as an alternative and potentially complementary technique to the logical reasoning method.
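As a deliberately simplified illustration of the first measure, the sketch below treats two RDF descriptions as sets of (subject, predicate, object) triples and scores similarity by the number of insert/delete edit operations needed to transform one graph into the other. The thesis's actual cost model is richer (weighted operations, node relabelling), and the resource descriptions here are invented.

```python
# A simplified edit-distance similarity over RDF-style graphs: the edit
# cost is the number of triple insertions/deletions turning one graph
# into the other, normalised into [0, 1].

def edit_similarity(g1, g2):
    insertions = len(g2 - g1)       # triples g1 lacks
    deletions = len(g1 - g2)        # triples g1 must drop
    total = len(g1 | g2)
    if total == 0:
        return 1.0                  # two empty descriptions are identical
    return 1.0 - (insertions + deletions) / total

printer_a = {("svc", "type", "Printer"), ("svc", "speed", "fast"),
             ("svc", "colour", "true")}
printer_b = {("svc", "type", "Printer"), ("svc", "speed", "fast"),
             ("svc", "duplex", "true")}
print(edit_similarity(printer_a, printer_b))   # 0.5: 2 of 4 triples differ
```

Note that logical matching would reject `printer_b` outright as non-equivalent, whereas the numerical measure still reports it as a close candidate.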
34

Context-based anomaly detection in critical infrastructures: a study in distributed systems

McEvoy, Thomas Richard January 2013 (has links)
The modernization of critical infrastructure exposes a large attack surface in a set of systems key to the sustainability of civilization, at a time when targeted malicious attacks are growing in sophistication, particularly with regard to stealth techniques, which are especially difficult to uncover in distributed systems due to multiple possible orderings of state. We argue that by making use of a set of known relationships (which we label a context) between states in disparate parts of a distributed system, and the provision of suitable concurrent (or near-concurrent) observation and comparison mechanisms, we can provide the means to detect such anomalies and locate their source as a precursor to managing outcomes. As a necessary prerequisite to our research, we establish an adversary capability model which allows us to make explicit statements about the feasible actions and subsequent impacts of an adversary and demonstrate the validity of any detective methods. We focus primarily on integrity attacks. The first technique we present is a security protocol, using traceback techniques, which allows us to locate processes which manipulate message content between an operator and a control unit. The second technique allows us to model algebraically the possible sequences of host system states which may be indicative of malicious activity and detect these using a multi-threaded observation mechanism. The third technique provides a process engineering model of a basic non-linear process in a biochemical plant (pasteurization in a brewery) which shows how the provision of even minimal additional sensor information, outside of standard telemetry requirements, can be used to determine a failure in supervisory control due to malicious action. This last technique represents an improvement over previous approaches, which focused on linear or linearized systems. All three techniques pave the way for more sophisticated approaches to real-time detection and management of attacks.
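A minimal sketch of the context idea, under invented values: a known relationship between a controller's commanded valve state and a remotely observed flow rate is checked over near-concurrent observations, and a violation flags a possible integrity attack. This illustrates only the general principle, not the thesis's traceback protocol or algebraic state model.

```python
# An illustrative "context": a known relationship between states observed
# in different parts of a distributed control system. A violation, e.g.
# flow through a valve the controller believes is shut, may indicate
# manipulated telemetry. All values are invented.

def context_holds(valve_cmd, flow_rate):
    """Known relationship: a closed valve implies (near-)zero flow."""
    if valve_cmd == "closed":
        return flow_rate < 0.5      # small tolerance for sensor noise
    return True                     # an open valve constrains nothing here

observations = [
    {"valve_cmd": "open",   "flow_rate": 12.3},
    {"valve_cmd": "closed", "flow_rate": 0.1},
    {"valve_cmd": "closed", "flow_rate": 11.8},   # possible integrity attack
]

for i, obs in enumerate(observations):
    if not context_holds(obs["valve_cmd"], obs["flow_rate"]):
        print(f"anomaly at observation {i}: {obs}")
```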
35

Swarm-array computing : a swarm robotics inspired approach to achieve automated fault tolerance in high-performance computing systems

Varghese, Blesson January 2011 (has links)
Fault tolerance is an important area of research in high-performance computing. Traditional fault-tolerant methods, which require human administrator intervention, suffer from many drawbacks and hence constrain the achievement of efficient fault tolerance in high-performance computing systems. The research presented in this dissertation is directed towards the development of automated fault-tolerant methods for high-performance computing. To this end, four questions are addressed: (1) How can autonomic computing concepts be applied to parallel computing? (2) How can a bridge between multi-agent systems and parallel computing systems be built for achieving fault tolerance? (3) How can processor virtualization for process migration be extended for achieving fault tolerance in parallel computing systems? (4) How can traditional fault-tolerant methods be replaced to achieve efficient fault tolerance in high-performance computing systems? In this dissertation, Swarm-Array Computing, a novel framework inspired by the concept of multi-agents in swarm robotics and built on the foundations of parallel and autonomic computing, is proposed to address these questions. The framework comprises three approaches: firstly, intelligent agents; secondly, intelligent cores; and thirdly, a combination of these, as a means to achieving automated fault tolerance in line with the goals of autonomic computing. The feasibility of the framework is evaluated using simulation and practical experimental studies. The simulation studies were performed by emulating a field-programmable gate array on a multi-agent simulator. The practical studies involved the implementation of a parallel reduction algorithm using message passing interfaces on a computer cluster. The statistics gathered from the experiments confirm that the swarm-array computing approaches improve the fault tolerance of high-performance computing systems over traditional fault-tolerant mechanisms. The agent concepts within the framework are formalised by mapping a layered architecture onto both intelligent agents and intelligent cores. Elements of the work reported in this dissertation have been published as journal and conference papers (Appendix A) and presented as public lectures, conference presentations and posters (Appendix B).
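As a toy sketch of the intelligent-agent approach, with an invented core layout and health model, an agent carrying a task migrates from an unhealthy core to a healthy neighbour without administrator intervention:

```python
# A toy sketch of the intelligent-agent idea: an agent carrying a task
# checks the health of its hosting core and migrates to a healthy
# neighbour when failure is anticipated. Core names, topology and health
# flags are invented for illustration.

cores = {
    "c0": {"healthy": True,  "neighbours": ["c1", "c2"]},
    "c1": {"healthy": False, "neighbours": ["c0", "c3"]},   # failing core
    "c2": {"healthy": True,  "neighbours": ["c0", "c3"]},
    "c3": {"healthy": True,  "neighbours": ["c1", "c2"]},
}

class TaskAgent:
    def __init__(self, task, core):
        self.task, self.core = task, core

    def step(self):
        """Migrate the carried task if the hosting core is unhealthy."""
        if not cores[self.core]["healthy"]:
            for n in cores[self.core]["neighbours"]:
                if cores[n]["healthy"]:
                    print(f"{self.task}: migrating {self.core} -> {n}")
                    self.core = n
                    return
            raise RuntimeError("no healthy neighbour to migrate to")

agent = TaskAgent("reduce-step-7", "c1")
agent.step()                        # reduce-step-7: migrating c1 -> c0
```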
36

Type representations and coordination

Wilkinson, Andrew January 2007 (has links)
Open coordination systems are a means of performing distributed computing where the processes in the system are not known to each other until the system is run. In an ideal world these processes could be written in any language and run on any hardware or operating system. Traditionally, however, systems have limited processes either to a single language or to communicating using only simple types.
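A classic concrete setting for open coordination is a Linda-style tuple space, where processes never address each other directly but exchange tuples matched by type and value templates. The sketch below illustrates that general model only, not the system developed in the thesis; a genuinely open system would also need a cross-language tuple encoding, which is omitted here.

```python
# A minimal Linda-style tuple space with type-based matching: a template
# field that is a type matches any value of that type; any other field
# must match by equality.

class TupleSpace:
    def __init__(self):
        self.tuples = []

    def out(self, tup):
        """Deposit a tuple into the space."""
        self.tuples.append(tup)

    def inp(self, *template):
        """Withdraw and return the first tuple matching the template."""
        for tup in self.tuples:
            if self._matches(tup, template):
                self.tuples.remove(tup)
                return tup
        return None

    @staticmethod
    def _matches(tup, template):
        if len(tup) != len(template):
            return False
        for field, pattern in zip(tup, template):
            if isinstance(pattern, type):
                if not isinstance(field, pattern):
                    return False
            elif field != pattern:
                return False
        return True

space = TupleSpace()
space.out(("result", 3.14))
space.out(("result", 42))
print(space.inp("result", int))     # ('result', 42): matched by field type
```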
37

Extending WS-Agreement to support dynamic service level agreements in grids

Sharaf, Sanaa Abdullah M. January 2012 (has links)
Grid Computing allows users to share resources in both commercial and scientific environments. This dependency on Grid systems has accelerated the need for replacing the "best-effort" approach used in most Grid environments with a more controlled and reliable method of achieving the high levels of Quality of Service (QoS) necessary for potential users. Service Level Agreements (SLAs) are electronic contracts between the service provider and service consumer, which depict the provided service explicitly in terms of the requirements, guarantee terms and the responsibilities of each party. WS-Agreement is a Web Service protocol used to establish an agreement between service providers and service consumers; the definition of the protocol is very general and does not contemplate the possibility of changing an agreement at runtime. Because it is not possible to adapt the terms or the (negotiated) QoS parameters of the agreement to accommodate a new state, the fixed state of the SLA may significantly reduce the reliability and trustworthiness of parties if an unexpected event occurs at runtime. The challenge is to make agreements more long-lived and robust to individual term violations. This research presents extensions of the WS-Agreement specification to support the dynamic nature of SLAs by allowing the possibility of SLA renegotiation at runtime. Modifying the agreement at runtime to accommodate the most recent QoS level required by both parties (service provider and service requester) gives more flexibility and provides better services in optimistic scenarios, and at least prevents violations and SLA failures in pessimistic scenarios. In this research, we have extended the WS-Agreement specification to support SLA renegotiation and make it possible at runtime. The extended WS-Agreement specification with the renegotiation possibility has been implemented and tested in this research. Within this implementation, the concept of renegotiation has been proved through the ability to create more than one SLA at runtime. Moreover, a number of experiments have been designed to calculate the possible profit the service provider can gain in optimistic scenarios, or the savings which can result from rescuing the SLA from violations and paying penalties. A comparison between the static SLA and the new, proposed, dynamic SLA shows an improvement in application performance through lowering the risk and reducing the execution time.
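The sketch below models only the renegotiation idea, in invented terms: rather than violating an agreement whose QoS can no longer be met, the parties supersede it with a new version at runtime. The actual extension operates on WS-Agreement terms and XML, not on this hypothetical object model.

```python
# A sketch of runtime SLA renegotiation: when the delivered QoS drifts,
# both parties replace the active agreement with a new version whose
# terms reflect the current state, keeping the relationship long-lived.
# Term names and values are invented.

class SLA:
    def __init__(self, terms, version=1, supersedes=None):
        self.terms, self.version, self.supersedes = terms, version, supersedes
        self.state = "active"

    def renegotiate(self, new_terms):
        """Both parties agree on new terms; the old SLA is superseded,
        not violated."""
        self.state = "superseded"
        return SLA({**self.terms, **new_terms},
                   version=self.version + 1, supersedes=self)

sla_v1 = SLA({"deadline_s": 3600, "cpu_nodes": 16, "penalty": 100})
# At runtime the provider loses capacity; renegotiate before violating.
sla_v2 = sla_v1.renegotiate({"deadline_s": 4200, "cpu_nodes": 12})
print(sla_v1.state, "->", f"v{sla_v2.version}", sla_v2.terms)
```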
38

Component level risk assessment modelling for grid resources

Sangrasi, Asif January 2012 (has links)
Service level agreements (SLAs), as formal contractual agreements, increase the confidence level between the End User and the Grid Resource provider, as compared to the best-effort approach. However, SLAs fall short of assessing the risk involved in accepting an SLA; risk assessment in Grid computing fills that gap. Risk assessment is crucial for the resource provider, as failing to fulfil an SLA will result in facing a financial penalty. Thus risk, a deterrent to the commercial viability of Grids, needs to be assessed and mitigated to overcome the pitfalls associated with SLAs. The current approaches to assessing and managing risk in Grid computing are a major step towards the provisioning of Quality of Service (QoS) to the end user. However, these approaches are based on node- or machine-level assessment. As a node may contain CPU(s), storage devices, connections for communication and software resources, a node failure may actually be a failure of any of these components. Our approach towards risk assessment is therefore aimed at the granularity of individual components, as compared to the machine level in previous efforts. Moreover, past efforts at node-level risk assessment fall short of considering the nature of the Grid failure data, namely that the failing resources are repairable or replaceable. Thus, to overcome the shortcomings of previous efforts, we propose risk assessment models at the component level that consider the resources as repairable and replaceable. A three-step methodology was utilized in this work, consisting of data analysis, risk modelling and experimentation. A probabilistic model based on series and parallel model(s), which considers Grid resources as replaceable, is proposed at the component level. Similarly, an R-out-of-N model is proposed for the aggregation of risk values across a number of nodes; it provides more detailed results than the parallel model, but with some pitfalls. On the other hand, a risk assessment model at the component level based on the Non-Homogeneous Poisson Process (NHPP) is proposed, considering Grid resources as repairable. Grid failure data is used for the experimentation at the component level. The selection of the proposed NHPP-based Grid risk model is validated using a goodness-of-fit test along with graphical approaches. Similarly, considering Grid resources as repairable, a Semi-Markov-based risk assessment model is also proposed. The Semi-Markov-based risk assessment model provides slight advantages over the NHPP-based model, such as treating repairability extrinsically and assessing the probabilities of repair for individual components within a node. The three proposed risk models are evaluated experimentally and further assessed through a comparative evaluation and performance analysis. The experimental results provide detailed risk assessment information at the component level, which can help the Grid resource provider to manage and use Grid resources efficiently. These results can in turn help enhance commercial viability and QoS provisioning to end users through risk-aware scheduling by the Grid resource provider.
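As a sketch of the repairable-component case, a power-law NHPP with mean function m(t) = a·t^b gives the probability of at least one failure in (t, t+d] as 1 − exp(−(m(t+d) − m(t))); node-level risk then follows from the series arrangement of components (the node fails if any component fails). The parameter values below are invented for illustration and are not the thesis's fitted values.

```python
# Component-level risk under a power-law NHPP (a common repairable-systems
# model) with mean function m(t) = a * t**b, aggregated to node level by
# a series arrangement. Parameters (a, b), component ages and the horizon
# are invented for illustration.
import math

def nhpp_risk(a, b, t, d):
    """P(>= 1 failure in (t, t+d]) for a power-law NHPP component."""
    m = lambda x: a * x ** b
    return 1.0 - math.exp(-(m(t + d) - m(t)))

# One node = CPU, disk, NIC in series, over the next 24h at age 1000h.
components = {"cpu": (0.002, 0.9), "disk": (0.004, 1.1), "nic": (0.001, 1.0)}
p_ok = 1.0
for name, (a, b) in components.items():
    r = nhpp_risk(a, b, t=1000.0, d=24.0)
    print(f"{name}: component risk = {r:.4f}")
    p_ok *= 1.0 - r                 # series: node survives iff all survive
print(f"node risk (series) = {1.0 - p_ok:.4f}")
```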
39

MoDeS : a mobile-agent system for improving availability and data-traffic on pervasive computing

Koukoumpsetsos, Kyriakos January 2004 (has links)
No description available.
40

An organization model to support dynamic cooperative work and workflow

Cheng, Edward Chi-Man January 2004 (has links)
No description available.
