61 |
On the Performance Analysis of Large Scale, Dynamic, Distributed and Parallel Systems. Ardelius, John. January 2013 (has links)
Evaluating the performance of large distributed applications is an important and non-trivial task. With the onset of Internet-wide applications there is an increasing need to quantify the reliability, dependability and performance of these systems, both as a guide in system design and as a means to understand the fundamental properties of large-scale distributed systems. Previous research has mainly focused either on formalised models, where system properties can be deduced and verified using rigorous mathematics, or on measurements and experiments on deployed applications. Our aim in this thesis is to study models at an abstraction level lying between the two ends of this spectrum. We adopt a model of distributed systems inspired by methods used in the study of large-scale particle systems in physics, and model the application nodes as a set of interacting particles, each with an internal state, whose actions are specified by the application program. We apply our modelling and performance evaluation methodology to four different distributed and parallel systems.

The first system is the distributed hash table (DHT) Chord running in a dynamic environment. We study the system under two scenarios. First, we study how performance (in terms of lookup latency) is affected on a network with finite communication latency. We show that an average delay, in conjunction with other parameters describing changes in the network (such as timescales for network repair and join and leave processes), induces fundamentally different system performance. We also verify our analytical predictions via simulations. In the second scenario we introduce network address translators (NATs) to the network model. This makes the overlay topology non-transitive, and we explore the implications of this fact for various performance metrics such as lookup latency, consistency and load balance. The latter analysis is mainly simulation based. Even though these two studies focus on a specific DHT, many of our results can easily be translated to other similar ring-based DHTs with long-range links, and the same methodology can be applied even to DHTs based on other geometries.

The second type of system studied is an unstructured gossip protocol running a distributed version of the famous Bellman-Ford algorithm. The algorithm, called GAP, generates a spanning tree over the participating nodes, and the question we set out to study is how reliable this structure is (in terms of generating accurate aggregate values at the root) in the presence of node churn. All our analytical results are also verified using simulations.

The third system studied is a content distribution network (CDN) of interconnected caches in an aggregation access network. In this model, content, which sits at the leaves of the cache hierarchy tree, is requested by end users. Requests can then either be served by the first cache level or sent further up the tree. We study the performance of the whole system under two cache eviction policies, namely LRU and LFU. We compare our analytical results with traces from related caching systems.

The last system is a work stealing heuristic for task distribution on the Tilera TILEPro64 chip. This system has access to a shared memory and is therefore classified as a parallel system. We create a model for the dynamic generation of tasks as well as how they are executed and distributed among the participating nodes. We study how the heuristic scales when the number of nodes exceeds the number of processors on the chip, as well as how different work stealing policies compare with each other. The work on this model is mainly simulation-based.
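The LRU and LFU eviction policies compared in the cache study can be illustrated with a minimal sketch. This shows the two policies in general, not the thesis's analytical cache model; the capacity and the skewed request stream below are invented for the example.

```python
from collections import Counter, OrderedDict

class LRUCache:
    """Evicts the least recently used item when capacity is exceeded."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def request(self, key):
        hit = key in self.store
        if hit:
            self.store.move_to_end(key)          # refresh recency on a hit
        else:
            if len(self.store) >= self.capacity:
                self.store.popitem(last=False)   # evict the oldest entry
            self.store[key] = True
        return hit

class LFUCache:
    """Evicts the least frequently used item when capacity is exceeded."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = set()
        self.freq = Counter()

    def request(self, key):
        self.freq[key] += 1
        hit = key in self.store
        if not hit:
            if len(self.store) >= self.capacity:
                victim = min(self.store, key=lambda k: self.freq[k])
                self.store.discard(victim)       # drop the coldest entry
            self.store.add(key)
        return hit

# A skewed stream: 'a' is popular, the other items are transient.
stream = ['a', 'a', 'b', 'c', 'a', 'b', 'c', 'a']
lru, lfu = LRUCache(2), LFUCache(2)
lru_hits = sum(lru.request(k) for k in stream)
lfu_hits = sum(lfu.request(k) for k in stream)
```

On this stream LFU keeps the popular item cached while LRU repeatedly evicts it, which is the qualitative difference the analysis quantifies.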
|
62 |
Multicast communication in distributed systems with dynamic groups. Belkeir, Nasr Eddine. January 1991 (has links)
No description available.
|
63 |
Classification of complex two-dimensional images in a parallel distributed processing architecture. Simpson, Robert Gilmour. January 1992 (has links)
Neural network analysis is proposed and evaluated as a method of analysis of marine biological data, specifically images of plankton specimens. The quantification of the various plankton species is of great scientific importance, from modelling global climatic change to predicting the economic effects of toxic red tides. A preliminary evaluation of the neural network technique is made by the development of a back-propagation system that successfully learns to distinguish between two co-occurring morphologically similar species from the North Atlantic Ocean, namely Ceratium arcticum and C. longipes. Various techniques are developed to handle the indeterminately labelled source data, pre-process the images and successfully train the networks. An analysis of the network solutions is made, and some consideration given to how the system might be extended.
|
64 |
Fault-tolerant group communication protocols for asynchronous systems. Macedo, Raimundo Jose de Araujo. January 1994 (has links)
It is widely accepted that group communication (multicast) is a powerful abstraction that can be used whenever a collection of distributed processes cooperate to achieve a common goal such as load-sharing or fault-tolerance. Due to the uncertainties inherent to distributed systems (emerging from communication and/or process failures), group communication protocols have to face situations where, for instance, a sender process fails when a multicast is underway or where messages from different senders arrive in an inconsistent order at different destination processes. Further complications arise if processes belong to multiple groups. In this thesis, we make use of logical clocks [Lamport78] to develop the concept of Causal Blocks. We show that Causal Blocks provide a concise method for deducing ordering relationships between messages exchanged by processes of a group, resulting in simple methods for dealing with multiple groups. Based on the Causal Blocks representation, we present a protocol for total order message delivery which has constant and low message space overhead (i.e. the protocol-related information contained in a multicast message is small). We also present causal order protocols with different trade-offs between message space overhead and speed of message delivery. Furthermore, we show how the Causal Blocks representation can be used to easily deduce and maintain reliability information. Our protocols are fault-tolerant: ordering and liveness are preserved even if group membership changes occur (due to failures such as process crashes or network partitions). The total order protocol, together with a novel flow control mechanism, has been implemented over a set of networked Unix workstations, and experiments have been carried out to analyse its performance in varied group configurations.
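Causal Blocks are built on Lamport's logical clocks [Lamport78]. A minimal sketch of the clock rules themselves (not of the Causal Blocks construction): each process keeps a counter, advances it on local events and sends, and on receipt merges it with the sender's timestamp, so that causally related events get increasing timestamps.

```python
class LamportClock:
    """Lamport's logical clock: advanced on local events and sends,
    merged on receipt, so causally related events are ordered."""
    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or message send: advance the counter."""
        self.time += 1
        return self.time

    def receive(self, sender_time):
        """Message receipt: merge with the sender's timestamp."""
        self.time = max(self.time, sender_time) + 1
        return self.time

p, q = LamportClock(), LamportClock()
t_send = p.tick()            # p multicasts, stamping the message
t_recv = q.receive(t_send)   # q's clock jumps past the sender's
```

The merge rule guarantees that a send always carries a smaller timestamp than its receipt, which is the ordering property the thesis's protocols exploit.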
|
65 |
Naming issues in the design of transparently distributed operating systems. Stroud, Robert James. January 1987 (has links)
Naming is of fundamental importance in the design of transparently distributed operating systems. A transparently distributed operating system should be functionally equivalent to the systems of which it is composed. In particular, the names of remote objects should be indistinguishable from the names of local objects. In this thesis we explore the implications that this recursive notion of transparency has for the naming mechanisms provided by an operating system. In particular, we show that a recursive naming system is more readily extensible than a flat naming system by demonstrating that it is in precisely those areas in which a system is not recursive that transparency is hardest to achieve. However, this is not so much a problem of distribution as a problem of scale. A system which does not scale well internally will not extend well to a distributed system. Building a distributed system out of existing systems involves joining the name spaces of the individual systems together. When combining name spaces it is important to preserve the identity of individual objects. Although unique identifiers may be used to distinguish objects within a single name space, we argue that it is difficult if not impossible in practice to guarantee the uniqueness of such identifiers between name spaces. Instead, we explore the possibility of using hierarchical identifiers, unique only within a localised context. However, we show that such identifiers cannot be used in an arbitrary naming graph without compromising the notion of identity and hence violating the semantics of the underlying system. The only alternative is to sacrifice a deterministic notion of identity by using random identifiers to approximate global uniqueness with a known probability of failure (which can be made arbitrarily small if the overall size of the system is known in advance).
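The closing claim, that random identifiers approximate global uniqueness with a known failure probability, can be made concrete with the standard birthday approximation. The identifier widths and population size below are illustrative values, not figures from the thesis.

```python
import math

def collision_probability(bits, n):
    """Approximate probability that n identifiers drawn uniformly from a
    2**bits space contain at least one collision (birthday bound):
    P(no collision) is roughly exp(-n(n-1) / (2 * 2**bits))."""
    space = 2.0 ** bits
    return 1.0 - math.exp(-n * (n - 1) / (2.0 * space))

# With 64-bit identifiers, a million objects carry negligible risk ...
p64 = collision_probability(64, 10**6)
# ... while 32-bit identifiers make a collision almost certain.
p32 = collision_probability(32, 10**6)
```

This is why the failure probability can be driven arbitrarily low once the system's overall size is known: widening the identifier shrinks the exponent's denominator exponentially.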
|
66 |
Constructing reliable distributed applications using actions and objects. Wheater, Stuart Mark. January 1989 (has links)
A computation model for distributed systems which has found widespread acceptance is that of atomic actions (atomic transactions) controlling operations on persistent objects. Much current research work is oriented towards the design and implementation of distributed systems supporting such an object and action model. However, little work has been done to investigate the suitability of such a model for building reliable distributed systems. Atomic actions have many properties which are desirable when constructing reliable distributed applications, but these same properties can also prove to be obstructive. This thesis examines the suitability of atomic actions for building reliable distributed applications. Several new structuring techniques are proposed, providing more flexibility than hitherto possible for building a large class of applications. The proposed new structuring techniques are: Serialising Actions, Top-Level Independent Actions, N-Level Independent Actions, Common Actions and Glued Actions. A new generic form of action, the Coloured Action, is also proposed, which provides more control over concurrency and recovery than traditional actions. It will be shown that Coloured Actions provide a uniform mechanism for implementing most of the new structuring techniques, and at the same time are no harder to implement than normal actions. Thus this proposal is of practical importance. The suitability of the new structuring techniques will be demonstrated by considering a number of applications. It will be shown that the proposed techniques provide natural tools for composing distributed applications.
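The failure-atomicity property that these structuring techniques build on can be sketched minimally as snapshot and rollback: changes made inside an action are discarded on abort and kept only on commit. This illustrates plain atomic actions only, not the Serialising, Independent, Common, Glued or Coloured Actions proposed here; the bank-balance store is invented for the example.

```python
class AtomicAction:
    """Minimal sketch of an atomic action over a dictionary-backed store:
    state changes survive only if the action commits."""
    def __init__(self, store):
        self.store = store
        self.snapshot = None

    def begin(self):
        self.snapshot = dict(self.store)   # establish a recovery point

    def commit(self):
        self.snapshot = None               # changes become permanent

    def abort(self):
        self.store.clear()
        self.store.update(self.snapshot)   # roll back to the recovery point

store = {'balance': 100}
action = AtomicAction(store)

action.begin()
store['balance'] -= 30
action.abort()                    # failure: the update is undone
aborted_balance = store['balance']

action.begin()
store['balance'] -= 30
action.commit()                   # success: the update persists
committed_balance = store['balance']
```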
|
67 |
Object Management for Persistence and Recoverability. Dixon, Graeme N. January 1988 (has links)
As distribution becomes commonplace, there is a growing requirement for applications that behave reliably when node or network failures occur. To support reliability, operations on the components of a distributed application may be declared to occur within the scope of an atomic action. This thesis describes how atomic actions may be supported in an environment consisting of applications that operate on objects. To support the failure atomicity and permanence of effect properties of an atomic action, the objects accessed within the scope of an atomic action must be recoverable and persistent. This thesis describes how these properties may be added to the class of an object. The approach adopted is to provide a class that implements recovery and persistence mechanisms, and derive new classes from this base class. By refining inherited operations so that recovery and persistence is specific to that class, recoverable and persistent objects may be easily produced. This thesis also describes how an atomic action may be implemented as a class, so that instances of the class are atomic actions which manage the recoverable and persistent objects. Multiple instance declarations produce nested atomic actions, and the atomic action class also inherits persistence, so that short-term commit information may be saved in an object store which is used to maintain the passive state of persistent objects. Since the mechanisms and classes that support recovery, persistence, and atomic actions are constructed using the features of an object-oriented language, they may be implemented in environments that provide suitable support for objects and object-oriented programming languages.
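The inheritance approach described, a base class supplying persistence that derived classes refine, can be sketched as follows. The class names and the dictionary standing in for an object store are invented for illustration; they are not the thesis's actual classes.

```python
import pickle

class PersistentObject:
    """Base class providing save/restore of an object's passive state;
    derived classes inherit persistence without reimplementing it."""
    def save(self, store, key):
        store[key] = pickle.dumps(self.__dict__)    # record passive state

    def restore(self, store, key):
        self.__dict__.update(pickle.loads(store[key]))

class Account(PersistentObject):
    """Application class that becomes persistent purely by inheritance."""
    def __init__(self, balance=0):
        self.balance = balance

object_store = {}
acct = Account(250)
acct.save(object_store, 'acct-1')     # checkpoint the passive state
acct.balance = 0                      # a volatile update is then lost ...
acct.restore(object_store, 'acct-1')  # ... and recovered from the store
```

A derived class that needs finer control would override `save`/`restore`, which is the refinement of inherited operations the abstract describes.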
|
68 |
Impact of Distributed Generation on Distribution Feeder Protection. Chang, Tim. 15 December 2010 (has links)
Standard overcurrent protection schemes for passive radial systems assume single direction current flow. The addition of distributed generation (DG) presents issues for the protection scheme, as current can flow from multiple directions. This thesis investigates the impact of DGs on overcurrent protection designed for a radial system, and proposes solutions to address the issues. A realistic feeder system and its protection scheme are developed in PSCAD/EMTDC. A point-of-common-coupling (PCC) is identified, indicating the portion of feeder that can potentially operate as an island. One DG, with output adjusted to maintain a specified power flow at the PCC, is added to the feeder system. The performance of the overcurrent protection in the presence of line, ground, and three-phase faults is analyzed. A second machine, outputting full capacity at unity power factor, is added to the feeder system. The strategies used to develop the single-DG modified protection scheme are applied to the two-DG system. The functionality of the modified protection scheme is verified.
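The core protection issue, that a magnitude-only overcurrent relay cannot tell a forward fault from reverse current fed by a downstream DG, can be sketched with a toy decision function. The directional element shown is one textbook remedy, not necessarily the modification this thesis develops, and all current values are invented.

```python
def relay_should_trip(current_a, pickup_a, forward=True, directional=False):
    """Toy overcurrent relay decision. A plain relay trips on current
    magnitude alone; a directional relay additionally requires the fault
    to lie in its forward-looking direction."""
    overcurrent = abs(current_a) >= pickup_a
    if not directional:
        return overcurrent
    return overcurrent and forward

# Reverse fault current fed through the relay by a downstream DG:
# a magnitude-only relay trips incorrectly, a directional one holds.
nondirectional_trip = relay_should_trip(1200, 800, forward=False)
directional_trip = relay_should_trip(1200, 800, forward=False,
                                     directional=True)
```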
|
70 |
Simulating multiple systems of systems using the High Level Architecture. Cramp, Anthony. January 2009 (has links)
Simulation provides the ability to obtain results from, and analyse, a system without physically building the system. These results can be used to inform the actual construction of the physical system, how best to use a system, how best to integrate a system with another system, and so on. A simulation can also be used to train and educate the end-users of a system either before the system is actually produced or when the availability of the actual system is limited. Most end systems are in some way composed of subsystems. The subsystems themselves may be composed of subsystems. This type of architecture is generically referred to as a system of systems. For example, a ship is composed of a hull, engines, sensors, etc., and the engine system may in turn be composed of fuel and cooling subsystems. Systems constructed this way have numerous benefits, including allowing subsystems to be built independently of each other (after creating well-defined interfaces) and allowing subsystems to be replaced without affecting other subsystems. These same benefits are desirable in the construction of a simulation of a system. One simulation framework that supports these ideals is the High Level Architecture (HLA). The HLA is an international modelling and simulation framework that specifically provides for distributed simulation. The HLA uses the term federate for component simulations that are then brought together in a distributed computing environment to form a federation. The HLA defines a data model for documenting the data interfaces of the federates and the application programming interface used by the federates to communicate data. A simulation of a system of systems architecture can be implemented in the HLA by creating federates for each subsystem and defining the data communicated between subsystems in terms of HLA's data model. HLA's default communication model defines publishers and subscribers of data classes.
The HLA provides class based filtering, i.e., a federate only receives data for a data class to which it has subscribed. However, HLA’s default communication model has no notion of direct ‘wiring’ between federates. Thus, it is not possible to have data sent to a specific federate. This creates a problem if multiple instances of a system of systems are simulated concurrently, which may be desirable so as to observe the interactions between systems. In this case, the data sent within one system is exposed to all other systems in the simulation. This thesis explores this problem of simulating multiple systems of systems using the HLA. The problem is stated formally by introducing the concept of a message path and showing that a federation containing multiple systems of systems contains incorrect message paths which communicate intra-system data between systems. Three methods are presented and shown to solve the problem by either eliminating the incorrect message paths or allowing a receiving federate to determine whether intra-system data was delivered via an incorrect message path. The three solutions are Local Data Filtering (LDF), Data Distribution Management (DDM), and Federation Communities (FC). The LDF solution marks all intra-system data with a system identifier, allowing receivers to distinguish whether they should process it. The DDM method uses a service defined by the HLA that essentially provides an automated version of the LDF solution. The FC method restricts one federation to simulating one system and requires a multiple system simulation to enable inter-federation communication, something that is not defined in the HLA. These three methods are analysed both quantitatively and qualitatively. The quantitative analysis looks at performance overhead imposed by each method and how well each method reduces the number of incorrect intra-system messages communicated. 
The qualitative analysis is presented in terms of identifying the complexity of implementing each method for a specific system of systems federation: the election process for the Australian federal government. The thesis concludes that the LDF method is simple to understand but potentially finicky to implement and is wasteful of network resources. The DDM method is advantageous in that it is a service defined by the HLA standard. However, the implementation of the DDM services by a Runtime Infrastructure (RTI) is not defined by the HLA. Thus, the performance of the DDM method is coupled to a specific RTI and its configurability. The FC method achieves an ideal of replicating the simulation of a single system without modification to achieve a multi-system simulation. However, it requires an inter-federation communication mechanism that is not defined by the HLA. The FC method introduces extra latency and reduced throughput to inter-system messages in a Local Area Network (LAN) environment. / Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 2009
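The LDF method's tag-and-filter idea can be sketched directly: every intra-system message carries a system identifier, and a receiver discards messages whose tag does not match its own system. The federate and message structures below are invented for illustration and do not follow the HLA API.

```python
class Federate:
    """Sketch of Local Data Filtering: class-based subscription delivers
    every message, so the federate filters locally by system identifier."""
    def __init__(self, name, system_id):
        self.name = name
        self.system_id = system_id
        self.delivered = []

    def receive(self, message):
        # Keep only intra-system data tagged with our own system.
        if message['system_id'] == self.system_id:
            self.delivered.append(message['payload'])

def publish(subscribers, system_id, payload):
    """Deliver to all subscribers of the data class, as class-based
    filtering would; the tag lets receivers sort out the rest."""
    message = {'system_id': system_id, 'payload': payload}
    for federate in subscribers:
        federate.receive(message)

# Two concurrent instances of the same system of systems: both engine
# federates subscribe to the class, but only ship A's processes the data.
ship_a_engine = Federate('engine', system_id='ship-A')
ship_b_engine = Federate('engine', system_id='ship-B')
publish([ship_a_engine, ship_b_engine], 'ship-A', 'fuel-level=0.7')
```

Note that the incorrect message path still exists, ship B's federate receives the message over the network; LDF only prevents it from being processed, which is why the thesis also counts the wasted bandwidth against this method.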
|