331 |
Communication performance measurement and analysis on commodity clusters / Abdul Hamid, Nor Asilah Wati January 2008
Cluster computers have become the dominant architecture in high-performance computing. Parallel programs on these computers are mostly written using the Message Passing Interface (MPI) standard, so the communication performance of the MPI library for a cluster is very important. This thesis investigates several different aspects of performance analysis for MPI libraries, on both distributed memory clusters and shared memory parallel computers. The performance evaluation was done using MPIBench, a new MPI benchmark program that provides some useful new functionality compared to existing MPI benchmarks. Since there has been only limited previous use of MPIBench, some initial work was done on comparing MPIBench with other MPI benchmarks, and improving its functionality, reliability, portability and ease of use. This work included a detailed comparison of results from the Pallas MPI Benchmark (PMB), SKaMPI, Mpptest, MPBench and MPIBench on both distributed memory and shared memory parallel computers, which has not previously been done. This comparison showed that the results for some MPI routines were significantly different between the different benchmarks, particularly for the shared memory machine. A comparison was done between Myrinet and Ethernet network performance on the same machine, an IBM Linux cluster with 128 dual processor nodes, using the MPICH MPI library. The analysis focused mainly on the scalability and variability of communication times for the different networks, making use of the capability of MPIBench to generate distributions of MPI communication times. The analysis provided an improved understanding of the effects of TCP retransmission timeouts on Ethernet networks. This analysis showed anomalous results for some MPI routines. 
Further investigation showed that this is because MPICH uses different algorithms for small and large message sizes for some collective communication routines, and the message size where this changeover occurs is fixed, based on measurements using a cluster with a single processor per node. Experiments were done to measure the performance of the different algorithms, which demonstrated that for some MPI routines the optimal changeover points were very different between Myrinet and Ethernet networks and for 1 and 2 processors per node. Significant performance improvements can be made by allowing the changeover points to be tuned rather than fixed, particularly for commodity Ethernet networks and for clusters with more than 1 process per node. MPIBench was also used to analyse the MPI performance and scalability of a large ccNUMA shared memory machine, an SGI Altix 3000 with 160 processors. The results were compared with a high-end cluster, an AlphaServer SC with Quadrics QsNet interconnect. For most MPI routines the Altix showed significantly better performance, particularly when non-buffered copy was used. MPIBench proved to be a very capable tool for analyzing MPI performance in a variety of different situations. / http://proxy.library.adelaide.edu.au/login?url= http://library.adelaide.edu.au/cgi-bin/Pwebrecon.cgi?BBID=1331421 / Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 2008
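The idea of tuning a changeover point, rather than fixing it, can be illustrated with a minimal Python sketch. It assumes two timing series have already been measured for the same collective routine, one per algorithm, at the same message sizes. The function name `best_changeover` and all timing numbers are hypothetical placeholders for illustration; they are not results from the thesis and do not reflect MPICH internals.

```python
import math

def best_changeover(sizes, t_small_alg, t_large_alg):
    """Return the message size at which to switch algorithms.

    The small-message algorithm is used for sizes below the returned
    threshold and the large-message algorithm at or above it. The
    threshold chosen is the one that minimizes the summed time over all
    measured sizes. math.inf means "never switch".
    """
    candidates = list(sizes) + [math.inf]

    def total_time(threshold):
        return sum(
            ts if n < threshold else tl
            for n, ts, tl in zip(sizes, t_small_alg, t_large_alg)
        )

    return min(candidates, key=total_time)

# Made-up timings (arbitrary units) for one network and process layout.
sizes = [1024, 4096, 16384, 65536]
t_small_alg = [10, 30, 120, 500]
t_large_alg = [25, 40, 90, 300]
threshold = best_changeover(sizes, t_small_alg, t_large_alg)
```

Running the same selection separately for each network and each number of processes per node would give one threshold per configuration, which is the kind of tuning the abstract argues for.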
|
332 |
On resource placements and fault-tolerant broadcasting in toroidal networks / AlMohammad, Bader Fahed AlBedaiwi 13 November 1997
Parallel computers are classified into multiprocessors and multicomputers. A
multiprocessor system usually has a shared memory through which its processors
can communicate. On the other hand, the processors of a multicomputer system
communicate by message passing through an interconnection network. A widely
used class of interconnection networks is the toroidal networks. Compared to a
hypercube, a torus has a larger diameter, but better tradeoffs, such as higher channel
bandwidth and lower node degree. Results on resource placements and fault-tolerant
broadcasting in toroidal networks are presented.
Given a limited number of resources, it is desirable to distribute these resources
over the interconnection network so that the distance between a non-resource and a
closest resource is minimized. This problem is known as distance-d placement. In
such a placement, each non-resource must be within a distance of d or less from at
least one resource, where the number of resources used is the least possible. Solutions
for distance-d placements in 2D and 3D tori are proposed. These solutions are
compared with placements used so far in practice. Simulation experiments show
that the proposed solutions are superior to the placements used in practice in terms of reducing average network latency.
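The coverage condition of a distance-d placement can be checked mechanically. The sketch below is an illustration only: the function names are hypothetical, and the example placement is the well-known perfect distance-1 placement on a 5x5 torus, with resources at (i, 2i mod 5), not a construction taken from the thesis. It checks that every node is within distance d of a resource; minimality is argued separately, since each resource in a 2D torus covers at most 2d^2 + 2d + 1 nodes (5 nodes for d = 1), so a 25-node torus needs at least 5 resources.

```python
from itertools import product

def torus_distance(a, b, dims):
    """Shortest hop count between nodes a and b in a torus with the
    given side lengths (wrap-around links in every dimension)."""
    return sum(
        min(abs(x - y), k - abs(x - y))
        for x, y, k in zip(a, b, dims)
    )

def covers_within(resources, dims, d):
    """True if every node of the torus lies within distance d of at
    least one resource node."""
    nodes = product(*(range(k) for k in dims))
    return all(
        any(torus_distance(node, r, dims) <= d for r in resources)
        for node in nodes
    )

# Illustrative placement: 5 resources on a 5x5 torus, distance 1.
dims = (5, 5)
resources = [(i, (2 * i) % 5) for i in range(5)]
ok = covers_within(resources, dims, 1)
```

The same checker works unchanged for 3D tori by passing three side lengths and three-coordinate resource tuples.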
The complexity of a multicomputer increases the chances of processor failures. Designing fault-tolerant communication algorithms is therefore necessary for effective utilization of such a system. Broadcasting (single-node one-to-all) is one of the important communication primitives in a multicomputer. A non-redundant fault-tolerant broadcasting algorithm for a faulty toroidal network is designed. The algorithm can adapt to up to (2n-2) processor failures. Compared to the optimal algorithm in a fault-free n-dimensional toroidal network, the proposed algorithm requires at most 3 extra communication steps using cut-through packet routing, and (n + 1) extra steps using store-and-forward routing. / Graduation date: 1998
|
333 |
Distributed systems, hardware-in-the-loop simulation, and applications in control systems / Handrigan, Paul, January 2004
Thesis (M.Eng.)--Memorial University of Newfoundland, 2005. / Bibliography: leaves 124-128.
|
334 |
Design and performance analysis of MPI-SHARC: a high-speed network service for distributed digital signal processor systems / Kohout, James, January 2001
Thesis (M.S.)--University of Florida, 2001. / Title from first page of PDF file. Document formatted into pages; contains ix, 69 p.; also contains graphics. Vita. Includes bibliographical references (p. 66-68).
|
335 |
Automatic program restructuring for distributed memory multicomputers / Ikei, Mitsuru 04 1900
M.S. / Computer Science and Engineering / To compile a Single Program Multiple Data (SPMD) program for a Distributed Memory Multicomputer (DMMC), we need to find data in the program that can be processed in parallel, and we need to distribute the data among processors so that interprocessor communication stays reasonably small. Loop restructuring is needed to find parallelism in imperative programs, and array alignment is one effective step for reducing the interprocessor communication caused by array references. Automatic conversion of imperative programs using these two program restructuring steps has been implemented in the Tiny loop restructuring tool. The restructuring strategy is derived by translating the approach used by the compiler for the functional language Crystal to the imperative language Tiny. Although an imperative language can have more varied loop structures than a functional language, which makes it more difficult to select the optimal one, we can obtain a loop structure comparable to the one obtained for Crystal. We can also find array alignment preference (temporal + spatial) relations in a Tiny source program, and we add a new construct to Tiny, the align statement, to express these array alignment preferences. In this thesis, we discuss the program restructuring strategies used for Tiny by comparison with Crystal.
|
336 |
Implementation of display control node for a distributed microcontroller network / Xu, Yan, 1954- 10 October 1990
As hardware becomes cheaper and cheaper, there is an
increasing interest in the implementation of distributed control
networks with microcontrollers. Such networks are usually
inexpensive and of high performance.
In this thesis work, a low-cost display control unit for COLAN, a
control-oriented network, is designed. The system is based on the
8051 single-chip microprocessor and 82716 video controller.
COLAN network users can remotely display and control the screen by
sending commands to the display control node through the network.
The emphasis of this thesis is on the implementation of the
display control node, including hardware and software. It has been
demonstrated that a basic set of display control functions has been
developed, different types of monitors can be supported, and on-screen
instructions make the system easy to use. / Graduation date: 1991
|
337 |
Software Tools for Separating Distribution Concerns / Tilevich, Eli 18 November 2005
With the advent of the Internet, distributed programming has become a necessity for the majority of application domains. Nevertheless, programming distributed systems remains a delicate and complex task. This dissertation explores separating distribution concerns, the process of transforming a centralized monolithic program into a distributed one. This research develops algorithms, techniques, and tools for separating distribution concerns and evaluates the applicability of the developed artifacts by identifying the distribution concerns that they separate and the common architectural characteristics of the centralized programs that they transform successfully. The thesis of this research is that software tools working with standard mainstream languages, systems software, and virtual machines can effectively and efficiently separate distribution concerns from application logic for object-oriented programs that use multiple distinct sets of resources. Among the specific technical contributions of this dissertation are (1) a general algorithm for call-by-copy-restore semantics in remote procedure calls for linked data structures, (2) an analysis heuristic that determines which application objects get passed to which parts of native (i.e., platform-specific) code in the language runtime system for platform-independent binary code applications, (3) a technique for injecting code in such applications that will convert objects to the right representation so that they can be accessed correctly inside both application and native code, (4) an approach to maintaining the Java centralized concurrency and synchronization semantics over remote procedure calls efficiently, and (5) an approach to enabling the execution of legacy Java code remotely from a web browser.
The technical contributions of this dissertation have been realized in three software tools for separating distribution concerns: NRMI, middleware with copy-restore semantics; GOTECH, a program generator for distribution; and J-Orchestra, an automatic partitioning system. This dissertation presents several case studies of successfully applying the developed tools to third-party programs.
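Call-by-copy-restore for linked data structures can be sketched in a few lines of Python. This is a toy, single-process illustration of the general idea, not the NRMI algorithm or its API: the `Node` class and the function `call_by_copy_restore` are hypothetical names, and the "remote" procedure is just a local function that only ever sees copies. The point it shows is the restore step: after the call, changes are written back into the caller's original objects, including objects the callee unlinked, so that aliases held by the caller still observe the updates.

```python
class Node:
    """Singly linked list cell (hypothetical example type)."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def _reachable(roots):
    """All distinct nodes reachable from the given roots."""
    seen, order, stack = set(), [], list(roots)
    while stack:
        node = stack.pop()
        if node is None or id(node) in seen:
            continue
        seen.add(id(node))
        order.append(node)
        stack.append(node.next)
    return order

def call_by_copy_restore(func, root):
    originals = _reachable([root])
    # Copy phase: duplicate every reachable node, preserving links.
    to_copy = {id(o): Node(o.value) for o in originals}
    for o in originals:
        if o.next is not None:
            to_copy[id(o)].next = to_copy[id(o.next)]
    copies = [to_copy[id(o)] for o in originals]
    to_orig = {id(c): o for c, o in zip(copies, originals)}

    func(to_copy[id(root)])  # the callee mutates only the copies

    # Restore phase: write the state of every copy back into its
    # original, and redirect links in callee-created nodes so they
    # point at originals rather than at copies.
    for node in _reachable(copies):
        target = to_orig.get(id(node), node)
        target.value = node.value
        nxt = node.next
        target.next = None if nxt is None else to_orig.get(id(nxt), nxt)

# Usage: the callee updates b, unlinks it, and appends a new node.
c = Node(3)
b = Node(2, c)
a = Node(1, b)

def remote(head):
    head.next.value = 99
    head.next = head.next.next
    head.next.next = Node(7)

call_by_copy_restore(remote, a)
```

After the call, `a.next` is the original `c`, the unlinked `b` still carries the updated value 99, and the new node created by the callee hangs off `c`. Plain call-by-copy would have lost all three effects.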
|
338 |
A formal framework for modelling component extension and layers in distributed embedded systems / Förster, Stefan. January 2007
Techn. Univ., Diss.--Chemnitz.
|
339 |
High-speed network interface for commodity SMP clusters / Wong, Kwan-po. January 2000
Thesis (M. Phil.)--University of Hong Kong, 2001. / Includes bibliographical references (p. 118-122).
|
340 |
Strategic analysis of a data processing company / Chen, George C. M. January 2005
Research Project (M.B.A.) - Simon Fraser University, 2005. / Research Project (Faculty of Business Administration) / Simon Fraser University. Senior supervisor : Dr. Ed Bukszar. EMBA Program. Also issued in digital format and available on the World Wide Web.
|