  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Reconfigurable Backplane Topology

Rajendra Prasad, Gunda, Ajay Kumar, Thenmatam, Srinivasa Rao, Kurapati January 2006
In the field of embedded computer and communication systems, the demands on interconnection networks are increasing rapidly. To satisfy these demands, many advances have been made at the chip level as well as at the system level, and research is still ongoing to make interconnection networks flexible enough to meet the demands of real-time applications. This thesis mainly focuses on the interconnection between the nodes in an embedded system via a reconfigurable backplane. To satisfy the project goals, an algorithm is written for a reconfigurable topology that changes according to a given traffic specification, such as throughput. Initially, connections are established between pairs of nodes according to the given throughput demands. Once all the connections are established, a topology is formed. Then a possible path is chosen for carrying the data from source to destination nodes. The algorithm is then implemented in simulation and the results are shown in tabular form. Through some application examples, we both identify problems with the algorithm and propose an improvement to deal with such problems.
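The two steps described in this abstract (establish connections from throughput demands, then choose a path) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the greedy highest-demand-first ordering, the fixed `ports_per_node` limit, the breadth-first path search, and all function names are assumptions made only for illustration.

```python
from collections import deque


def build_topology(demands, ports_per_node):
    """Greedily add direct links, highest throughput demand first.

    demands: list of (src, dst, throughput) tuples (hypothetical format).
    ports_per_node: assumed limit on links per node.
    """
    free = {}
    for src, dst, _ in demands:
        free.setdefault(src, ports_per_node)
        free.setdefault(dst, ports_per_node)
    links = set()
    for src, dst, _ in sorted(demands, key=lambda d: -d[2]):
        link = frozenset((src, dst))
        if src == dst or link in links:
            continue
        # A link is only possible while both endpoints have a free port.
        if free[src] > 0 and free[dst] > 0:
            links.add(link)
            free[src] -= 1
            free[dst] -= 1
    return links


def find_path(links, src, dst):
    """Breadth-first search; returns a fewest-hop path or None."""
    adj = {}
    for link in links:
        a, b = tuple(link)
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in sorted(adj.get(node, [])):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


demands = [("A", "B", 10), ("B", "C", 8), ("A", "C", 5), ("C", "D", 3)]
links = build_topology(demands, ports_per_node=2)
print(find_path(links, "A", "C"))  # direct link exists
print(find_path(links, "C", "D"))  # None: node C ran out of ports
```

In this toy input the lowest-priority demand is left without any route, which is the kind of problem a purely greedy scheme can run into.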
2

Reconfigurable Backplane Topology

Rajendra Prasad, Gunda, Ajay Kumar, Thenmatam, Srinivasa Rao, Kurapati January 2006
In the field of embedded computer and communication systems, the demands on interconnection networks are increasing rapidly. To satisfy these demands, many advances have been made at the chip level as well as at the system level, and research is still ongoing to make interconnection networks flexible enough to meet the demands of real-time applications.

This thesis mainly focuses on the interconnection between the nodes in an embedded system via a reconfigurable backplane. To satisfy the project goals, an algorithm is written for a reconfigurable topology that changes according to a given traffic specification, such as throughput. Initially, connections are established between pairs of nodes according to the given throughput demands. Once all the connections are established, a topology is formed. Then a possible path is chosen for carrying the data from source to destination nodes. The algorithm is then implemented in simulation and the results are shown in tabular form. Through some application examples, we both identify problems with the algorithm and propose an improvement to deal with such problems.
3

Routing Statistics for Unqueued Banyan Networks

Knight, Thomas F., Jr., Sobalvarro, Patrick G. 01 September 1990
Banyan networks comprise a large class of networks that have been used for interconnection in large-scale multiprocessors and telephone switching systems. Regular variants of Banyan networks, such as delta and butterfly networks, have been used in multiprocessors such as the IBM RP3 and the BBN Butterfly. Analysis of the performance of Banyan networks has typically focused on these regular variants. We present a methodology for performance analysis of unbuffered Banyan multistage interconnection networks. The methodology has two novel features: it allows analysis of networks where some inputs are more likely to be active than others, and allows analysis of Banyan networks of arbitrary topology.
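As a rough illustration of analysing an unbuffered network whose inputs are not equally active, the sketch below propagates per-line request probabilities through a butterfly of 2x2 switches. It uses the classic independence-based recurrence, not necessarily the methodology of this report. The assumptions are: each active input requests either switch output with probability 1/2, conflicting packets are dropped, stage `s` pairs lines whose indices differ in bit `s`, and the function name is hypothetical.

```python
def butterfly_output_load(input_probs):
    """Per-output probability of carrying a packet in one cycle.

    input_probs: probability that each input line is active; the length
    must be a power of two (assumed butterfly of 2x2 switches).
    """
    n_lines = len(input_probs)
    if n_lines < 2 or n_lines & (n_lines - 1):
        raise ValueError("number of inputs must be a power of two >= 2")
    probs = list(input_probs)
    bit = 1
    while bit < n_lines:
        nxt = probs[:]
        for i in range(n_lines):
            j = i ^ bit  # the line sharing a 2x2 switch with line i
            if i < j:
                # An output is idle only if neither input requests it.
                q = 1.0 - (1.0 - probs[i] / 2.0) * (1.0 - probs[j] / 2.0)
                nxt[i] = nxt[j] = q
        probs = nxt
        bit <<= 1
    return probs


print(butterfly_output_load([1.0, 1.0, 1.0, 1.0]))  # uniform full load
print(butterfly_output_load([1.0, 0.0, 0.0, 0.0]))  # one active input
```

Summing the returned list gives the expected number of packets delivered per cycle. With a single active input the sum is 1, since no conflicts can occur.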
4

Performance evaluation of Distributed Crossbar Switch Hypermesh

Loucif, Samia January 1999
No description available.
5

Accelerating Communication in On-Chip Interconnection Networks

Ahn, Minseon May 2012
Due to the ever-shrinking feature size in CMOS process technology, it is expected that future chip multiprocessors (CMPs) will have hundreds or thousands of processing cores. To support such a large number of cores, packet-switched on-chip interconnection networks have become a de facto communication paradigm in CMPs. However, on-chip networks have several drawbacks, such as limited on-chip resources, increasing communication latency, and insufficient communication bandwidth. In this dissertation, several schemes are proposed to accelerate communication in on-chip interconnection networks within area and cost budgets to overcome these problems. First, an early transition scheme for fully adaptive routing algorithms is proposed to improve network throughput. With a limited number of resources, previously proposed fully adaptive routing algorithms have low utilization of escape channels. To increase utilization of escape channels, the scheme transfers packets to them early, before the normal channels are full. Second, a pseudo-circuit scheme is proposed to reduce network latency using communication temporal locality. Reducing per-hop router delay becomes more important for communication latency reduction in larger on-chip interconnection networks. To improve communication latency, the previous arbitration information is reused to bypass switch arbitration. For further acceleration, we also propose two aggressive schemes, pseudo-circuit speculation and buffer bypassing. Third, two handshake schemes are proposed to improve network throughput for nanophotonic interconnects. Nanophotonic interconnects have been proposed to replace metal wires with optical links in on-chip interconnection networks for low latency and power consumption as well as high bandwidth. To minimize the average token waiting time of the nanophotonic interconnects, the traditional credit-based flow control is removed. Thus, the handshake schemes increase link utilization and enhance network throughput.
6

Reliable low latency I/O in torus-based interconnection networks

Azeez, Babatunde 25 April 2007
In today's high performance computing environment, I/O remains the main bottleneck in achieving the optimal performance expected of ever-improving processor and memory technologies. Interconnection networks therefore combine processing units, system I/O and high speed switch network fabric into a new paradigm of I/O based network. This paradigm decouples the system into computational and I/O interconnections, each allowing "any-to-any" communications among processors and I/O devices, unlike the shared model in bus architecture. The computational interconnection, a network of processing units (compute-nodes), is used for inter-processor communication in carrying out computation tasks, while the I/O interconnection manages the transfer of I/O requests between the compute-nodes and the I/O or storage media through some dedicated I/O processing units (I/O-nodes). Considering the special functions performed by the I/O-nodes, their placement and reliability become important issues in improving the overall performance of the interconnection system. This thesis focuses on design and topological placement of I/O-nodes in torus based interconnection networks, with the aim of reducing I/O communication latency between compute-nodes and I/O-nodes even in the presence of faulty I/O-nodes. We propose an efficient and scalable relaxed quasi-perfect placement scheme using Lee distance error correction code such that compute-nodes are at distance-t or at most distance-t+1 from an I/O-node for a given t. This scheme provides a better and optimal alternative placement than quasi-perfect placement when perfect placement cannot be found for a particular torus. Furthermore, when I/O-nodes fail, the placement scheme is also used to determine alternative I/O-nodes for rerouting I/O traffic from affected compute-nodes with minimal slowdown.
In order to guarantee the quality of service required of inter-processor communication, a scheduling algorithm was developed at the router level to prioritize message forwarding according to inter-process and I/O messages with the former given higher priority. Our simulation results show that relaxed quasi-perfect outperforms quasi-perfect and the conventional I/O placement (where I/O nodes are concentrated at the base of the torus interconnection) with little degradation in inter-process communication performance. Also the fault tolerant redirection scheme provides a minimal slowdown, especially when the number of faulty I/O nodes is less than half of the initial available I/O nodes.
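To make the distance-t condition above concrete, the sketch below computes the Lee distance on a torus and the largest distance from any node to its nearest I/O-node. It is only a checker, not the thesis's placement scheme; the function names and the example placement are assumptions chosen for illustration.

```python
from itertools import product


def lee_distance(a, b, dims):
    """Lee distance between nodes a and b on a torus with the given sizes."""
    return sum(min((x - y) % k, (y - x) % k) for x, y, k in zip(a, b, dims))


def coverage_radius(io_nodes, dims):
    """Largest distance from any torus node to its nearest I/O-node."""
    worst = 0
    for node in product(*(range(k) for k in dims)):
        nearest = min(lee_distance(node, io, dims) for io in io_nodes)
        worst = max(worst, nearest)
    return worst


dims = (5, 5)
# Example placement (i, 2i mod 5): a perfect distance-1 placement on 5x5.
io_nodes = [(i, (2 * i) % 5) for i in range(5)]
print(coverage_radius(io_nodes, dims))
```

A placement satisfies the condition in the abstract for a given t when `coverage_radius` returns at most t+1. On a 5x5 torus the example placement gives radius 1, so every compute-node is adjacent to an I/O-node.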
7

Performance analysis of multistage interconnection networks with general traffic

Lin, Hua January 2002
No description available.
8

Design, development and evaluation of an efficient hierarchical interconnection network.

Campbell, Stuart M. January 1999
Parallel computing has long been an area of research interest because exploiting parallelism in difficult problems has promised to deliver orders of magnitude speedups. Processors are now both powerful and cheap, so that systems incorporating tens, hundreds or even thousands of powerful processors need not be prohibitively expensive. The weak link in exploiting parallelism is the means of communication between the processors. Shared memory systems are fundamentally limited in the number of processors they can utilise. To achieve high levels of parallelism it is still necessary to use distributed memory and some form of interconnection network. But interconnection networks can be costly, slow, difficult to build and expand, vulnerable to faults and limited in the range of problems they can be used to solve effectively. As a result there has been extensive research into developing interconnection networks which overcome some or all of these difficulties. In this thesis it is argued that a new interconnection network, Hierarchical Cliques (HiC), and a derivative, FatHiC, possess many desirable properties and are worthy of consideration for use in building parallel computers. A fundamental element of an interconnection network is its topology. After defining the topology of HiC, expressions are derived for the various parameters which define its underlying limits of performance and fault tolerance. A second element of an interconnection network is an addressing and routing scheme. The addressing scheme and routing algorithms of HiC are described. The flexibility of HiC is demonstrated by developing embeddings of popular, regular interconnection networks. Some embeddings into HiC suffer from high congestion; however, the FatHiC network is shown to have low congestion for those embeddings. The performance of some important, regular, data parallel problems on HiC and FatHiC is determined by analysis and simulation, using the 2D-mesh as a means of comparison.
But performance alone does not tell the whole story. Any parallel computer system must be cost effective. In order to analyse the cost effectiveness of HiCs an existing measure was expanded to provide a more realistic model and a more accurate means of comparison. One aim of this thesis is to demonstrate the suitability of HiC for parallel computing systems which execute irregular algorithms requiring dynamic load balancing. A new dynamic load balancing algorithm is proposed which takes advantage of the hierarchical structure of the HiC to reduce communication overheads incurred when distributing work. To demonstrate performance of an irregular problem, a novel parallel algorithm was developed to detect subgraph isomorphism from many model graphs to a single input graph. The use of the new load balancing algorithm in conjunction with the subgraph isomorphism algorithm is discussed.
9

IMPROVING MESSAGE-PASSING PERFORMANCE AND SCALABILITY IN HIGH-PERFORMANCE CLUSTERS

RASHTI, Mohammad Javad 26 January 2011
High Performance Computing (HPC) is the key to solving many scientific, financial, and engineering problems. Computer clusters are now the dominant architecture for HPC. The scale of clusters, both in terms of processors per node and the number of nodes, is increasing rapidly, reaching petascale today and soon exascale. Inter-process communication plays a significant role in the overall performance of HPC applications. With the continuous enhancements in interconnection technologies and node architectures, the Message Passing Interface (MPI) needs to be improved to effectively utilize the modern technologies for higher performance. After providing a background, I present a deep analysis of the user level and MPI libraries over modern cluster interconnects: InfiniBand, iWARP Ethernet, and Myrinet. Using novel techniques, I assess characteristics such as overlap and communication progress ability, buffer reuse effect on latency, and multiple-connection scalability. The outcome highlights some of the inefficiencies that exist in the communication libraries. To improve communication progress and overlap in large message transfers, a method is proposed which uses speculative communication to overlap communication with computation in the MPI Rendezvous protocol. The results show up to 100% communication progress and more than 80% overlap ability over iWARP Ethernet. An adaptation mechanism is employed to avoid overhead on applications that do not benefit from the method due to their timing specifications. To reduce MPI communication latency, I have proposed a technique that exploits the application buffer reuse characteristics for small messages and eliminates the sender-side copy in both two-sided and one-sided MPI small message transfer protocols. The implementation over InfiniBand improves small message latency by up to 20%. The implementation adaptively falls back to the current method if the application does not benefit from the proposed technique.
Finally, to improve scalability of MPI applications on ultra-scale clusters, I have proposed an extension to the current iWARP standard. The extension improves performance and memory usage for large-scale clusters. The extension equips Ethernet with an efficient zero-copy, connection-less datagram transport. The software-level evaluation shows more than 40% performance benefits and 30% memory usage reduction for MPI applications on a 64-core cluster. / Thesis (Ph.D, Electrical & Computer Engineering) -- Queen's University, 2010-10-16 12:25:18.388
10

All Optical Switching Architectures

Sathyan, Saju January 2006
In communication systems, high bandwidth interconnects and efficient distribution of large amounts of data are essential. This thesis addresses all-optical packet switching issues in the field of reconfigurable optical interconnection networks for high performance embedded systems. Recent research conducted at Halmstad University on high performance embedded systems focuses on optical interconnection techniques to achieve ultra high throughputs and reconfigurability at the system level. Recent research in the field of optical interconnection networks, for applications like switches and routers for the data and telecommunication industry and parallel computing architectures for embedded signal processing, uses optical to electrical conversion to switch packets. This conversion scales down the enormous bandwidth capacity of the optical communication channels to electronic processing rates. To maintain high throughput throughout the interconnection network, the optical packets need to be kept in the optical state and switched to different parts of the interconnection network. To achieve this goal, all-optical packet switching architectures are studied. The study concludes with a positive outlook towards all-optical switching technologies, which are expected to play a very important role in the near future in the field of optical communication, telecommunication and embedded systems.
