• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 2
  • 2
  • 1
  • Tagged with
  • 6
  • 6
  • 3
  • 3
  • 3
  • 3
  • 3
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Balancing Performance, Area, and Power in an On-Chip Network

Gold, Brian 06 August 2003 (has links)
Several trends can be observed in modern microprocessor design. Architectures have become increasingly complex while design time continues to dwindle. As feature sizes shrink, wire resistance and delay increase, limiting architects from scaling designs centered around a single thread of execution. Where previous decades have focused on exploiting instruction-level parallelism, emerging applications such as streaming media and on-line transaction processing have shown greater thread-level parallelism. Finally, the increasing gap between processor and off-chip memory speeds has constrained performance of memory-intensive applications. The Single-Chip Message Passing (SCMP) parallel computer sits at the confluence of these trends. SCMP is a tiled architecture consisting of numerous thread-parallel processor and memory nodes connected through a structured interconnection network. Using an interconnection network removes global, ad-hoc wiring that limits scalability and introduces design complexity. However, routing data through general purpose interconnection networks can come at the cost of dedicated bandwidth, longer latency, increased area, and higher power consumption. Understanding the impact architectural decisions have on cost and performance will aid in the eventual adoption of general purpose interconnects. This thesis covers the design and analysis of the on-chip network and its integration with the SCMP system. The result of these efforts is a framework for analyzing on-chip interconnection networks that considers network performance, circuit area, and power consumption. / Master of Science
2

The Development And Hardware Implementation Of A High-speed Adaptable Packet Switch Fabric

Akbaba, Erdem Eyup 01 February 2013 (has links) (PDF)
Routers have to be fast enough to keep pace with increasing traffic data rate because of the increasing need for network bandwidth and processing. The switch fabric component of a router is a combination of hardware and software which moves the incoming packets to the outgoing ports. The access of the input ports to the switch fabric is controlled by a scheduler which affects the overall performance together with the fabric design. In this thesis we investigate two switch fabric and scheduler architectures, the well-known iSlip fabric scheduler and the Byte-Focal switch. We observe that these two architectures have different behaviors under different input traffic load ranges. The novel contribution of this thesis is a combined switch architecture which is composed of these two architectures that are implemented and run in parallel to selectively forward the packets with lower delay to the outputs to achieve an overall lower average delay. The design of the combined switch is carried out on FPGA and simulated. Our results show that the combined architecture has 100% throughput and a lower average delay compared to the Byte-Focal switch and the input-queued switch with iSlip. On the other hand, our combined switch uses more resources in FPGA than individual iSlip and Byte-Focal switch.
3

Arquitetura de uma rede de interconexão com memória compartilhada baseada na topologia crossbar / Architecture of an interconnection network with shared memory based on the topology crossbar.

Fábio Gonçalves Pessanha 22 March 2013 (has links)
Multi-Processor System-on-Chip (MPSoC) possui vários processadores, em um único chip. Várias aplicações podem ser executadas de maneira paralela ou uma aplicação paralelizável pode ser particionada e alocada em cada processador, a fim de acelerar a sua execução. Um problema em MPSoCs é a comunicação entre os processadores, necessária para a execução destas aplicações. Neste trabalho, propomos uma arquitetura de rede de interconexão baseada na topologia crossbar, com memória compartilhada. Esta arquitetura é parametrizável, possuindo N processadores e N módulos de memórias. A troca de informação entre os processadores é feita via memória compartilhada. Neste tipo de implementação cada processador executa a sua aplicação em seu próprio módulo de memória. Através da rede, todos os processadores têm completo acesso a seus módulos de memória simultaneamente, permitindo que cada aplicação seja executada concorrentemente. Além disso, um processador pode acessar outros módulos de memória, sempre que necessite obter dados gerados por outro processador. A arquitetura proposta é modelada em VHDL e seu desempenho é analisado através da execução paralela de uma aplicação, em comparação à sua respectiva execução sequencial. A aplicação escolhida consiste na otimização de funções objetivo através do método de Otimização por Enxame de Partículas (Particle Swarm Optimization - PSO). Neste método, um enxame de partículas é distribuído igualmente entre os processadores da rede e, ao final de cada interação, um processador acessa o módulo de memória de outro processador, a fim de obter a melhor posição encontrada pelo enxame alocado neste. A comunicação entre processadores é baseada em três estratégias: anel, vizinhança e broadcast. Essa aplicação foi escolhida por ser computacionalmente intensiva e, dessa forma, uma forte candidata a paralelização. / Multi-Processor System-on-Chip (MPSoC) has multiple processors in a single chip. Multiple applications can be executed in parallel or a parallelizable application can be partitioned and allocated to each processor in order to accelerate their execution. One problem in MPSoCs is the communication between the processors required to implement these applications. In this work, we propose the architecture of an interconnection network based on the crossbar topology, with shared memory. This architecture is parameterizable, having N processors and N memory modules. The exchange of information between processors is done via shared memory. In this type of implementation each processor executes its application stored in its own memory module. Through the network, all processors have complete access to their own memory modules simultaneously allowing each application to run concurrently. Moreover, a processor can access other memory modules, whenever it needs to retrieve data generated by another processor. The proposed architecture is modelled in VHDL and its performance is analysed by the execution of a parallel aplication, in comparison to its sequencial one. The chosen application consists of optimizing some objetive functions by using the Particle Swarm Optimization method. In this method, particles of a swarm are distributed among the processors and, at the end of each iteration, a processor accesses the memory module of another one in order to obtain the best position found in the swarm. The communication between processors is based on three strategies: ring, neighbourhood and broadcast. This application was chosen due to its computational intensive characteristic and, therefore, a strong candidate for parallelization.
4

Arquitetura de uma rede de interconexão com memória compartilhada baseada na topologia crossbar / Architecture of an interconnection network with shared memory based on the topology crossbar.

Fábio Gonçalves Pessanha 22 March 2013 (has links)
Multi-Processor System-on-Chip (MPSoC) possui vários processadores, em um único chip. Várias aplicações podem ser executadas de maneira paralela ou uma aplicação paralelizável pode ser particionada e alocada em cada processador, a fim de acelerar a sua execução. Um problema em MPSoCs é a comunicação entre os processadores, necessária para a execução destas aplicações. Neste trabalho, propomos uma arquitetura de rede de interconexão baseada na topologia crossbar, com memória compartilhada. Esta arquitetura é parametrizável, possuindo N processadores e N módulos de memórias. A troca de informação entre os processadores é feita via memória compartilhada. Neste tipo de implementação cada processador executa a sua aplicação em seu próprio módulo de memória. Através da rede, todos os processadores têm completo acesso a seus módulos de memória simultaneamente, permitindo que cada aplicação seja executada concorrentemente. Além disso, um processador pode acessar outros módulos de memória, sempre que necessite obter dados gerados por outro processador. A arquitetura proposta é modelada em VHDL e seu desempenho é analisado através da execução paralela de uma aplicação, em comparação à sua respectiva execução sequencial. A aplicação escolhida consiste na otimização de funções objetivo através do método de Otimização por Enxame de Partículas (Particle Swarm Optimization - PSO). Neste método, um enxame de partículas é distribuído igualmente entre os processadores da rede e, ao final de cada interação, um processador acessa o módulo de memória de outro processador, a fim de obter a melhor posição encontrada pelo enxame alocado neste. A comunicação entre processadores é baseada em três estratégias: anel, vizinhança e broadcast. Essa aplicação foi escolhida por ser computacionalmente intensiva e, dessa forma, uma forte candidata a paralelização. / Multi-Processor System-on-Chip (MPSoC) has multiple processors in a single chip. Multiple applications can be executed in parallel or a parallelizable application can be partitioned and allocated to each processor in order to accelerate their execution. One problem in MPSoCs is the communication between the processors required to implement these applications. In this work, we propose the architecture of an interconnection network based on the crossbar topology, with shared memory. This architecture is parameterizable, having N processors and N memory modules. The exchange of information between processors is done via shared memory. In this type of implementation each processor executes its application stored in its own memory module. Through the network, all processors have complete access to their own memory modules simultaneously allowing each application to run concurrently. Moreover, a processor can access other memory modules, whenever it needs to retrieve data generated by another processor. The proposed architecture is modelled in VHDL and its performance is analysed by the execution of a parallel aplication, in comparison to its sequencial one. The chosen application consists of optimizing some objetive functions by using the Particle Swarm Optimization method. In this method, particles of a swarm are distributed among the processors and, at the end of each iteration, a processor accesses the memory module of another one in order to obtain the best position found in the swarm. The communication between processors is based on three strategies: ring, neighbourhood and broadcast. This application was chosen due to its computational intensive characteristic and, therefore, a strong candidate for parallelization.
5

Automated Generation of Round-robin Arbitration and Crossbar Switch Logic

Shin, Eung Seo 25 November 2003 (has links)
The objective of this thesis is to automate the design of round-robin arbiter logic. The resulting arbitration logic is more than 1.8X times faster than the fastest prior state-of-the-art arbitration logic the author could find reported in the literature. The generated arbiter implemented in a single chip is fast enough in 0.25ьm CMOS technology to achieve terabit switching with a single chip computer network switch. Moreover, this arbiter is applicable to crossbar (Xbar) arbitration logic. The generated Xbar, customized according to user specifications, provides multiple communication paths among masters and slaves. As the number of transistors on a single chip increases rapidly, there is a productivity gap between the number of transistors available in a chip and the number of transistors per hour a chip designer designs. One solution to reduce this productivity gap is to increase the use of Silicon Intellectual Property (SIP) cores. However, a SIP core should be customized before being used in a system different than the one for which it was designed. Thus, to reconfigure the SIP core, either an engineer must spend significant effort altering the core by hand or else an enhanced CAD tool can automatically customize the core according to customer specifications. In this thesis, we present SIP generator tools for arbiter and Xbar generation. First, we introduce a Round-robin Arbiter Generator (RAG). The RAG can generate a hierarchical Bus Arbiter (BA) which is faster than all known previous approaches. RAG can also generate a hierarchical Switch Arbiter (SA) which is faster than all known previous approaches. Using a 0.25ьm TSMC standard cell library from LEDA Systems, we show the arbitration time of a 32x32 SA and demonstrate that our SA meets the time constraint to achieve terabit throughput. Furthermore, using a novel token-passing hierarchical arbitration scheme, our 32x32 SA performs better than the Ping-Pong Arbiter and Programmable Priority Encoder by factors of 1.8X and 2.3X, respectively, with less power dissipation. Finally, we present an Xbar switch Generator (X-Gt) tool that automatically configures a crossbar for a multiprocessor System-on-a-Chip (SoC). An Xbar is generated in Register Transfer Level (RTL) Verilog HDL.
6

Wafer-level heterogeneous integration of MEMS actuators

Braun, Stefan January 2010 (has links)
This thesis presents methods for the wafer-level integration of shape memory alloy (SMA) and electrostatic actuators to functionalize MEMS devices. The integration methods are based on heterogeneous integration, which is the integration of different materials and technologies. Background information about the actuators and the integration method is provided. SMA microactuators offer the highest work density of all MEMS actuators, however, they are not yet a standard MEMS material, partially due to the lack of proper wafer-level integration methods. This thesis presents methods for the wafer-level heterogeneous integration of bulk SMA sheets and wires with silicon microstructures. First concepts and experiments are presented for integrating SMA actuators with knife gate microvalves, which are introduced in this thesis. These microvalves feature a gate moving out-of-plane to regulate a gas flow and first measurements indicate outstanding pneumatic performance in relation to the consumed silicon footprint area. This part of the work also includes a novel technique for the footprint and thickness independent selective release of Au-Si eutectically bonded microstructures based on localized electrochemical etching. Electrostatic actuators are presented to functionalize MEMS crossbar switches, which are intended for the automated reconfiguration of copper-wire telecommunication networks and must allow to interconnect a number of input lines to a number of output lines in any combination desired. Following the concepts of heterogeneous integration, the device is divided into two parts which are fabricated separately and then assembled. One part contains an array of double-pole single-throw S-shaped actuator MEMS switches. The other part contains a signal line routing network which is interconnected by the switches after assembly of the two parts. The assembly is based on patterned adhesive wafer bonding and results in wafer-level encapsulation of the switch array. During operation, the switches in these arrays must be individually addressable. Instead of controlling each element with individual control lines, this thesis investigates a row/column addressing scheme to individually pull in or pull out single electrostatic actuators in the array with maximum operational reliability, determined by the statistical parameters of the pull-in and pull-out characteristics of the actuators. / QC20100729

Page generated in 0.0671 seconds