Global ETD Search

1	Performance Evaluation of the On-Chip Communications in a Network-on-Chip System Hariharan, Sriram 23 May 2005 (has links) No description available. Network-on-Chip
2	Test and fault-tolerance for network-on-chip infrastructures Grecu, Cristian 05 1900 (has links) The demands of future computing, as well as the challenges of nanometer-era VLSI design, will require new design techniques and design styles that are simultaneously high performance, energy-efficient, and robust to noise and process variation. One of the emerging problems concerns the communication mechanisms between the increasing number of blocks, or cores, that can be integrated onto a single chip. The bus-based systems and point-to-point interconnection strategies in use today cannot be easily scaled to accommodate the large numbers of cores projected in the near future. Network-on-chip (NoC) interconnect infrastructures are one of the key technologies that will enable the emergence of many-core processors and systems-on-chip with increased computing power and energy efficiency. This dissertation is focused on testing, yield improvement and fault-tolerance of such NoC infrastructures. A fast, efficient test method is developed for NoCs, that exploits their inherent parallelism to reduce the test time by transporting test data on multiple paths and testing multiple NoC components concurrently. The improvement of test time varies, depending on the NoC architecture and test transport protocol, from 2X to 34X, compared to current NoC test methods. This test mechanism is used subsequently to perform detection of NoC link permanent faults, which are then repaired by an on-chip mechanism that replaces the faulty signal lines with fault-free ones, thereby increasing the yield, while maintaining the same wire delay characteristics. The solution described in this dissertation improves significantly the achievable yield of NoC inter-switch channels, from a 4% improvement for an 8-bit wide channel to a 71% improvement for a 128-bit wide channel. The direct benefit is an improved fault-tolerance and increased yield and long-term reliability of NoC based multicore systems. 
Network-on-chip Fault tolerance
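The relationship between channel width, spare signal lines, and yield described in the abstract above can be illustrated with a minimal sketch. It is not the dissertation's model and will not reproduce its 4% and 71% figures: the function name `channel_yield`, the independent-defect assumption, and the per-wire yield of 0.999 are all hypothetical choices made for illustration.

```python
from math import comb


def channel_yield(data_wires: int, spare_wires: int, wire_yield: float) -> float:
    """Probability that a channel still has `data_wires` fault-free lines.

    Assumption (not from the source): wire defects are independent and each
    wire is fault-free with probability `wire_yield`.
    """
    total = data_wires + spare_wires
    fault_prob = 1.0 - wire_yield
    # The channel is usable when at most `spare_wires` lines are faulty,
    # because each faulty line can be replaced by one spare.
    return sum(
        comb(total, k) * fault_prob**k * wire_yield ** (total - k)
        for k in range(spare_wires + 1)
    )


if __name__ == "__main__":
    # Hypothetical per-wire yield; wider channels benefit more from spares.
    for width in (8, 128):
        base = channel_yield(width, 0, 0.999)
        repaired = channel_yield(width, 2, 0.999)
        print(f"{width}-bit channel: {base:.4f} without spares, {repaired:.4f} with 2 spares")
```

Under these assumptions a wide channel loses more yield than a narrow one when it has no spares, so it also gains more from a fixed number of spare lines, which is consistent with the trend the abstract reports.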
4	A Configurable Router for Embedded Network-on-Chip Support in Field-Programmable Gate Arrays Pau, Ronny 27 September 2008 (has links) The scaling of VLSI technology has allowed extensive integration of processing resources on a single chip. Consequently, programmable chips can offer high logic and memory capacity for the implementation of complex systems. Field-programmable gate arrays (FPGAs), with their embedded memory and other specialized functionality, have become viable alternatives in many cases to costly application-specific integrated circuits as a system-on-chip (SoC) substrate. However, on-chip bus-based interconnects are no longer suitable for complex SoC design because of their limited scalability. The network-on-chip (NoC) paradigm has therefore emerged as a scalable approach for addressing this challenge. FPGAs can also adopt the NoC paradigm in order to support more complex SoC implementations. The elements for NoC support can be implemented in conventional programmable logic within an FPGA; however, a dedicated approach for these NoC elements can lead to better performance and more efficient utilization of on-chip FPGA resources. A fixed network topology can be a disadvantage in NoC platforms due to misalignment with application requirements. It is therefore desirable to incorporate a certain level of configurability even for embedded NoC support within an FPGA. This thesis presents the design and implementation of a configurable router intended as a dedicated embedded module for NoC support in an FPGA. The goal is to provide a general NoC infrastructure for the FPGA platform that balances trade-offs with regard to logic complexity, resource utilization, and flexibility. The configurable router provides flexibility in implementing a variety of network topologies with the convenience of a 3-bit input to the router for configuration. 
All of the necessary routing functionality for each topology is implemented in logic for performance and area efficiency. The overall router design provides general NoC support with reduced complexity, thereby achieving area efficiency and an adequate clock frequency for typical operation in conjunction with embedded soft processors. Synthesis results are presented at the router level in order to characterize the hardware overhead for implementations in programmable logic as well as standard-cell technology, and at the system level in order to evaluate overall system resource utilization. Operational results are shown at the router level to demonstrate correctness and at the system level to demonstrate functionality of the multiprocessor systems that utilize the configurable router. / Thesis (Master, Electrical & Computer Engineering) -- Queen's University, 2008-09-24 23:24:01.907 FPGA multiprocessing network-on-chip
5	Efficient Design and Clocking for a Network-on-Chip Mandal, Ayan 03 October 2013 (has links) As VLSI fabrication technology scales, an increasing number of processing elements (cores) on a chip makes on-chip communication a new performance bottleneck. The Network-on-Chip (NoC) paradigm has emerged as an efficient and scalable infrastructure to handle the communication needs for such multi-core systems. In most existing NoCs, design decisions are made assuming that the NoC operates at the same or lower clock speed as the cores, which slows down the communication system. A major challenge in designing a high-speed NoC is the difficulty of distributing a high-speed, low-power clock across the chip. In this dissertation, we first propose several techniques to address the issue of distributing a high-speed, low-power, low-jitter clock across the IC. We primarily focus our attention on resonant standing wave oscillators (SWOs), which have recently emerged as a promising technique for high-speed, low-power clock generation. In addition, we also present a dynamic programming based approach to synthesize a low-jitter, low-power buffered H-tree for clock distribution. In the second part of this dissertation, we use these efficient clock distribution schemes to present a novel fast NoC design that relies on source-synchronous data transfer over a ring. In our source-synchronous design, the clock and data NoC are routed in parallel, yielding a fast, robust design. Architectural simulations on synthetic and real traffic show that our source-synchronous NoC designs can provide significantly lower latency while achieving the same or better bandwidth compared to a state-of-the-art mesh, while consuming less area. The fact that our ring-based NoC runs significantly faster than the mesh contributes to these improvements. Moreover, since our proposed NoC designs are fully synchronous, they are very amenable to testing as well. 
In the last part of this dissertation, we explore an alternate scheme of achieving high-speed on-chip data transfer using sinusoidal signals of different frequencies. The key advantage of our method is the ability to superimpose such sinusoids and thereby effectively send multiple logic values along the same wire in a clock cycle. Experimental results show that for the same throughput as that of a traditional scheme, we require significantly fewer wires. Network-on-Chip Clock VLSI
6	High Performance Interconnect System Design for Future Chip Multiprocessors Wang, Lei 03 October 2013 (has links) Chip Multi-Processor (CMP) architectures have become mainstream for designing processors. With a large number of cores, Network-On-Chip (NOC) provides a scalable communication method for CMP architectures. NOC must be carefully designed to meet constraints of power and area, and provide ultra-low latencies and high throughput. In this research, we explore different techniques to design high performance NOC. First, existing NOCs mostly use Dimension Order Routing (DOR) to determine the route taken by a packet in unicast traffic. However, with the development of diverse applications in CMPs, one-to-many (multicast) and one-to-all (broadcast) traffic are becoming more common. Current unicast routing cannot support multicast and broadcast traffic efficiently. We propose Recursive Partitioning Multicast (RPM) routing and a detailed multicast wormhole router design for NOCs. RPM allows routers to select intermediate replication nodes based on the global distribution of destination nodes. This provides more path diversity, which improves bandwidth efficiency and ultimately the performance of the whole network. Second, as feature size shrinks, wires are becoming an abundant resource in NOC. Since NOC can benefit from high wire density, with no limit on the number of pins and with faster signaling rates, it is critical for the NOC router design to fully utilize the wire resources to provide high performance. We propose an Adaptive Physical Channel Regulator (APCR) for NOC routers to exploit huge wiring resources. The flit size in an APCR router is less than the physical channel width (phit size) to provide finer granularity flow control. An APCR router allows flits from different packets or flows to share the same physical channel in a single cycle. 
The three regulation schemes (Monopolizing, Fair-sharing and Channel-stealing) intelligently allocate the output channel resources considering not only the availability of physical channels but the occupancy of input buffers. In an APCR router, each Virtual Channel can forward a dynamic number of flits every cycle depending on the run-time network status. Third, nanophotonics has been proposed to design low latency and high bandwidth NOC for future CMPs. Recent nanophotonic NOC designs adopt the token-based arbitration coupled with credit-based flow control, which leads to low bandwidth utilization. We propose two handshake schemes for nanophotonic interconnects in CMPs, Global Handshake (GHS) and Distributed Handshake (DHS), which get rid of the traditional credit-based flow control, reduce the average token waiting time, and finally improve the network throughput. Furthermore, we enhance the basic handshake schemes with setaside buffer and circulation techniques to overcome the Head-Of-Line (HOL) blocking. Network-On-Chip Chip Multiprocessors
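Dimension Order Routing, which the abstract above names as the usual unicast baseline, can be sketched for a 2D mesh as XY routing: a packet first travels along the X dimension until its column matches the destination, then along Y. The function name `xy_route` and the coordinate convention are assumptions made for illustration; the dissertation's RPM multicast scheme is not shown here.

```python
def xy_route(src, dst):
    """Return the list of mesh nodes visited from src to dst under XY routing.

    Nodes are (x, y) tuples on a 2D mesh (a hypothetical convention).
    """
    x, y = src
    dst_x, dst_y = dst
    path = [(x, y)]
    # Resolve the X dimension completely before touching Y.
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append((x, y))
    # Then resolve the Y dimension.
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

Because every packet between a given pair of nodes takes the same path, a one-to-many message sent as separate unicasts repeats shared hops, which is the inefficiency the abstract attributes to unicast routing.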
7	Test and fault-tolerance for network-on-chip infrastructures Grecu, Cristian 05 1900 (has links) The demands of future computing, as well as the challenges of nanometer-era VLSI design, will require new design techniques and design styles that are simultaneously high performance, energy-efficient, and robust to noise and process variation. One of the emerging problems concerns the communication mechanisms between the increasing number of blocks, or cores, that can be integrated onto a single chip. The bus-based systems and point-to-point interconnection strategies in use today cannot be easily scaled to accommodate the large numbers of cores projected in the near future. Network-on-chip (NoC) interconnect infrastructures are one of the key technologies that will enable the emergence of many-core processors and systems-on-chip with increased computing power and energy efficiency. This dissertation is focused on testing, yield improvement and fault-tolerance of such NoC infrastructures. A fast, efficient test method is developed for NoCs, that exploits their inherent parallelism to reduce the test time by transporting test data on multiple paths and testing multiple NoC components concurrently. The improvement of test time varies, depending on the NoC architecture and test transport protocol, from 2X to 34X, compared to current NoC test methods. This test mechanism is used subsequently to perform detection of NoC link permanent faults, which are then repaired by an on-chip mechanism that replaces the faulty signal lines with fault-free ones, thereby increasing the yield, while maintaining the same wire delay characteristics. The solution described in this dissertation improves significantly the achievable yield of NoC inter-switch channels, from a 4% improvement for an 8-bit wide channel to a 71% improvement for a 128-bit wide channel. The direct benefit is an improved fault-tolerance and increased yield and long-term reliability of NoC based multicore systems. 
/ Applied Science, Faculty of / Electrical and Computer Engineering, Department of / Graduate Network-on-chip Fault tolerance
8	Uma rede Ethernet on chip parametrizável para aplicações DSP em FPGA / An Ethernet network on configurable DSP chip for applications in FPGA Cunha Junior, Hélio Fernandes da 03 June 2015 (has links) With the accelerated growth in the complexity of applications and software that demand high performance, hardware and its architecture have undergone changes to meet that need. One of the approaches proposed and developed to support these applications was the integration of more than one processing core in a single integrated circuit. Initially, bus-based communication was chosen for its reuse advantage over point-to-point connections. However, with the rapid increase in the number of cores in Systems-on-Chip (SoC), this approach began to have problems supporting internal communication. An alternative that has been explored is the Network-on-Chip (NoC), an approach that proposes applying knowledge of conventional networks to the design of internal SoC communication. This work provides a complete NoC architecture that is configurable, parameterizable, and based on the Ethernet standard. The three basic NoC modules, Network Adapter (NA), Link, and Switch, are implemented and made available. The results were obtained using the Altera Stratix IV FPGA. The performance metrics used to validate the NoC are FPGA area and communication latency. The available parameters relate to the configuration of the developed modules, taking into account characteristics of DSP (Digital Signal Processing) applications. The experiment using two NAs, two cores, and one Switch required 7310 ALUTs of the EP4SGX230KF40C2ES FPGA, which corresponds to 4% of its logic resources. The time taken to transmit a 64-byte Ethernet frame was 422 clock cycles at a frequency of 50 MHz. DSP Ethernet FPGA Network-on-chip
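The timing figure at the end of the abstract above (422 clock cycles at 50 MHz for a 64-byte frame) can be converted to wall-clock time and an effective bit rate. The short sketch below does that arithmetic; the function names are hypothetical, and the derived throughput is an interpretation of the reported numbers for a single frame, not a result stated in the dissertation.

```python
def transfer_time_us(cycles: int, clock_hz: float) -> float:
    """Time in microseconds taken by `cycles` clock cycles at `clock_hz`."""
    return cycles / clock_hz * 1e6


def throughput_mbps(frame_bytes: int, cycles: int, clock_hz: float) -> float:
    """Effective bit rate in Mbit/s for one frame sent in `cycles` cycles."""
    seconds = cycles / clock_hz
    return frame_bytes * 8 / seconds / 1e6


if __name__ == "__main__":
    # Figures taken from the abstract: 64-byte frame, 422 cycles, 50 MHz.
    print(f"{transfer_time_us(422, 50e6):.2f} us")
    print(f"{throughput_mbps(64, 422, 50e6):.1f} Mbit/s")
```

With the reported figures this works out to 8.44 microseconds per frame, or roughly 60.7 Mbit/s for that single transfer.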
10	Synthetic Traffic Models that Capture Cache Coherent Behaviour Badr, Mario 24 June 2014 (has links) Modern and future many-core systems represent large and complex architectures. The communication fabrics in these large systems play an important role in their performance and power consumption. Current simulation methodologies for evaluating networks-on-chip (NoCs) are not keeping pace with the increased complexity of our systems; architects often want to explore many different design knobs quickly. Methodologies that trade off some accuracy for faster simulation times, while maintaining important workload trends, are highly beneficial at early stages of architectural exploration. We propose a synthetic traffic generation methodology that captures both application behaviour and cache coherence traffic to rapidly evaluate NoCs. This allows designers to quickly run detailed performance simulations without the cost of long-running full system simulation but still capture a full range of application and coherence behaviour. Our methodology has an average (geometric) error of 10.9% relative to full system simulation, and provides 50x speedup on average over full system simulation. Network on Chip Performance Modelling 0537
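The "average (geometric) error" quoted in the abstract above is presumably a geometric mean of per-workload relative errors. A minimal sketch of that calculation follows; the function name and the sample latencies are hypothetical and are not data from the thesis, which does not spell out its exact procedure here.

```python
from statistics import geometric_mean


def geometric_mean_error(reference, estimate):
    """Geometric mean of the relative errors of `estimate` against `reference`.

    Assumes every relative error is strictly positive, since the geometric
    mean is not meaningful for zero or negative values.
    """
    errors = [abs(e - r) / r for r, e in zip(reference, estimate)]
    return geometric_mean(errors)


if __name__ == "__main__":
    # Hypothetical per-workload latencies: full-system vs. synthetic traffic.
    full_system = [100.0, 100.0]
    synthetic = [101.0, 104.0]
    print(f"{geometric_mean_error(full_system, synthetic):.2%}")
```

A geometric mean is less dominated by a single badly mispredicted workload than an arithmetic mean, which is one common reason for reporting it across a benchmark suite.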
