Spelling suggestions: "subject:"On-Chip betworks"" "subject:"On-Chip conetworks""
1 |
Asynchronous Bypass Channels Improving Performance for Multi-synchronous Network-on-chipsJain, Tushar Naveen Kumar 2010 August 1900 (has links)
Dr. Paul V. Gratz Network-on-Chip (NoC) designs have emerged as a replacement for traditional shared-bus designs for on-chip communications. As with all current VLSI design, however, reducing power consumption in NoCs is a critical challenge. One approach to reduce power is to dynamically scale the voltage and frequency of each network node or groups of nodes (DVFS). Another approach to reduce power consumption is to replace the balanced clock tree with a globally-asynchronous, locally-synchronous (GALS) clocking scheme. NoCs implemented with either of these schemes, however, tend to have high latencies as packets must be synchronized at the intermediate nodes between source and destination. In this work, we propose a novel router microarchitecture which offers superior performance versus typical synchroniz- ing router designs. Our approach features Asynchronous Bypass Channels (ABCs) at intermediate nodes thus avoiding synchronization delay. We also propose a new network topology and routing algorithm that leverage the advantages of the bypass channel offered by our router design. Our experiments show that our design improves the performance of a conventional synchronizing design with similar resources by up to 26 percent at low loads and increases saturation throughput by up to 50 percent.
|
2 |
Fpga Implementation Of A Network-on-chipKilinc, Ismail Ozsel 01 September 2011 (has links) (PDF)
This thesis aims to design a Network-on-Chip (NoC) that performs wormhole flow control method and source routing and aims to describe the design in VHDL language and implement it on an FPGA platform. In order to satisfy the diverse needs of different network traffic, the thesis aims to design the NoC in such a way that it can be modified via a user interface, which changes the descriptions in the VHDL source code. Network topology, number of router ports, number of virtual channels, buffer size and flit size are the features of the designed NoC that can be modified. In this thesis, interfaces and operations of the blocks in the NoC are defined through block diagrams and algorithmic state machines. Verification of these blocks is performed not only on computer environment via simulations tools, but also in real world. To achieve this, source nodes generating dummy flits are also designed which communicate with our user interface via RS-232 generating flits according to the information provided by the user and monitoring the received flits from other source nodes in real-time.
|
3 |
GCA: Global Congestion Awareness for Load Balance in Networks-on-ChipRamakrishna, Mukund 2012 August 1900 (has links)
As modern CMPs scale to ever increasing core counts, Networks-on-Chip (NoCs) are emerging as an interconnection fabric, enabling communication between components. While NoCs are easy to implement and provide high and scalable bandwidth, current routing algorithms, such as dimension-ordered routing, suffer from poor load balance, leading to reduced throughput and high latencies. Improving load balance, hence, is critical in future CMP designs where increased latency leads to wasted power and energy waiting for outstanding requests to resolve. Adaptive routing is a known technique to improve load balance; however, prior adaptive routing techniques either use local, myopic information or misinformed, regionally-aggregated information to form their routing decisions. This thesis proposes a new, light-weight, adaptive routing algorithm for on-chip routers based on global link state and congestion information, Global Congestion Awareness (GCA). GCA leverages unused bits in existing packet header flits to "piggyback" congestion state information around the network and uses a simple, low-complexity route calculation unit, to calculate optimal packet paths to their destination without the myopia of local decisions, nor the aggregation of unrelated status information, found in prior designs. In particular GCA outperforms local adaptive routing by up to 82%, Regional Congestion Awareness (RCA) by up to 51%, and a recent competing adaptive routing algorithm, DAR, by 8% on average on realistic workloads.
|
4 |
Floorplan-Aware High Performance NoC DesignRoca Pérez, Antoni 20 November 2012 (has links)
Las actuales arquitecturas de m�ltiples n�cleos como los chip multiprocesadores (CMP) y soluciones multiprocesador para sistemas dentro del chip (MPSoCs) han adoptado a las redes dentro del chip (NoC) como elemento -ptimo para la inter-conexi-n de los diversos elementos de dichos sistemas. En este sentido, fabricantes de CMPs y MPSoCs han adoptado NoCs sencillas, generalmente con una topolog'a en malla o anillo, ya que son suficientes para satisfacer las necesidades de los sistemas actuales. Sin embargo a medida que los requerimientos del sistema -- baja latencia y alto rendimiento -- se hacen m�s exigentes, estas redes tan simples dejan de ser una soluci-n real. As', la comunidad investigadora ha propuesto y analizado NoCs m�s complejas. No obstante, estas soluciones son m�s dif'ciles de implementar -- especialmente los enlaces largos -- haciendo que este tipo de topolog'as complejas sean demasiado costosas o incluso inviables.
En esta tesis, presentamos una metodolog'a de dise-o que minimiza la p�rdida de prestaciones de la red debido a su implementaci-n real. Los principales problemas que se encuentran al implementar una NoC son los conmutadores y los enlaces largos. En esta tesis, el conmutador se ha hecho modular, es decir, formado como uni-n de m-dulos m�s peque-os. En nuestro caso, los m-dulos son id�nticos, donde cada m-dulo es capaz de arbitrar, conmutar, y almacenar los mensajes que le llegan. Posteriormente, flexibilizamos la colocaci-n de estos m-dulos en el chip, permitiendo que m-dulos de un mismo conmutador est�n distribuidos por el chip.
Esta metodolog'a de dise-o la hemos aplicado a diferentes escenarios. Primeramente, hemos introducido nuestro conmutador modular en NoCs con topolog'as conocidas como la malla 2D. Los resultados muestran como la modularidad y la distribuci-n del conmutador reducen la latencia y el consumo de potencia de la red.
En segundo lugar, hemos utilizado nuestra metodolog'a de dise-o para implementar un crossbar distribuid / Roca Pérez, A. (2012). Floorplan-Aware High Performance NoC Design [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/17844
|
5 |
B-RPM: An Efficient One-to-Many Communication Framework for On-Chip NetworksShaukat, Noman 2012 August 1900 (has links)
The prevalence of multicore architectures has accentuated the need for scalable on-chip communication media. Various parallel applications and programming paradigms use a mix of unicast (one-to-one) and multicast (one-to-many) to maintain data coherence and consistency. Providing efficient support for these communication patterns becomes a critical design point for on-chip networks (OCN). High performance on-chip networks design advocates balanced traffic across the whole network, which makes adaptive routing appealing. Adaptive routing explores the path diversity of the network, increases throughput, and reduces network latency compared with oblivious routing.
In this work, we propose an adaptive multicast routing, Balanced Recursive Partitioning Multicast (B-RPM), to achieve balanced one-to-many on-chip communication. The algorithm derives its functionality from previously proposed algorithm Recursive Partitioning Multicast (RPM). Unlike RPM which uses fixed set of directional priorities and position of destination nodes, B-RPM replicates packet based on the local congestion information and position of destination nodes with respect to current node. B-RPM employs a new deadlock avoidance technique Dynamically Sized Virtual Networks (DSVN). Built upon the traditional virtual networks, DSVN dynamically allocates the network resources to different VNs according to the run-time traffic status, which delivers better resources utilization. We also propose a new scheme for representing multiple destinations in packet head. The scheme works simply by differentiating multicast and unicast packets. The algorithm combined with dynamically sized virtual networks enables us to improve network performance at high load on average by 20% (up to 50%) and saturation throughput of network on average by 10% (up to 18%) over the most recent multicast algorithm. Also the new header representation scheme enables us to save 24% of dynamic link power.
|
6 |
Exploring trade-offs between Latency and Throughput in the Nostrum Network on ChipNilsson, Erland January 2006 (has links)
<p>During the past years has the Nostrum Network on Chip <i>(NoC)</i> been developed to become a competitive platform for network based on-chip communication. The Nostrum NoC provides a versatile communication platform to connect a large number of intellectual properties <i>(IP) </i>on a single chip. The communication is based on a packet switched network which aims for a small physical footprint while still providing a low communication overhead. To reduce the communication network size, a queue-less network has been the research focus. This network uses de ective hot-potato routing which is implemented to perform routing decisions in a single clock cycle.</p><p>Using a platform like this results in increased design reusability, validated signal integrity, and well developed test strategies, in contrast to a fully customised designs which can have a more optimal communication structure but has a significantly longer development cycle to verify the new design accordingly.</p><p>Several factors are considered when designing a communication platform. The goal is to create a platform which provides low communication latency, high throughput, low power consumption, small footprint, and low design, verification, and test overhead. Proximity Congestion Awareness is one technique that serves to reduce</p><p>the network load. This leads to that the latency is reduced which also increases the network throughput. Another technique is to implement low latency paths called<i> Data Motorways</i> achieved through a clocking method called Globally Pseudochronous Locally Synchronous clocking. Furthermore, virtual circuits can be used to provide guarantees on latency and throughput. Such guarantees are dificult in</p><p>hot-potato networks since network access has to be ensured. A technique that implements virtual circuits use looped containers that are circulating on a predefined circuit. Several overlapping virtual circuits are possible by allocating the virtual circuits in different Temporally Disjoint Networks.</p><p>This thesis summarise the impact the presented techniques and methods have on the characteristics on the Nostrum model. It is possible to reduce the network load by a factor of 20 which reduces the communication latency. This is done by distributing load information between the Switches in the network. Data Motorways</p><p>can reduce the communication latency with up to 50% in heavily loaded networks. Such latency reduction results in freed buffer space in the Switch registers which allows the traffic rate to be increased with about 30%.</p>
|
7 |
Exploring trade-offs between Latency and Throughput in the Nostrum Network on ChipNilsson, Erland January 2006 (has links)
During the past years has the Nostrum Network on Chip (NoC) been developed to become a competitive platform for network based on-chip communication. The Nostrum NoC provides a versatile communication platform to connect a large number of intellectual properties (IP) on a single chip. The communication is based on a packet switched network which aims for a small physical footprint while still providing a low communication overhead. To reduce the communication network size, a queue-less network has been the research focus. This network uses de ective hot-potato routing which is implemented to perform routing decisions in a single clock cycle. Using a platform like this results in increased design reusability, validated signal integrity, and well developed test strategies, in contrast to a fully customised designs which can have a more optimal communication structure but has a significantly longer development cycle to verify the new design accordingly. Several factors are considered when designing a communication platform. The goal is to create a platform which provides low communication latency, high throughput, low power consumption, small footprint, and low design, verification, and test overhead. Proximity Congestion Awareness is one technique that serves to reduce the network load. This leads to that the latency is reduced which also increases the network throughput. Another technique is to implement low latency paths called Data Motorways achieved through a clocking method called Globally Pseudochronous Locally Synchronous clocking. Furthermore, virtual circuits can be used to provide guarantees on latency and throughput. Such guarantees are dificult in hot-potato networks since network access has to be ensured. A technique that implements virtual circuits use looped containers that are circulating on a predefined circuit. Several overlapping virtual circuits are possible by allocating the virtual circuits in different Temporally Disjoint Networks. This thesis summarise the impact the presented techniques and methods have on the characteristics on the Nostrum model. It is possible to reduce the network load by a factor of 20 which reduces the communication latency. This is done by distributing load information between the Switches in the network. Data Motorways can reduce the communication latency with up to 50% in heavily loaded networks. Such latency reduction results in freed buffer space in the Switch registers which allows the traffic rate to be increased with about 30%. / QC 20101122
|
Page generated in 0.2288 seconds