Global ETD Search

11	An Exploration of On-chip Network-based Thread Migration Matthew, Misler 12 January 2011 (has links) As the number of cores integrated on a single chip continues to increase, communication has the potential to become a severe bottleneck to overall system performance. The presence of thread sharing and the distribution of data across cache banks on the chip can result in long distance communication. Long distance communication incurs substantial latency that impacts performance; furthermore, this communication consumes significant dynamic power when packets are switched over many Network-on-Chip (NoC) links and routers. Thread migration can mitigate problems created by long distance communication. This thesis presents Moths, which stands for Mobile Threads. Moths is an efficient run-time algorithm that responds automatically to dynamic NoC traffic patterns, providing beneficial thread migration to decrease overall traffic volume and average packet latency. Moths reduces latency by up to 28.4% (18.0% on average) and traffic volume by up to 24.9% (20.6% on average) across a variety of commercial and scientific benchmarks. Network on Chip Thread migration 0544
13	Synthetic Traffic Models that Capture Cache Coherent Behaviour Badr, Mario 24 June 2014 (has links) Modern and future many-core systems represent large and complex architectures. The communication fabrics in these large systems play an important role in their performance and power consumption. Current simulation methodologies for evaluating networks-on-chip (NoCs) are not keeping pace with the increased complexity of our systems; architects often want to explore many different design knobs quickly. Methodologies that trade off some accuracy for faster simulation times while maintaining important workload trends are highly beneficial at early stages of architectural exploration. We propose a synthetic traffic generation methodology that captures both application behaviour and cache coherence traffic to rapidly evaluate NoCs. This allows designers to quickly run detailed performance simulations without the cost of long-running full system simulation while still capturing a full range of application and coherence behaviour. Our methodology has an average (geometric) error of 10.9% relative to full system simulation, and provides 50x speedup on average over full system simulation. Network on Chip Performance Modelling 0537
14	Uma abordagem meta-heurística para o mapeamento de tarefas em uma plataforma MPSoC baseada em NoC [A meta-heuristic approach to task mapping on a NoC-based MPSoC platform] FARIAS, Max Santana Rolemberg 31 January 2014 (has links) CNPq, FACEPE / The growing number of tasks running on Multiprocessor System-on-Chip (MPSoC) platforms demands more and more processors, and MPSoC platforms that use the traditional communication medium (a bus) have limited bandwidth and do not scale to high-performance designs. Network-on-Chip (NoC)-based MPSoCs were proposed to address these limitations. One of the main problems in NoC-based MPSoC platforms is communication cost, since this cost depends on how tasks are mapped to processors. This work presents an approach that uses a meta-heuristic to assign a set of tasks to a set of Processing Elements (PEs) in a NoC-based MPSoC. The proposed approach evaluates and optimizes the mapping of application tasks and, in some experiments, the results were compared with the worst and best mappings in the design space. The experimental results show an average energy reduction of 47% when the task clustering mechanism is used and 44% when the clustering mechanism is turned off. Network on Chip MPSoC Task Mapping
15	Network on Chip : Performance Bound and Tightness Zhao, Xueqian January 2015 (has links) Featuring good scalability, modularity and large bandwidth, Network-on-Chip (NoC) has been widely applied in many-core Chip Multiprocessor (CMP) and Multiprocessor System-on-Chip (MPSoC) architectures. The provision of guaranteed service emerges as an important NoC design problem due to application requirements for Quality-of-Service (QoS). Formal analysis of performance bounds plays a critical role in ensuring guaranteed service in NoCs by giving insights into how the design parameters impact network performance. This thesis proposes analysis methods for delay and backlog bounds with Network Calculus (NC). Based on xMAS (eXecutable Micro-Architectural Specification), a formal framework to model communication fabrics, the delay bound analysis procedure is presented using NC. The micro-architectural xMAS representation of a canonical on-chip router is proposed with both the data flow and control flow well captured. Furthermore, a well-defined xMAS model for a specific application on an NoC can be created with network and flow knowledge and then be mapped to a corresponding NC analysis model for end-to-end delay bound calculation. The xMAS model effectively bridges the gap between the informal NoC micro-architecture and the formal analysis model. Besides the delay bound, the analysis of the backlog bound is also crucial for predicting the buffer dimensioning boundary in on-chip Virtual Channel (VC) routers. In this thesis, basic buffer use cases are identified with corresponding analysis models proposed so as to decompose the complex flow contention in a network. Then we develop a topology-independent analysis technique to conduct the backlog bound analysis step by step. Algorithms are developed to automate this analysis procedure. Accompanying the analysis of performance bounds, tightness evaluation is an essential step to ensure the validity of the analysis models. However, this evaluation process is often a tedious, time-consuming, and manual simulation process in which many simulation parameters may have to be configured before the simulations run. In this thesis, we develop a heuristics-aided tightness evaluation method for the analytical delay and backlog bounds. The tightness evaluation is abstracted as constrained optimization problems with the objectives formulated as implicit functions with respect to the system parameters. Based on the well-defined problems, heuristics can be applied to guide a fully automated configuration searching process which incorporates cycle-accurate bit-accurate simulations. As an example of heuristics, Adaptive Simulated Annealing (ASA) is adopted to guide the search in the configuration space. Experiment results indicate that the performance analysis models based on NC give tight results which are effectively found by the heuristics-aided evaluation process even when the model has a multidimensional discrete search space and complex constraints. In order to facilitate xMAS modeling and corresponding validation of the performance analysis models, the thesis presents an xMAS tool developed in Simulink. It provides a friendly graphical interface for xMAS modeling and parameter configuring based on the powerful Simulink modeling environment. Hierarchical model build-up and Verilog-HDL code generation are supported to manage complex models and conduct simulations. Owing to its synthesizable xMAS library and good extensibility, this xMAS tool has promising use in application-specific NoC design based on the xMAS components. Network-on-Chip Performance analysis Network Calculus
16	Flit Synchronous Aelite Network on Chip Subburaman, Mahesh Balaji January 2008 (has links) Deep sub-micron process technology and application convergence increase the design challenges in System-on-Chip (SoC). Traditional bus-based on-chip communication is not scalable and fails to deliver the performance requirements of complex SoCs. The Network on Chip (NoC) has emerged as a solution to address these complexities of an efficient, high-performance, scalable SoC design. The Aethereal NoC provides latency and throughput bounds through a pipelined time-division multiplexed (TDM) circuit switching architecture. A global synchronous clock defines the timing for TDM, which is not beneficial for decreasing process geometry and increasing clock frequency. This thesis work focuses on the Aelite NoC architecture. The Aelite NoC offering guaranteed services exploits the complexities of System-on-Chip design with real-time requirements. The Aelite NoC implements flit synchronous communication using mesochronous and asynchronous links. Network on Chip Mesochronous GALS Electrical engineering Elektroteknik
17	Quality-of-service for Network-on-chip-based Smartphone/Tablet Systems-on-chip Feng, Kai 22 November 2012 (has links) Smartphone/tablet Systems-on-Chip (SoCs) integrate an increasing number of components to offer more functionality. The capacity and efficiency of data communication between memory and other hardware blocks have become a major concern in SoC design. To address this concern, we propose to use Network-on-Chip (NoC) architectures to meet high bandwidth, low power, and low area demands. We propose a Quality-of-Service (QoS) scheme to differentially provision network resources to cater to the different performance requirements of different hardware blocks. Implementation and evaluation are performed on a simulation infrastructure we construct specifically for this type of SoC. We demonstrate, via simulation results, that the proposed Dynamic QoS schemes can achieve better bandwidth provisioning, with good area and power efficiencies. Quality-of-Service Network-on-Chip Smartphone SoC 0544
19	Throughput-Efficient Network-on-Chip Router Design with STT-MRAM Narayana, Sagar 1986- 14 March 2013 (has links) As the number of processor cores on a chip increases with the advance of CMOS technology, there has been a growing need for more efficient Network-on-Chip (NoC) design, since communication delay has become a major bottleneck in large-scale multicore systems. In designing efficient input buffers of NoC routers for better performance and power efficiency, Spin-Torque Transfer Magnetic RAM (STT-MRAM) is regarded as a promising solution due to its high density and near-zero leakage power. Previous work that adopts STT-MRAM in designing NoC router input buffers shows a limitation in minimizing the overhead of power consumption, even though it succeeds to some degree in achieving high network throughput by using SRAM to hide the long write latency of STT-MRAM. In this thesis, we propose a novel input buffer design that depends solely on STT-MRAM, without the need for SRAM, to maximize the benefits of low leakage power and area efficiency inherent in STT-MRAM. In addition, we introduce power-efficient buffer refreshing schemes synergized with age-based switch arbitration that gives higher priority to older flits to remove unnecessary refreshing operations. On average, we observed throughput improvements of 16% on synthetic workloads and benchmarks. input buffer router STT-MRAM Network-on-Chip

Search results