11

PERFORMANCE EVALUATION OF A MULTI-CLOCK NoC ON FPGA

SWAMINATHAN, VIJAY 08 October 2007
No description available.
12

Hybrid Nanophotonic NOC Design for GPGPU

Yuan, Wen May 2012
Owing to their massive computational power, Graphics Processing Units (GPUs) have become a popular platform for executing general-purpose parallel applications. The majority of on-chip communication in a GPU architecture occurs between memory controllers and compute cores, so memory controllers become hot spots and bottlenecks when conventional mesh interconnection networks are used. Leveraging this observation, we reduce network latency and improve throughput by providing a nanophotonic ring network that connects all memory controllers. This new interconnection network employs a new routing algorithm that combines Dimension Ordered Routing (DOR) and nanophotonic ring algorithms. With this new topology, we reduce interconnection network latency by 17% on average (up to 32%) and improve IPC by 5% on average (up to 11.5%). We also analyze the application characteristics of six CUDA benchmarks on the GPGPU-Sim simulator to gain a better perspective for designing high-performance GPU interconnection networks.
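As a rough illustration of the Dimension Ordered Routing (DOR) component named in this abstract, the sketch below routes a packet on a 2D mesh by first correcting the X coordinate and then the Y coordinate. It is a minimal sketch of plain XY routing only; the function name `dor_xy_route` and the coordinate convention are assumptions for illustration, and the thesis's combined DOR and nanophotonic ring algorithm is not reproduced here.

```python
def dor_xy_route(src, dst):
    """Return the (x, y) nodes visited from src to dst on a 2D mesh.

    Hypothetical helper: travels along X until the column matches,
    then along Y, which is the usual definition of XY
    dimension-ordered routing.
    """
    x, y = src
    path = [(x, y)]
    # Phase 1: move along the X dimension only.
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    # Phase 2: move along the Y dimension only.
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    route = dor_xy_route((0, 0), (2, 1))
    print(route)  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

Because the dimension order is fixed, the hop count always equals the Manhattan distance between source and destination, and every packet between the same pair of nodes follows the same path.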
13

Networks-on-chip: modeling, analysis, and design methodologies.

El Miligi, Haytham 19 October 2011
The growing complexity of System-on-Chip (SoC) designs motivates both academic and industrial researchers to find better solutions for the complexity of the chip interconnect. For SoC designs that have hundreds of Processing Elements (PEs), a single shared bus can no longer be accepted as an efficient communication scheme. To address this problem, the Networks-on-Chip (NoC) concept is proposed as a new paradigm that provides an integrated solution for achieving an efficient interconnection scheme for complex SoC applications. NoC-based designs are composed of computational resources in the form of PE cores, and switching nodes (routers) that allow PEs to communicate with each other. For different applications, this research work: 1) proposes new analytical models for various NoC design parameters, 2) performs comparative analyses of the commonly used network architectures, and 3) presents novel methodologies for efficiently designing the NoC topology. The proposed methodologies are developed to help NoC designers better achieve minimum power consumption and delay, and maximum performability, for their applications. Graph-theoretic concepts are adopted to study the topological architecture of NoCs and to propose new topology-based models for network power, performability, and delay. The proposed models take into consideration important design parameters that significantly affect the power, performability, and delay of a NoC-based system, such as network topology architecture, traffic distribution, noise power, voltage swing, probability of edge failure, router design and number of ports, clock frequency, and target technology. In this dissertation, we show how the proposed models can be used to optimally design the network topology so that it achieves the target design requirement for a given application. After studying each design metric individually, a joint consideration of NoC power, performability, and delay is carried out. 
We use Particle Swarm Optimization (PSO) to find the optimum network topology, that achieves minimum delay, maximum performability, and minimum power consumption, for a given NoC application. Real case studies are presented to validate the proposed theoretical concepts. This validation is carried out through experimental work, targeting various real NoC applications. Experimental results show that using the proposed design methodologies, designers can improve the overall system efficiency in terms of power, delay, and performability, by choosing the design parameters (i.e., network topology architecture, PEs’ mapping, etc.) efficiently at early design phases. This improvement is measured in some cases by an order of magnitude, compared to the worst case scenario of choosing wrong design parameters for the target application. / Graduate
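For readers unfamiliar with Particle Swarm Optimization, the following is a minimal generic sketch of the technique, not the dissertation's formulation. The continuous search space, the toy cost function, and all names and parameter values (`pso_minimize`, `w`, `c1`, `c2`) are assumptions for illustration; the dissertation optimizes NoC topologies against its own power, delay, and performability models, which are not reproduced here.

```python
import random


def pso_minimize(cost, dim, bounds, n_particles=20, iters=100, seed=0,
                 w=0.7, c1=1.5, c2=1.5):
    """Generic PSO: return (best_position, best_cost) found for `cost`."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]           # per-particle best position
    best_cost = [cost(p) for p in pos]       # per-particle best cost
    g = min(range(n_particles), key=lambda i: best_cost[i])
    g_pos, g_cost = best_pos[g][:], best_cost[g]  # swarm-wide best

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward own best + pull toward swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Clamp the new position to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < best_cost[i]:
                best_cost[i], best_pos[i] = c, pos[i][:]
                if c < g_cost:
                    g_cost, g_pos = c, pos[i][:]
    return g_pos, g_cost


def toy_cost(p):
    """Hypothetical stand-in for a combined design metric (minimum at (1, -2))."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


if __name__ == "__main__":
    position, value = pso_minimize(toy_cost, dim=2, bounds=(-5.0, 5.0))
    print(position, value)
```

A topology search would replace `toy_cost` with a model-based evaluation of each candidate and would need a discrete encoding of the topology, which this continuous sketch does not attempt.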
14

Spatial parallelism in the routers of asynchronous on-chip networks

Song, Wei January 2011
State-of-the-art multi-processor systems-on-chip use on-chip networks as their communication fabric. Although most on-chip networks are implemented synchronously, asynchronous on-chip networks have several advantages over their synchronous counterparts. Time division multiplexing (TDM) flow control methods have been utilized extensively in asynchronous on-chip networks. The synchronization required by TDM leads to significant speed penalties. Compared with TDM methods, spatial parallelism methods, such as the spatial division multiplexing (SDM) flow control method, achieve better network throughput with less area overhead. This thesis proposes several techniques to increase spatial parallelism in the routers of asynchronous on-chip networks. Channel slicing is a new pipeline structure that alleviates the speed penalty by removing the synchronization among bit-level data pipelines. It is also found that the lookahead pipeline using early evaluated acknowledgement can be used in routers to further improve speed. SDM is a new flow control method proposed for asynchronous on-chip networks. It improves network throughput without introducing synchronization among buffers of different frames, which is required by TDM methods. It is also found that the area overhead of SDM is smaller than that of the virtual channel (VC) flow control method, the most widely used TDM method. The major design problem of SDM is its area-consuming crossbars. A novel 2-stage Clos switch structure is proposed to replace the crossbar in SDM routers, which significantly reduces the area overhead. This Clos switch is dynamically reconfigured by a new asynchronous Clos scheduler. Several asynchronous SDM routers are implemented using these new techniques. An asynchronous VC router is also reproduced for comparison. Performance analyses show that the SDM routers outperform the VC router in throughput, area overhead and energy efficiency.
15

Réseaux embarqués sur puce reconfigurable dynamiquement et sûrs de fonctionnement / Reliable and dynamically reconfigurable network-on-chip

Killian, Cédric 05 December 2012
The performance requirements of embedded Systems-on-Chip (SoCs) are constantly increasing to meet the demands of ever more complex applications, and new processing architectures and computing paradigms have emerged. The integration of dozens or even hundreds of computing and processing elements within a single chip has given birth to Multi-Processor Systems-on-Chip (MPSoCs), which offer a high level of parallel processing. Today, the performance of these systems relies on the communication medium between the interconnected processing elements. Providing high bandwidth and flexibility in this communication medium is essential in order to use the parallel processing capacity of the MPSoC efficiently. In this context, Networks-on-Chip (NoCs) have been developed with the aim of interconnecting a large number of elements in the same device while maintaining a tradeoff between performance and logic resources. Moreover, the emergence of partially reconfigurable FPGA technology allows MPSoCs to adapt their elements during operation in order to meet system requirements.
Given the increasing complexity of electronic systems and the shrinking size of devices, the sensitivity of chips to fault-generating phenomena has increased. Therefore, to design efficient and reliable SoCs, new error detection and localization techniques must be proposed for dynamic NoCs, where the main difficulty is identifying and distinguishing between real errors and the adaptive behavior of the NoC. In this context, we present new mechanisms and architectural solutions that check the correctness of dynamic NoCs during system operation in order to locate and isolate faulty components efficiently, avoiding a failure of the system.
16

Méthodologies de conception pour multiprocesseurs sur circuits logiques programmables / Design methodologies for multiprocessors on programmable logic circuits

Benmouhoub, Riad 07 May 2007
The continuous increase in integration capacity on the one hand, and the growing complexity of embedded applications on the other, have led to systems-on-chip (SoC) and then to multiprocessor systems-on-chip (MPSoC). The fundamental problem associated with these large systems-on-chip is that of design methodologies and the resulting productivity crisis, which prevents these circuits from being exploited efficiently. This productivity crisis is the result of ad hoc, manual approaches to design, whereas the problem should be posed as a multi-objective optimization problem whose solution calls for automatic optimization techniques. In this thesis, we present a design methodology for multiprocessor systems on programmable logic circuits, whose originality lies in three aspects: (1) multi-objective evolutionary exploration of the design space in order to conduct an intelligent search, (2) the use of large programmable logic circuits for fast evaluation by emulation, which is far superior to simulation, and (3) the use of MPSoC synthesis from a high-level parallel programming language (Occam) and the inclusion of on-chip monitoring. Case studies on circuits have demonstrated the effectiveness of such a methodology in solving the design productivity crisis.
17

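Méthodologies de synthèse de réseaux de neurones pour applications de traitement de signal adaptatif et implémentation sur circuits reconfigurables dynamiquement / Neural network synthesis methodologies for adaptive signal processing applications and implementation on dynamically reconfigurable circuits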

Chtourou, Sofien 04 June 2007
Advances in design techniques and semiconductor technology have made it possible to integrate embedded systems of increasing complexity on a single chip: systems-on-chip (System on Chip - SOC). The pillars of the strategy adopted to achieve this are: (1) the reuse of components (IP), (2) the use of platforms, and (3) abstraction. Together, these techniques make it possible to design complete systems that can meet the needs of complex, deterministic applications. The situation changes if the target applications differ in their run-time behavior in terms of resource usage and if, in addition, each application itself varies at run time. This workload variability runs counter to current methodologies, which assume that all information about the applications is known statically and that all hardware-software partitioning, resource allocation, and scheduling decisions are therefore static as well. In this thesis we propose a new approach to the design of systems-on-chip with variable workload. The problem is posed as an adaptive control problem with a dynamic workload predictor. The first part of the thesis focuses on the automatic extraction of the characteristics that favor introducing an adaptive aspect into an application's architecture, and on solving the problem of predicting the large time series that result from capturing workload variability. We present recurrent neural networks, known as universal approximators capable of modeling a nonlinear dynamic phenomenon, and apply them to dynamic workload prediction in systems-on-chip.
With the fundamental theoretical aspects established, in the second part of the thesis we evaluate the area cost of implementing a predictor through automatic multi-objective exploration, then evaluate its performance in an SOC platform modeled in SystemC TLM. This work was validated on an industrial JPEG-2000 multimedia application. The result is processing that adapts the number of resources used, leading to more efficient use of the circuit and better performance compared with a fixed architecture.
18

A verilog-hdl implementation of virtual channels in a network-on-chip router

Park, Sungho 15 May 2009
As feature sizes continue to decrease and integration density increases, interconnections have become a dominating factor in determining the overall quality of a chip. Because of its limited scalability, the system bus can support only a limited number of functional units and cannot meet the requirements of current System-on-Chip (SoC) implementations. Long global wires also cause many design problems, such as routing congestion, noise coupling, and difficult timing closure. Network-on-Chip (NoC) architectures have been proposed as an alternative that solves these problems by using a packet-based communication network. The processing elements (PEs) communicate with each other by exchanging messages over the network, and these messages go through buffers in each router. Buffers are one of the major resources used by the routers in virtual channel flow control. In this thesis, we analyze two kinds of buffer allocation approaches: static and dynamic. These approaches aim to increase throughput and minimize latency by means of virtual channel flow control. In a statically allocated buffer architecture, size and organization are design-time decisions and thus do not perform optimally for all traffic conditions. In addition, statically allocated virtual channels waste area and consume significant leakage power. The dynamic buffer allocation scheme, by contrast, claims that buffer utilization can be increased using dynamic virtual channels. The dynamic virtual channel regulator (ViChaR) has been proposed; it uses a centralized buffer architecture that dynamically allocates virtual channels and buffer slots in real time depending on traffic conditions. ViChaR's dynamic buffer management scheme increases buffer utilization, but it also increases design complexity. In this research, we reexamine the performance, power consumption, and area of ViChaR's buffer architecture through implementation.
We implement a generic router and a ViChaR architecture using Verilog-HDL. The RTL code is verified by dynamic simulation and synthesized with Design Compiler to obtain area and power consumption. In addition, we obtain latency through Static Timing Analysis. The results show that ViChaR's dynamic buffer management scheme increases latency and power consumption significantly, even though it can increase buffer utilization. Therefore, a novel design is needed to achieve high buffer utilization without such losses.
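The contrast between static and dynamic buffer allocation described in this abstract can be sketched with a small behavioral model. This is only an illustrative sketch under stated assumptions: the class names `StaticVCBuffer` and `SharedPoolBuffer`, the slot counts, and the single-input-port view are hypothetical, and the code models neither the thesis's Verilog-HDL implementation nor ViChaR's actual control logic.

```python
class StaticVCBuffer:
    """Each virtual channel owns a fixed number of slots chosen at design time."""

    def __init__(self, num_vcs, depth):
        self.depth = depth
        self.queues = [[] for _ in range(num_vcs)]

    def push(self, vc, flit):
        # A full VC rejects the flit even if other VCs are empty.
        if len(self.queues[vc]) >= self.depth:
            return False
        self.queues[vc].append(flit)
        return True


class SharedPoolBuffer:
    """All slots sit in one pool and are handed to VCs on demand."""

    def __init__(self, total_slots):
        self.free = total_slots
        self.queues = {}

    def push(self, vc, flit):
        # Only a completely exhausted pool rejects the flit.
        if self.free == 0:
            return False
        self.queues.setdefault(vc, []).append(flit)
        self.free -= 1
        return True


def accepted(buffer, vc, num_flits):
    """Count how many of num_flits sent to one VC are accepted."""
    return sum(buffer.push(vc, f) for f in range(num_flits))


if __name__ == "__main__":
    # Same total storage (16 slots), traffic skewed onto a single VC.
    print(accepted(StaticVCBuffer(num_vcs=4, depth=4), vc=0, num_flits=10))  # 4
    print(accepted(SharedPoolBuffer(total_slots=16), vc=0, num_flits=10))    # 10
```

With identical total storage, the shared pool absorbs the skewed burst while the static partition leaves twelve slots idle, which is the utilization gain the abstract refers to. The model says nothing about the latency, power, and complexity costs that the thesis measures.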
19

Efficient Lookahead Routing and Header Compression For Multicasting in Networks-On-Chip

Kumar, Poornachandran August 2010
With advancing technology, Chip Multi-processor (CMP) architectures have emerged as a viable solution for designing processors. Networks-On-Chip (NOCs) provide a scalable communication method for CMP architectures with increasing numbers of cores. Although there has been significant research on NOC designs for unicast traffic, research on multicast router design is still in its infancy. Considering that one-to-many (multicast) and one-to-all (broadcast) traffic are more common in CMP applications, it is important to design a router that provides efficient multicasting. In this thesis, a lookahead multicast routing algorithm with limited area overhead is proposed. This lookahead algorithm reduces network latency by removing the need for a separate routing computation (RC) stage. An efficient area optimization technique is put forward to achieve minimal area overhead for the lookahead RC stage. Also, a novel compression scheme is proposed for multicast packet headers to alleviate their large overhead in large networks. Comprehensive simulation results show that with the new route computation logic design and area overhead optimization, providing lookahead routing in the multicast router costs less than 20 percent area overhead, and this percentage keeps decreasing with larger network sizes. Compared with the basic lookahead routing design, our design can save over 50 percent of the area. With header compression and lookahead multicast routing, network performance can be improved by 22 percent on average for a (16 x 16) network.
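The idea behind lookahead routing, in which each router computes the output port that the next router will need so that the next router can skip its own routing computation stage, can be sketched for the simple unicast case. This is a minimal sketch under assumptions: it uses XY routing on a 2D mesh, and the names `xy_port`, `lookahead_port`, and `traverse` are hypothetical. The thesis's multicast algorithm, area optimization, and header compression are not reproduced.

```python
# Offsets for each output port on a 2D mesh (assumed convention).
STEP = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1), "LOCAL": (0, 0)}


def xy_port(cur, dst):
    """Output port chosen by XY routing at node `cur` for destination `dst`."""
    if dst[0] > cur[0]:
        return "E"
    if dst[0] < cur[0]:
        return "W"
    if dst[1] > cur[1]:
        return "N"
    if dst[1] < cur[1]:
        return "S"
    return "LOCAL"


def lookahead_port(cur, port_here, dst):
    """At `cur`, compute the port the NEXT router on the path will use."""
    dx, dy = STEP[port_here]
    nxt = (cur[0] + dx, cur[1] + dy)
    return xy_port(nxt, dst)


def traverse(src, dst):
    """Follow a packet whose header carries the precomputed output port."""
    cur = src
    port = xy_port(src, dst)  # computed once at injection
    ports_used = []
    while port != "LOCAL":
        # The router uses the carried port immediately and, in parallel,
        # prepares the port for the downstream router.
        next_port = lookahead_port(cur, port, dst)
        ports_used.append(port)
        dx, dy = STEP[port]
        cur = (cur[0] + dx, cur[1] + dy)
        port = next_port
    return cur, ports_used


if __name__ == "__main__":
    print(traverse((0, 0), (2, 1)))  # ((2, 1), ['E', 'E', 'N'])
```

The packet takes the same path as with ordinary XY routing; the difference is only when each routing decision is computed, which is what lets a lookahead router overlap route computation with other pipeline work.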
20

CoNoC: Fast Full Chip Topology Generation for Application-Specific Network on Chip

Chen, Shu-yu 08 January 2010
We propose a synthesis methodology for Networks-on-Chip (NoCs) and NoC-based multiprocessor systems-on-chip (MPSoCs) that generates application-specific or irregular topologies. We first propose synthesizing the processor and communication architectures simultaneously in order to estimate area and routing more accurately during the floorplanning stage, which differs from the traditional approach of inserting routers and links after floorplanning. Our NoC topology generation simultaneously optimizes for speed, low power, and wirelength. Compared with the state of the art, our results are better on average by 445.45 X in CPU time, 33.20 % in power consumption, and 96.86 % in wirelength, at the cost of a 2.26 % larger NoC size because our method considers router shape, 20.63 % more routers because our method allows a router port limit of only 5, and 3.93 % more links because our method allows different link lengths. Our method is also scalable: experiments at 2 X, 4 X, 8 X and 16 X are better on average by 355,089.11 X in CPU time, 1.21 X in the number of hops, and 78.33 % in power consumption. Our experimental results show that our synthesis method is effective, efficient, and scalable.
