• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 79
  • 25
  • 17
  • 13
  • 2
  • 2
  • 1
  • 1
  • 1
  • Tagged with
  • 159
  • 55
  • 48
  • 45
  • 43
  • 42
  • 34
  • 32
  • 31
  • 24
  • 24
  • 23
  • 19
  • 18
  • 18
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
121

Um ambiente de execução para suporte à programação paralela com variáveis compartilhadas em sistemas distribuídos heterogêneos. / A runtime system for parallel programing with shared memory paradigm over a heterogeneus distributed systems.

Gisele da Silva Craveiro 31 October 2003 (has links)
O avanço na tecnologia de hardware está permitindo que máquinas SMP de 2 a 8 processadores estejam disponíveis a um custo cada vez menor, possibilitando que a incorporação de tais máquinas em aglomerados de PC's ou até mesmo a composição de um aglomerado de SMP's sejam alternativas cada vez mais viáveis para computação de alto desempenho. O grande desafio é extrair o potencial que tal conjunto de máquinas oferece. Uma alternativa é usar um paradigma híbrido de programação para aproveitar a arquitetura de memória compartilhada através de multihreadeing e utilizar o modelo de troca de mensagens para comunicação entre os nós. Contudo, essa estratégia impõe uma tarefa árdua e pouco produtiva para o programador da aplicação. Este trabalho apresenta o sistema CPAR- Cluster que oferece uma abstração de memória compartilhada no topo de um aglomerado formado por nós mono e multiprocessadores. O sistema é implementado no nível de biblioteca e não faz uso de recursos especiais tais como hardware especializado ou alteração na camada de sistema operacional. Serão apresentados os modelos, estratégias, questões de implementação e os resultados obtidos através de testes realizados com a ferramenta e que apresentaram comportamento esperado. / The advance in hardware technologies is making small configuration SMP machines (from 2 to 8 processors) available at a low cost. For this reason, the inclusion of an SMP node into a cluster of PCs or even clusters of SMPs are becoming viable alternatives for high performance computing. The challenge is the exploitation of the computational resources that these platforms provide. A Hybrid programming paradigm which uses shared memory architecture through multihreading and also message passing model for inter node communication is an alternative. However, programming in such paradigm is very hard. This thesis presents CPAR- Cluster, a runtime system, that provides shared memory abstraction on top of a cluster composed by mono and multiprocessor nodes. Its implementation is at the library level and doesn't require special resources such as particular hardware or operating system moditfications. Models, strategies, implementation aspects and results will be presented.
122

Sistema operacional e biblioteca de fun??es para plataformas MPSOC: um estudo de caso para simuladores de reservat?rios

Oliveira, Tadeu Ferreira 09 August 2010 (has links)
Made available in DSpace on 2014-12-17T15:48:02Z (GMT). No. of bitstreams: 1 TadeuFO_DISSERT.pdf: 1305505 bytes, checksum: 419b87148f7490aba343231bb89f4d72 (MD5) Previous issue date: 2010-08-09 / The increasingly request for processing power during last years has pushed integrated circuit industry to look for ways of providing even more processing power with less heat dissipation, power consumption, and chip area. This goal has been achieved increasing the circuit clock, but since there are physical limits of this approach a new solution emerged as the multiprocessor system on chip (MPSoC). This approach demands new tools and basic software infrastructure to take advantage of the inherent parallelism of these architectures. The oil exploration industry has one of its firsts activities the project decision on exploring oil fields, those decisions are aided by reservoir simulations demanding high processing power, the MPSoC may offer greater performance if its parallelism can be well used. This work presents a proposal of a micro-kernel operating system and auxiliary libraries aimed to the STORM MPSoC platform analyzing its influence on the problem of reservoir simulation / O aumento da demanda por poder de processamento nos ?ltimos anos for?ou a ind?stria de circuitos integrados a buscar formas de prover maior poder de processamento com menor dissipa??o de calor, menor consumo de pot?ncia e ?rea em chip. Isso vinha sendo feito com o aumento do clock dos circuitos. Por?m, com a proximidade dos limites f?sicos dessa abordagem, surgem como solu??o alternativa as arquiteturas com m?ltiplos processadores em um ?nico chip: os MPSoC (Multi-Processor System on a Chip). Essa abordagem exige que novas ferramentas e novos softwares sejam desenvolvidos buscando aproveitar ao m?ximo o aspecto paralelo destas arquiteturas. A ind?stria de explora??o de petr?leo tem como uma de suas atividades iniciais a decis?o de projetos de explora??o de campos de petr?leo. Essas decis?es s?o tomadas baseando-se em simula??es computacionalmente intensivas, situa??o em que os MPSoCs podem oferecer aumento de performance atrav?s de paralelismo. Este trabalho apresenta a proposta de implementa??o de um micro-kernel de sistema operacional e bibliotecas auxiliares para a plataforma MPSoC STORM analisando a influ?ncia na simula??o de reservat?rios
123

Exploration of multicore systems based on silicon integrated communication networks / Exploration de systèmes multicoeurs basés sur des réseaux de communication intégrés sur silicium

Effiong, Charles Emmanuel 16 November 2017 (has links)
De plus en plus de cœurs sont maintenant intégrés sur une seule puce afin de satisfaire les exigences toujours croissantes des applications en matière de systèmes haute performance et basse consommation. Le nombre de cœurs ne cesse d'augmenter, tout comme le besoin en réseaux de communications à haute vitesse entre ces cœurs. A l’inverse des réseaux de communication traditionnels, les Networks-on-Chip (NoCs) ont émergé comme une alternative mature pour les architectures massivement multicœur du fait de leur meilleure passage à l'échelle et de leur efficacité énergétique accrue.Les routeurs de NoC typiques sont constitués de mémoires-tampons qui servent au stockage temporaire de données. Cependant, des études ont montré que ces mémoires-tampons sont souvent inutilisées, en particulier lors de l'exécution application avec des modèles de trafic non uniformes. Cela est dû au fait que la plupart des routeurs typiques consacrent ces bouts de mémoire à leurs ports d'entrée et/ou de sortie, et toute cette mémoire ne peut être exploitée que par un certain type de flux de données. Cela entraîne une dégradation significative des performances dans les cas non favorables. Par conséquent, les architectures de routeurs capables de maximiser l'utilisation des mémoires-tampons pour des gains de performance sont recherchées.Dans le but de maximiser l'utilisation des ressources, cette thèse propose un concept novateur de routeur pour réseau sur puce appelé Roundabout NoC (RiNoC) qui s'inspire des ronds-points à plusieurs voies que l'on retrouve dans la gestion du trafic routier. Contrairement aux approches existantes, RiNoC assure intrinsèquement une utilisation efficace des ressources. Cependant, les routeurs inspirés des ronds-points sont sujet aux interblocages à cause de leur forme en anneau. Le routeur "Rotary NoC" partage le même concept d'organisation en anneau que nous proposons, mais repose sur une d'évitement des interblocages qui introduit des surcoûts non négligeables en terme de surface et de consommation énergétique. A l'inverse, RiNoC empêche les interblocages et améliore les performances des réseaux sur puce sans compromettre la surface ou l'énergie du réseau. Cette thèse exploite en particulier l'architecture hautement paramétrique de RiNoC afin de produire différentes configurations de routeur avec des compromis topologiques variables pour différents gains de performance sans sacrifier la surface. / More computing cores are now being integrated on a single chip in order to meet the ever-growing application demands for high performance and low power computing systems. As the number of cores continues to grow, so is the demand for scalable on-chip communication networks that can deliver high-speed communication among the cores. Contrary to traditional on-chip networks, Networks-on-Chip (NoCs) have emerged as a mature alternative interconnect for manycore architectures since it provides enhanced scalability and power efficiency.Typical NoC routers consist of buffers which serve as temporary data storage. However, studies have shown that buffers are often unutilized (i.e. idle or underutilized) especially when executing applications with non-uniform traffic patterns or bursty behaviours. This is because most typical routers dedicate a set of buffers to their input and/or output ports and these buffers can only be exploited by data-flows using them, which leads to significant performance degradation. Therefore, router architectures capable of maximizing buffer utilization for performance gains are indispensable.In order to maximize buffer resource utilization, this thesis proposes a novel NoC router concept called Roundabout NoC (RiNoC) that is inspired by real-life multi-lanes traffic roundabout. Contrary to existing approaches, RiNoC provides intrinsic and effective resource utilization. However, roundabout-inspired routers are susceptible to deadlocks due to their ring-like architecture. The Rotary NoC router shares similar ring-like concept with propose but relies on a deadlock-free technique which introduces significant area/power overheads. Conversely, RiNoC achieves deadlock-freeness and enhanced network performance over typical NoCs without compromising network area/power. This thesis further exploits RiNoC highly parametric architecture in order to produce different router configurations with varying topological trade-offs for performance gains without sacrificing area.
124

Environnement de développement d’applications multipériodiques sur plateforme multicoeur. : La boîte à outils SchedMCore / Multiperiodic application development environment on multicore architecture. : The SchedMCore framework

Cordovilla Mesonero, Mikel 02 April 2012 (has links)
Les logiciels embarqués critiques de contrôle-commande sont soumis à des contraintes fortes englobant le déterminisme, la correction logique et la correction temporelle. Nous supposons que les spécifications sont exprimées à l'aide du langage formel de description d'architectures logicielles temps réel multipériodiques Prelude. L'objectif de cette thèse est, à partir d'un programme Prelude ou d'un ensemble de tâches temps réel dépendantes, de générer un code multithreadé exécutable sur une architecture multicœur tout en respectant la sémantique initiale. Pour cela, nous avons développé une boîte à outil, SchedMCore,permettant : - d'une part, la vérification formelle de l'ordonnançabilité. La vérification proposée est basée sur le parcours exhaustif du comportement avec pas de temps discret. Il est alors possible d'analyser des politiques en-ligne (FP, gEDF, gLLF et LLREF) mais également de calculer une affectation de priorité fixe valide et une séquence valide hors-ligne.- d'autre part, l'exécution multithreadée sur une cible multicœur. L'exécutif encode les politiques proposées étudiées dans la partie d'analyse d'ordonnançabilité, à savoir les quatre politiques en-ligne ainsi que les séquences valides générées. L'exécutif permet 3 modes d'utilisation, allant de la simulation temporelle à l'exécution temps précis des comportements des tâches. Il est compatible Posix et facilement portable sur divers OS. / A real-time control-command embedded system is subject to strong constraints such as determinism, logical and temporal correctness. We assume that the specifications are expressed using the formal software architecture description language Prelude, dedicated to real-time multiperiodic applications. The goal of this thesis is, given a Prelude program or dependent real-time taskset, to generate amultithreaded executable code over a multicore architecture while respecting the original semantic. To do so we have developed a toolbox, SchedMcore, that allows: - the formal verification of schedulability. The verification is based on the exhaustive exploration of the behaviour with a discret time frame. It is possible to analyse on-line policies (FP, gEDF, gLLF et LLREF), as well as to compute a fixed valid priority assignment and a valid off-line sequence.- the multithreaded execution over a multicore target. The framework encodes the same policies as those studied in the first part (the four on-line policies and the generated sequences). The framework provides three usage modes, from temporal simulation to time accurate execution. The executive is compatible with Posix and easily portable on several OS.
125

Securing Multiprocessor Systems-on-Chip

Biswas, Arnab Kumar 16 August 2016 (has links) (PDF)
MHRD PhD scholarship / With Multiprocessor Systems-on-Chips (MPSoCs) pervading our lives, security issues are emerging as a serious problem and attacks against these systems are becoming more critical and sophisticated. We have designed and implemented different hardware based solutions to ensure security of an MPSoC. Security assisting modules can be implemented at different abstraction levels of an MPSoC design. We propose solutions both at circuit level and system level of abstractions. At the VLSI circuit level abstraction, we consider the problem of presence of noise voltage in input signal coming from outside world. This noise voltage disturbs the normal circuit operation inside a chip causing false logic reception. If the disturbance is caused intentionally the security of a chip may be compromised causing glitch/transient attack. We propose an input receiver with hysteresis characteristic that can work at voltage levels between 0.9V and 5V. The circuit can protect the MPSoC from glitch/transient attack. At the system level, we propose solutions targeting Network-on-Chip (NoC) as the on-chip communication medium. We survey the possible attack scenarios on present-day MPSoCs and investigate a new attack scenario, i.e., router attack targeted toward NoC enabled MPSoC. We propose different monitoring-based countermeasures against routing table-based router attack in an MPSoC having multiple Trusted Execution Environments (TEEs). Software attacks, the most common type of attacks, mainly exploit vulnerabilities like buffer overflow. This is possible if proper access control to memory is absent in the system. We propose four hardware based mechanisms to implement Role Based Access Control (RBAC) model in NoC based MPSoC.
126

Mouvement de données et placement des tâches pour les communications haute performance sur machines hiérarchiques

Moreaud, Stéphanie 12 October 2011 (has links)
Les architectures des machines de calcul sont de plus en plus complexes et hiérarchiques, avec des processeurs multicœurs, des bancs mémoire distribués, et de multiples bus d'entrées-sorties. Dans le cadre du calcul haute performance, l'efficacité de l'exécution des applications parallèles dépend du coût de communication entre les tâches participantes qui est impacté par l'organisation des ressources, en particulier par les effets NUMA ou de cache.Les travaux de cette thèse visent à l'étude et à l'optimisation des communications haute performance sur les architectures hiérarchiques modernes. Ils consistent tout d'abord en l'évaluation de l'impact de la topologie matérielle sur les performances des mouvements de données, internes aux calculateurs ou au travers de réseaux rapides, et pour différentes stratégies de transfert, types de matériel et plateformes. Dans une optique d'amélioration et de portabilité des performances, nous proposons ensuite de prendre en compte les affinités entre les communications et le matériel au sein des bibliothèques de communication. Ces recherches s'articulent autour de l'adaptation du placement des tâches en fonction des schémas de transfert et de la topologie des calculateurs, ou au contraire autour de l'adaptation des stratégies de mouvement de données à une répartition définie des tâches. Ce travail, intégré aux principales bibliothèques MPI, permet de réduire de façon significative le coût des communications et d'améliorer ainsi les performances applicatives. Les résultats obtenus témoignent de la nécessité de prendre en compte les caractéristiques matérielles des machines modernes pour en exploiter la quintessence. / The emergence of multicore processors led to an increasing complexity inside the modern servers, with many cores, distributed memory banks and multiple Input/Output buses. The execution time of parallel applications depends on the efficiency of the communications between computing tasks. On recent architectures, the communication cost is largely impacted by hardware characteristics such as NUMA or cache effects. In this thesis, we propose to study and optimize high performance communication on hierarchical architectures. We first evaluate the impact of the hardware affinities on data movement, inside servers or across high-speed networks, and for multiple transfer strategies, technologies and platforms. We then propose to consider affinities between hardware and communicating tasks inside the communication libraries to improve performance and ensure their portability. To do so,we suggest to adapt the tasks binding according to the transfer method and thetopology, or to adjust the data transfer strategies to a defined task distribution. Our approaches have been integrated in some main MPI implementations. They significantly reduce the communication costs and improve the overall application performance. These results highlight the importance of considering hardware topology for nowadays servers.
127

Optimizing Power Consumption, Resource Utilization, and Performance for Manycore Architectures using Reinforcement Learning

Fettes, Quintin 23 May 2022 (has links)
No description available.
128

Mapping Concurrent Applications to Multiprocessor Systems with Multithreaded Processors and Network on Chip-Based Interconnections

Pop, Ruxandra January 2011 (has links)
Network on Chip (NoC) architectures provide scalable platforms for designing Systems on Chip (SoC) with large number of cores. Developing products and applications using an NoC architecture offers many challenges and opportunities. A tool which can map an application or a set of applications to a given NoC architecture will be essential. In this thesis we first survey current techniques and we present our proposals for mapping and scheduling of concurrent applications to NoCs with multithreaded processors as computational resources. NoC platforms are basically a special class of Multiprocessor Embedded Systems (MPES). Conventional MPES architectures are mostly bus-based and, thus, are exposed to potential difficulties regarding scalability and reusability. There has been a lot of research on MPES development including work on mapping and scheduling of applications. Many of these results can also be applied to NoC platforms. Mapping and scheduling are known to be computationally hard problems. A large range of exact and approximate optimization algorithms have been proposed for solving these problems. The methods include Branch-and–Bound (BB), constructive and transformative heuristics such as List Scheduling (LS), Genetic Algorithms (GA) and various types of Mathematical Programming algorithms. Concurrent applications are able to capture a typical embedded system which is multifunctional. Concurrent applications can be executed on an NoC which provides a large computational power with multiple on-chip computational resources. Improving the time performances of concurrent applications which are running on Network on Chip (NoC) architectures is mainly correlated with the ability of mapping and scheduling methodologies to exploit the Thread Level Parallelism (TLP) of concurrent applications through the available NoC parallelism. Matching the architectural parallelism to the application concurrency for obtaining good performance-cost tradeoffs is  another aspect of the problem. Multithreading is a technique for hiding long latencies of memory accesses, through the overlapped execution of several threads. Recently, Multi-Threaded Processors (MTPs) have been designed providing the architectural infrastructure to concurrently execute multiple threads at hardware level which, usually, results in a very low context switching overhead. Simultaneous Multi-Threaded Processors (SMTPs) are superscalar processor architectures which adaptively exploit the coarse grain and the fine grain parallelism of applications, by simultaneously executing instructions from several thread contexts. In this thesis we make a case for using SMTPs and MTPs as NoC resources and show that such a multiprocessor architecture provides better time performances than an NoC with solely General-purpose Processors (GP). We have developed a methodology for task mapping and scheduling to an NoC with mixed SMTP, MTP and GP resources, which aims to maximize the time performance of concurrent applications and to satisfy their soft deadlines. The developed methodology was evaluated on many configurations of NoC-based platforms with SMTP, MTP and GP resources. The experimental results demonstrate that the use of SMTPs and MTPs in NoC platforms can significantly speed-up applications.
129

Ordonnancement temps réel multiprocesseur pour la réduction de la consommation énergétique des systèmes embarqués / Energy-aware real-time scheduling of multiprocessor embedded systems

Legout, Vincent 08 April 2014 (has links)
Réduire la consommation énergétique des systèmes temps réel embarqués multiprocesseurs est devenu un enjeu important notammentpour augmenter leur autonomie. Nous réduisons la consommation statique des processeurs en exploitant leurs états basseconsommation. Dans un état basse-consommation, la consommation énergétique est fortement réduite mais un délai de transition et une pénalité sont nécessaires pour revenir à l'état actif. Nous proposons dans cette thèse les premiers algorithmes d'ordonnancement tempsréel multiprocesseurs optimaux pour réduire la consommation énergétique des systèmes temps réel dur et des systèmes temps réel àcriticité mixte. Ces algorithmes d'ordonnancement permettent d'activer les état basse-consommation les plus économes en énergie.Chaque algorithme d'ordonnancement est divisé en deux parties. La première partie hors-ligne génère un ordonnancement en utilisant laprogrammation linéaire en nombres entiers pour minimiser la consommation énergétique. La seconde partie est en-ligne et augmente lataille des périodes d'inactivité les tâches terminent leur exécution plus tôt que prévu. Dans le cadre des systèmes temps réel à criticitémixte, nous profitons du fait que les tâches de plus faible criticité peuvent tolérer des dépassements d'échéances pour être plus agressifhors-ligne afin de réduire davantage la consommation énergétique. Les résultats montrent que les algorithmes proposés utilisent demanière plus efficace les états basse-consommation. La consommation énergétique lorsque ceux-ci sont activés est en effet jusqu'à dix fois plus faible qu'avec les algorithmes d'ordonnancement multiprocesseurs existants. / Reducing the energy consumption of multiprocessor real-time embedded systems is a growing concern to increase their autonomy. In thisthesis, we aim to reduce the energy consumption of the processors, it includes both static and dynamic consumption and it is nowdominated by static consumption as the semiconductor technology moves to deep sub-micron scale. Existing solutions mainly focused ondynamic consumption. On the other hand, we target static consumption by efficiently using the low-power states of the processors. In alow-power state, the processor is not active and the deeper the low-power state is, the lower is the energy consumption but the higher isthe transition delay to come back to the active state. In this thesis, we propose the first optimal multiprocessor real-time schedulingalgorithms minimizing the static energy consumption. They optimize the duration of the idle periods to activate the most appropriate lowpowerstates. We target hard real-time systems with periodic tasks and also mixed-criticality systems where tasks with lower criticalitiescan tolerate deadline misses, therefore allowing us to be more aggressive while trying to reduce the energy consumption. We use anadditional task to model the idle time and mixed integer linear programming to compute offline a schedule minimizing the energyconsumption. Evaluations have been performed using existing optimal multiprocessor real-time scheduling algorithms. Results show thatthe energy consumption while processors are idle is up to ten times reduced with our solutions compared to the existing multiprocessor real-time scheduling algorithms.
130

Design and Programming Methods for Reconfigurable Multi-Core Architectures using a Network-on-Chip-Centric Approach

Rettkowski, Jens 12 July 2022 (has links)
A current trend in the semiconductor industry is the use of Multi-Processor Systems-on-Chip (MPSoCs) for a wide variety of applications such as image processing, automotive, multimedia, and robotic systems. Most applications gain performance advantages by executing parallel tasks on multiple processors due to the inherent parallelism. Moreover, heterogeneous structures provide high performance/energy efficiency, since application-specific processing elements (PEs) can be exploited. The increasing number of heterogeneous PEs leads to challenging communication requirements. To overcome this challenge, Networks-on-Chip (NoCs) have emerged as scalable on-chip interconnect. Nevertheless, NoCs have to deal with many design parameters such as virtual channels, routing algorithms and buffering techniques to fulfill the system requirements. This thesis highly contributes to the state-of-the-art of FPGA-based MPSoCs and NoCs. In the following, the three major contributions are introduced. As a first major contribution, a novel router concept is presented that efficiently utilizes communication times by performing sequences of arithmetic operations on the data that is transferred. The internal input buffers of the routers are exchanged with processing units that are capable of executing operations. Two different architectures of such processing units are presented. The first architecture provides multiply and accumulate operations which are often used in signal processing applications. The second architecture introduced as Application-Specific Instruction Set Routers (ASIRs) contains a processing unit capable of executing any operation and hence, it is not limited to multiply and accumulate operations. An internal processing core located in ASIRs can be developed in C/C++ using high-level synthesis. The second major contribution comprises application and performance explorations of the novel router concept. Models that approximate the achievable speedup and the end-to-end latency of ASIRs are derived and discussed to show the benefits in terms of performance. Furthermore, two applications using an ASIR-based MPSoC are implemented and evaluated on a Xilinx Zynq SoC. The first application is an image processing algorithm consisting of a Sobel filter, an RGB-to-Grayscale conversion, and a threshold operation. The second application is a system that helps visually impaired people by navigating them through unknown indoor environments. A Light Detection and Ranging (LIDAR) sensor scans the environment, while Inertial Measurement Units (IMUs) measure the orientation of the user to generate an audio signal that makes the distance as well as the orientation of obstacles audible. This application consists of multiple parallel tasks that are mapped to an ASIR-based MPSoC. Both applications show the performance advantages of ASIRs compared to a conventional NoC-based MPSoC. Furthermore, dynamic partial reconfiguration in terms of relocation and security aspects are investigated. The third major contribution refers to development and programming methodologies of NoC-based MPSoCs. A software-defined approach is presented that combines the design and programming of heterogeneous MPSoCs. In addition, a Kahn-Process-Network (KPN) –based model is designed to describe parallel applications for MPSoCs using ASIRs. The KPN-based model is extended to support not only the mapping of tasks to NoC-based MPSoCs but also the mapping to ASIR-based MPSoCs. A static mapping methodology is presented that assigns tasks to ASIRs and processors for a given KPN-model. The impact of external hardware components such as sensors, actuators and accelerators connected to the processors is also discussed which makes the approach of high interest for embedded systems.

Page generated in 0.0664 seconds