Global ETD Search

111	Evaluation de la sensibilité face aux SEE et méthodologie pour la prédiction de taux d’erreurs d’applications implémentées dans des processeurs Multi-cœur et Many-cœur / Evaluation of the SEE sensitivity and methodology for error rate prediction of applications implemented in Multi-core and Many-core processors Ramos Vargas, Pablo Francisco 18 April 2017 La présente thèse vise à évaluer la sensibilité statique et dynamique face aux SEE de trois dispositifs COTS différents. Le premier est le processeur multi-cœurs P2041 de Freescale fabriqué en technologie 45nm SOI qui met en œuvre ECC et la parité dans ses mémoires cache. Le second est le processeur many-cœur Kalray MPPA-256 fabriqué en technologie CMOS 28nm TSMC qui intègre 16 clusters de calcul chacun avec 16 cœurs, et met en œuvre ECC dans ses mémoires statiques et la parité dans ses mémoires caches. Le troisième est le microprocesseur Adapteva E16G301 fabriqué en procédé CMOS 65nm qui intègre 16 cœurs de processeur et ne met en œuvre aucun mécanisme de protection. L'évaluation a été réalisée par des expériences de rayonnement avec des neutrons de 14 MeV dans des accélérateurs de particules pour émuler un environnement de rayonnement agressif, et par injection de fautes dans des mémoires cache, des mémoires partagées ou des registres de processeur pour simuler les conséquences des SEU dans l'exécution du programme. Une analyse approfondie des erreurs observées a été effectuée pour identifier les vulnérabilités dans les mécanismes de protection. Des zones critiques telles que des Tag adresses et des registres à usage général ont été affectées pendant les expériences de rayonnement. De plus, l'approche Code Emulating Upset (CEU), développée au Laboratoire TIMA, a été étendue pour des processeurs multi-cœur et many-cœur pour prédire le taux d'erreur d'application en combinant les résultats issus des campagnes d'injection de fautes avec ceux issus des expériences de rayonnement. 
/ The present thesis aims at evaluating the static and dynamic SEE sensitivity of three different COTS multi-core and many-core processors. The first is the Freescale P2041 multi-core processor, manufactured in 45nm SOI technology, which implements ECC and parity in its cache memories. The second is the Kalray MPPA-256 many-core processor, manufactured in 28nm TSMC CMOS technology, which integrates 16 compute clusters, each with 16 processor cores, and implements ECC in its static memories and parity in its cache memories. The third is the Adapteva Epiphany E16G301 microprocessor, manufactured in a 65nm CMOS process, which integrates 16 processor cores and does not implement protection mechanisms. The evaluation was carried out through radiation experiments with 14 MeV neutrons in particle accelerators, to emulate a harsh radiation environment, and through fault injection in cache memories, shared memories or processor registers, to simulate the consequences of SEUs on program execution. An in-depth analysis of the observed errors was carried out to identify vulnerabilities in the protection mechanisms. Critical zones such as address tags and general-purpose registers were affected during the radiation experiments. In addition, the Code Emulating Upset (CEU) approach, developed at TIMA Laboratory, was extended to multi-core and many-core processors to predict the application error rate by combining the results of fault injection campaigns with those of radiation experiments. Fiabilité Test Injection de fautes Processeurs many-Core Single Event Upsets Essai de radiation Reliability Testing Fault injection Many-core processors Single Event Upsets Radiation test 600
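The prediction step described in entry 111 combines two measured quantities. The sketch below is a minimal illustration, assuming the relation usually associated with the CEU approach, in which the predicted application error rate is the product of the static cross-section obtained under radiation and the error rate obtained by fault injection. The function names and the sample numbers are hypothetical and are not taken from the thesis.

```python
def static_cross_section(upsets_observed: int, fluence_per_cm2: float) -> float:
    """Static cross-section (cm^2/device): upsets per unit particle fluence."""
    if fluence_per_cm2 <= 0:
        raise ValueError("fluence must be positive")
    return upsets_observed / fluence_per_cm2


def injection_error_rate(errors_observed: int, faults_injected: int) -> float:
    """Fraction of injected bit flips that led to an application error."""
    if faults_injected <= 0:
        raise ValueError("at least one fault must be injected")
    return errors_observed / faults_injected


def predicted_error_cross_section(sigma_static: float, tau_inj: float) -> float:
    """Assumed CEU relation: predicted rate = sigma_static * tau_inj."""
    return sigma_static * tau_inj


if __name__ == "__main__":
    # Hypothetical campaign figures, for illustration only.
    sigma = static_cross_section(upsets_observed=50, fluence_per_cm2=1e10)
    tau = injection_error_rate(errors_observed=120, faults_injected=1000)
    print(predicted_error_cross_section(sigma, tau))
```

The prediction can then be compared with the dynamic cross-section measured while the application runs under the beam, which is how such an estimate would typically be validated.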
112	Approche logicielle pour améliorer la fiabilité d’applications parallèles implémentées dans des processeurs multi-cœur et many-cœur / Software approach to improve the reliability of parallel applications implemented on multi-core and many-core processors Vargas Vallejo, Vanessa Carolina 28 April 2017 La grande capacité de calcul, flexibilité, faible consommation d'énergie, redondance intrinsèque et la haute performance fournie par les processeurs multi/many-cœur les rendent idéaux pour surmonter les nouveaux défis dans les systèmes informatiques. Cependant, le degré d'intégration de ces dispositifs augmente leur sensibilité aux effets des radiations naturelles. Par conséquent, des fabricants, partenaires industriels et universitaires travaillent ensemble pour améliorer les caractéristiques de ces dispositifs ce qui permettrait leur utilisation dans des systèmes embarqués et critiques. Dans ce contexte, le travail effectué dans le cadre de cette thèse vise à évaluer l'impact des SEEs (Single Event Effects) dans des applications parallèles s'exécutant sur des processeurs multi-cœur et many-cœur, et développer et valider une approche logicielle pour améliorer la fiabilité du système appelée N-MoRePar. La méthodologie utilisée pour l'évaluation était fondée sur des études de cas multiples. Les différents scénarios mis en œuvre envisagent une large gamme de configurations de système en termes de mode de multi-processing, modèle de programmation, modèle de mémoire et des ressources utilisées. Pour l'expérimentation, deux dispositifs COTS ont été sélectionnés: le quad-core Freescale PowerPC P2041 en technologie SOI 45nm, et le processeur many-cœur KALRAY MPPA-256 en CMOS 28nm. Les études de cas ont été évaluées par l'injection de fautes et par des campagnes de tests sous neutrons. Les résultats obtenus servent de guide aux développeurs pour choisir la configuration du système la plus fiable en fonction de leurs besoins. 
En outre, les résultats de l'évaluation de l'approche N-MoRePar basée sur des critères de redondance et de partitionnement augmentent l'utilisation des processeurs COTS multi/many-cœur dans des systèmes qui requièrent une haute fiabilité. / The large computing capacity, great flexibility, low power consumption, intrinsic redundancy and high performance provided by multi/many-core processors make them ideal for meeting the new challenges in computing systems. However, the degree of integration of these devices increases their sensitivity to the effects of natural radiation. Consequently, manufacturers and industrial and university partners are working together to improve the characteristics of these devices, which would allow their use in critical embedded systems. In this context, the work done throughout this thesis aims at evaluating the impact of SEEs (Single Event Effects) on parallel applications running on multi-core and many-core processors, and at proposing a software approach to improve system reliability. The methodology used for the evaluation was based on multiple case studies. The different scenarios implemented cover a wide range of system configurations in terms of multi-processing mode, programming model, memory model, and resources used. For the experiments, two COTS devices were selected: the Freescale PowerPC P2041 quad-core, built in 45nm SOI technology, and the KALRAY MPPA-256 many-core processor, built in 28nm CMOS technology. The case studies were evaluated through fault injection and neutron radiation tests. The results obtained serve as useful guidelines for developers choosing the most reliable system configuration for their requirements. Furthermore, the evaluation results of the proposed N-MoRePar fault-tolerant approach, based on redundancy and partitioning criteria, support the use of COTS multi/many-core processors in high-dependability systems. 
Architectures parallèles Multi-Cœur et many-Cœur Fiabilité Redondance Multi-Processing mode Injection de fautes Parallel Architectures Multi-Core and many-Core Reliability Redundancy Multi-Processing mode Fault injection 620
113	Parallélisation sur un moteur exécutif à base de tâches des méthodes itératives pour la résolution de systèmes linéaires creux sur architecture multi et many coeurs : application aux méthodes de types décomposition de domaines multi-niveaux / Parallelization of iterative methods to solve sparse linear systems using task based runtime systems on multi and many-core architectures : application to Multi-Level Domain Decomposition methods Roussel, Adrien 06 February 2018 Les méthodes en simulation numérique dans le domaine de l’ingénierie pétrolière nécessitent la résolution de systèmes linéaires creux de grande taille et non structurés. La performance des méthodes itératives utilisées pour résoudre ces systèmes représente un enjeu majeur afin de permettre de tester de nombreux scénarios. Dans ces travaux, nous présentons une manière d'implémenter des méthodes itératives parallèles au-dessus d’un support exécutif à base de tâches. Afin de simplifier le développement des méthodes tout en gardant un contrôle fin sur la gestion du parallélisme, nous avons proposé une API permettant d’exprimer implicitement les dépendances entre tâches : la sémantique de l'API reste séquentielle et le parallélisme est implicite. Nous avons étendu le support exécutif HARTS pour enregistrer une trace d'exécution afin de mieux exploiter les architectures NUMA, tout comme de prendre en compte un placement des tâches et des données calculé au niveau de l’API. Nous avons porté et évalué l'API sur les processeurs many-coeurs KNL en considérant les différents types de mémoires de l’architecture. Cela nous a amené à optimiser le calcul du SpMV qui limite la performance de nos applications. L'ensemble de ce travail a été évalué sur des méthodes itératives et en particulier l’une de type décomposition de domaine. Nous montrons alors la pertinence de notre API, qui nous permet d’atteindre de très bons niveaux de performances aussi bien sur architecture multi-coeurs que many-coeurs. 
/ Numerical methods in reservoir engineering simulations require solving large, sparse, unstructured linear systems. The performance of the iterative methods used in the simulator to solve these systems is crucial for exploring many more scenarios. In this work, we present a way to implement efficient parallel iterative methods on top of a task-based runtime system. It simplifies the development of methods while keeping control over parallelism management. We propose a linear algebra API which aims to express task dependencies implicitly: the semantics are sequential while the parallelism is implicit. We have extended the HARTS runtime system to monitor executions in order to better exploit NUMA architectures. Moreover, we implement a scheduling policy which exploits data locality for task placement. We have extended the API to KNL many-core systems while considering the various memory banks available. This work has led to the optimization of the SpMV kernel, one of the most time-consuming operations in iterative methods. This work has been evaluated on iterative methods, and particularly on one method coming from domain decomposition. We thereby demonstrate that the API makes it possible to reach good performance on both multi-core and many-core architectures. Calcul parallèle Décomposition de domaine Moteur exécutif Multi and Many-Core Parallel computing Domain decomposition methods Runtime system Multi and many-Core architecture 004
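The SpMV kernel named in entry 113 is the sparse matrix-vector product y = A·x. As a minimal sketch, assuming the common CSR (compressed sparse row) storage, which the abstract does not specify, the sequential kernel looks like this. The function name is hypothetical, and the thesis's own implementation on a task-based runtime would differ.

```python
from typing import List, Sequence


def spmv_csr(values: Sequence[float],
             col_idx: Sequence[int],
             row_ptr: Sequence[int],
             x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : non-zero entries, row by row
    col_idx : column index of each non-zero entry
    row_ptr : row i owns entries row_ptr[i] .. row_ptr[i + 1] - 1
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Indirect access to x through col_idx is what makes SpMV memory-bound.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


if __name__ == "__main__":
    # A = [[4, 0, 1],
    #      [0, 3, 0],
    #      [2, 0, 5]]
    values = [4.0, 1.0, 3.0, 2.0, 5.0]
    col_idx = [0, 2, 1, 0, 2]
    row_ptr = [0, 2, 3, 5]
    print(spmv_csr(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))
```

Each row is independent of the others, so blocks of rows are natural units of work for a task-based runtime. How such blocks are placed with respect to NUMA nodes or KNL memory banks is the subject of the thesis and is not modelled here.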
114	E pluribus unum : la relation chez Aristote / E Pluribus Unum : The relation in Aristotle work’s Chabert, Gérard 12 February 2011 E Pluribus Unum. La relation chez Aristote. Aristote déconstruit l’unité originaire des réalités à l’aide d’intelligibles en relation de plusieurs façons (relation indivisible, proportion, direction d’unification, assemblage de parties proportionnées entre elles, coïncidence). En donnant du volume à la pensée Aristote ouvre la voie de la connaissance de la complexité. Il ne faut pas hiérarchiser, voire opposer, la matière et l’esprit, l’ordre et le changement, l’unité et la pluralité, le déterminé et l’imprévisible, mais les mettre en relation. La relation, ce qui est partagé par plusieurs, est au cœur des « catégories » aussi bien que de l’unité « tout en un » des principes essentiels de l’étant. La relation seule n’explique pas l’être, mais l’être rend compte du fait que le mode de structuration des relations (séparabilité et indivisibilité) est un de ses principes premiers. Dans une première partie nous montrons qu’Aristote utilise toutes les topologies relationnelles, dont la relation de plusieurs à plusieurs propre aux réseaux. Dans la deuxième partie nous montrons comment sa notion d’être s’étudie à l’aide de la relation. L’être est une relation entre ses conditions d’existence impartageables (essentiel) et sa réalisation par des séparables (accidents). Aristote constate et analyse les principes (dimensions) de cette complexité, et il identifie que, pour penser les relations indivisibles (causes), l’homme possède un esprit immortel (le « noûs »). / E Pluribus Unum. The relation in Aristotle's work. Aristotle studies unity as a complexity with several kinds of relationships. By doing so, he opens the way to the understanding of complexity. Our purpose is to study the different uses of relationship in Aristotle's works. 
Relation is used from an accidental point of view (prós ti) as well as from an essential point of view (ousia), for the pros hén directional convergence and for the coincidence of homonyms. Relationship fits with Aristotle's “all-in-one” and multi-dimensional approach to being. Relation alone is not being, but is part of being. In the first part we show that Aristotle uses each of the relation topologies in his studies; the many-to-many relationship used for networking is the basis for the Categories. In the second part we show, from Aristotle's major thesis, that a relational approach, coupled with directional finality and the prime motricity of each being, helps to understand the subject matter of what Aristotle is saying. Relation, because it is a first principle, helps to understand the role of all other principles used by Aristotle. Since man thinks unbreakable relations, he should possess an immortal “noûs”. Aristote Relation Complexité Direction Être Étant Principes premiers Ontologie Un Relation de plusieurs à plusieurs Relationship Complexity Being Prime principle Ontology The One Many to many relationship Pluri-dimensional
115	Special states in quantum many-body spectra of low dimensional systems Nagara Srinivasa Prasanna, Srivatsa 06 September 2021 Strong quantum correlations between many particles in low dimensions lead to the emergence of interesting phases of matter. These phases are often studied through the properties of the many-body eigenstates of an interacting quantum many-body system. The classic example of topological order in the ground states is the fractional quantum Hall (FQH) effect. With the current developments in the field of ultracold atoms in optical lattices, realizing FQH physics on a lattice and being able to create and braid anyons is much awaited from the viewpoint of fault-tolerant quantum computing. This thesis contributes to the field of the FQH effect and anyons in a lattice setting. Conformal field theory has been useful to build interesting lattice FQH models which are few-body and non-local. We provide a general scheme of truncation to arrive at tractable local models whose ground states have the desired topological properties. FQH models are known to host anyons, but braiding them on small lattices with edges is a hard task. To get around this problem, we demonstrate that one can squeeze the anyons and braid them successfully within a smaller area by crawling them like snakes on modest-sized open lattices. As a numerically cheap approach to detect topological quantum phase transitions, we again resort to anyons, which are only well defined in a topological phase. We create defects and study a simple quantity such as the charge of the defect to test whether the phase supports anyons or not. On the other hand, with the advent of many-body localization (MBL) and quantum many-body scars, interesting eigenstate phases which were otherwise only known to occur in ground states have been identified even at finite energy densities in the many-body spectra of generic systems. 
This thesis also contributes to the field of non-equilibrium physics by presenting models that display interesting non-ergodic phases as well as quantum many-body scars. For instance, we show that an emergent symmetry in a disordered model can be used as a tool to escape MBL in a single eigenstate while not preventing the rest of the states from localizing. This can lead to an interesting situation of a weakly broken MBL phase, where a non-MBL state lives in a spectrum of MBL-like states. We also demonstrate the emergence of a phase that is non-ergodic but also non-MBL in a non-local model with SU(2) symmetry. We provide two constructions of rather different models with quantum many-body scars, with chiral and non-chiral topological order. info:eu-repo/classification/ddc/530 ddc:530
116	Optimisation de transfert de données pour les processeurs pluri-coeurs, appliqué à l'algèbre linéaire et aux calculs sur stencils / Optimization of data transfer on many-core processors, applied to dense linear algebra and stencil computations Ho, Minh Quan 05 July 2018 La prochaine cible de Exascale en calcul haute performance (High Performance Computing - HPC) et des récents accomplissements dans l'intelligence artificielle donnent l'émergence des architectures alternatives non conventionnelles, dont l'efficacité énergétique est typique des systèmes embarqués, tout en fournissant un écosystème de logiciel équivalent aux plateformes HPC classiques. Un facteur clé de performance de ces architectures à plusieurs cœurs est l'exploitation de la localité de données, en particulier l'utilisation de mémoire locale (scratchpad) en combinaison avec des moteurs d'accès direct à la mémoire (Direct Memory Access - DMA) afin de chevaucher le calcul et la communication. Un tel paradigme soulève des défis de programmation considérables à la fois au fabricant et au développeur d'application. Dans cette thèse, nous abordons les problèmes de transfert et d'accès aux mémoires hiérarchiques, de performance de calcul, ainsi que les défis de programmation des applications HPC, sur l'architecture pluri-cœurs MPPA de Kalray. Pour le premier cas d'application lié à la méthode de Boltzmann sur réseau (Lattice Boltzmann method - LBM), nous fournissons des techniques génériques et réponses fondamentales à la question de décomposition d'un domaine stencil itératif tridimensionnel sur les processeurs clusterisés équipés de mémoires locales et de moteurs DMA. Nous proposons un algorithme de streaming et de recouvrement basé sur DMA, délivrant 33% de gain de performance par rapport à l'implémentation basée sur la mémoire cache par défaut. 
Le calcul de stencil multi-dimensionnel souffre d'un goulot d'étranglement important sur les entrées/sorties de données et d'un espace mémoire sur puce limité. Nous avons développé un nouvel algorithme de propagation LBM sur-place (in-place). Il consiste à travailler sur une seule instance de données, au lieu de deux, réduisant de moitié l'empreinte mémoire et offrant une efficacité de performance-par-octet 1.5 fois meilleure par rapport à l'algorithme traditionnel dans l'état de l'art. Du côté du calcul intensif avec l'algèbre linéaire dense, nous construisons un benchmark de multiplication matricielle optimale, basé sur l'exploitation de la mémoire locale et la communication DMA asynchrone. Ces techniques sont ensuite étendues à un module DMA générique du framework BLIS, ce qui nous permet d'instancier une bibliothèque BLAS3 (Basic Linear Algebra Subprograms) portable et optimisée sur n'importe quelle architecture basée sur DMA, en moins de 100 lignes de code. Nous atteignons une performance maximale de 75% du théorique sur le processeur MPPA avec l'opération de multiplication de matrices (GEMM) de BLAS, sans avoir à écrire des milliers de lignes de code laborieusement optimisé pour le même résultat. / The upcoming Exascale target in High Performance Computing (HPC) and disruptive achievements in artificial intelligence are giving rise to alternative, non-conventional many-core architectures, with the energy efficiency typical of embedded systems while providing the same software ecosystem as classic HPC platforms. A key enabler of energy-efficient computing on many-core architectures is the exploitation of data locality, specifically the use of scratchpad memories in combination with DMA engines in order to overlap computation and communication. Such a software paradigm raises considerable programming challenges for both the vendor and the application developer. 
In this thesis, we tackle the memory transfer and performance issues, as well as the programming challenges, of memory- and compute-intensive HPC applications on the Kalray MPPA many-core architecture. With the first, memory-bound use case of the lattice Boltzmann method (LBM), we provide generic and fundamental techniques for decomposing three-dimensional iterative stencil problems onto clustered many-core processors fitted with scratchpad memories and DMA engines. The developed DMA-based streaming and overlapping algorithm delivers a 33% performance gain over the default cache-based implementation. High-dimensional stencil computation suffers from a serious I/O bottleneck and limited on-chip memory space. We developed a new in-place LBM propagation algorithm, which halves the memory footprint and yields 1.5 times higher performance-per-byte efficiency than the state-of-the-art out-of-place algorithm. On the compute-intensive side, with dense linear algebra computations, we build an optimized matrix multiplication benchmark based on the exploitation of scratchpad memory and efficient asynchronous DMA communication. These techniques are then extended to a DMA module of the BLIS framework, which allows us to instantiate an optimized and portable level-3 BLAS numerical library on any DMA-based architecture in less than 100 lines of code. We achieve 75% of peak performance on the MPPA processor with the matrix multiplication operation (GEMM) from the standard BLAS library, without having to write thousands of lines of laboriously optimized code for the same result. Calcul haute performance Processeur many-Core Calcul numérique Communication Systèmes distribués High performance computing (HPC) Many-Core processor Numerical computation Communication Distributed systems 004
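The halved memory footprint reported in entry 116 follows directly from keeping one copy of the distribution data instead of two. The sketch below is back-of-the-envelope arithmetic only. It assumes a D3Q19 lattice (19 distribution values per cell) in double precision and a 64×64×64 domain, none of which is stated in the abstract, and the function name is hypothetical.

```python
def lbm_footprint_bytes(nx: int, ny: int, nz: int,
                        q: int = 19,              # assumed D3Q19 lattice
                        bytes_per_value: int = 8,  # assumed double precision
                        copies: int = 2) -> int:
    """Memory needed for the distribution functions of an LBM domain.

    copies=2 models an out-of-place scheme (source and destination arrays),
    copies=1 models an in-place scheme working on a single array.
    """
    return nx * ny * nz * q * bytes_per_value * copies


if __name__ == "__main__":
    out_of_place = lbm_footprint_bytes(64, 64, 64, copies=2)
    in_place = lbm_footprint_bytes(64, 64, 64, copies=1)
    print(out_of_place, in_place, in_place / out_of_place)
```

On a processor whose clusters have only a small scratchpad each, that factor of two determines how large a sub-domain can be held on chip at once, which is why the footprint matters as much as raw speed.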
117	Cosmological and theoretical aspects of higher dimensions Fairbairn, Malcolm January 2001 No description available. 539.72
118	Complementing user-level coarse-grain parallelism with implicit speculative parallelism Ioannou, Nikolas January 2012 Multi-core and many-core systems are the norm in contemporary processor technology and are expected to remain so for the foreseeable future. Parallel programming is thus here to stay, and programmers have to embrace it if they are to exploit such systems for their applications. Programs using parallel programming primitives like PThreads or OpenMP often exploit coarse-grain parallelism, because it offers a good trade-off between programming effort and performance gain. Some parallel applications show limited or no scaling beyond a certain number of cores. Given the abundant number of cores expected in future many-cores, several cores would remain idle in such cases while execution performance stagnates. This thesis proposes using cores that do not contribute to performance improvement for running implicit fine-grain speculative threads. In particular, we present a many-core architecture and protocols that allow applications with coarse-grain explicit parallelism to further exploit implicit speculative parallelism within each thread. We show that complementing parallel programs with implicit speculative mechanisms offers significant performance improvements for a large and diverse set of parallel benchmarks. Implicit speculative parallelism frees the programmer from the additional effort of explicitly partitioning the work into finer and properly synchronized tasks. Our results show that, for a many-core comprising 128 cores supporting implicit speculative parallelism in clusters of 2 or 4 cores, performance improves on top of the highest scalability point by 44% on average for the 4-core cluster and by 31% on average for the 2-core cluster. We also show that this approach often leads to better performance and energy efficiency compared to existing alternatives such as Core Fusion and Turbo Boost. 
Moreover, we present a dynamic mechanism to choose the number of explicit and implicit threads, which performs within 6% of the static oracle selection of threads. To improve energy efficiency, processors allow for Dynamic Voltage and Frequency Scaling (DVFS), which enables changing their performance and power consumption on the fly. We evaluate the amenability of the proposed explicit plus implicit threads scheme to traditional power management techniques for multithreaded applications and identify room for improvement. We thus augment prior schemes and introduce a novel multithreaded power management scheme that accounts for implicit threads and aims to minimize the Energy-Delay² product (ED²). Our scheme comprises two components: a “local” component that tries to adapt to the different program phases on a per-explicit-thread basis, taking into account implicit thread behavior, and a “global” component that augments the local components with information regarding inter-thread synchronization. Experimental results show a reduction of ED² of 8% compared to having no power management, with an average reduction in power of 15% that comes at a minimal loss of performance of less than 3% on average.
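The Energy-Delay² metric in entry 118 weights execution time more heavily than energy: with energy E = P·t, the product E·t² equals P·t³. The sketch below is a minimal illustration with hypothetical function names. It plugs in the average figures quoted in the abstract (15% less power, at most 3% more time) purely as a plausibility check, since averages of ratios need not combine exactly.

```python
def energy_delay_squared(avg_power_w: float, exec_time_s: float) -> float:
    """ED^2 = energy * delay^2, with energy = average power * execution time."""
    energy_j = avg_power_w * exec_time_s
    return energy_j * exec_time_s ** 2


def relative_ed2(power_ratio: float, time_ratio: float) -> float:
    """ED^2 of a managed run relative to a baseline run (1.0 means unchanged).

    Because ED^2 = P * t^3, the ratio is power_ratio * time_ratio ** 3.
    """
    return power_ratio * time_ratio ** 3


if __name__ == "__main__":
    # 15% lower power and a 3% longer run time, as upper-bound averages.
    ratio = relative_ed2(power_ratio=0.85, time_ratio=1.03)
    print(f"relative ED^2: {ratio:.3f}")
```

With these inputs the ratio comes out slightly below 1, the same order as the 8% reduction quoted. The cubic dependence on time is the reason a power manager targeting ED² can afford only small slowdowns.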
119	Dialectique et mathématique dans le Parménide de Platon / Dialectics and Mathematics in Plato’s Parmenides Minesi, Gianmarco 29 November 2018 Cette thèse de doctorat, intitulée “Dialectique et mathématique dans le Parménide de Platon” est une nouvelle lecture du Parménide de Platon, dont l’originalité réside essentiellement dans la tentative d’encadrer d’un point de vue mathématique la question de la « participation », qui est la question dialectique par excellence. Le but de la thèse n’est pas seulement de donner une interprétation cohérente et détaillée de toutes les parties du dialogue mais aussi d’aborder d’une façon atypique, via le Parménide, la problématique concernant la relation que la dialectique platonicienne entretient avec la mathématique. La thèse comprend huit chapitres et une introduction. L’introduction contient un état de la question et une panoramique générale des sujets affrontés. Le premier chapitre est un commentaire de la première partie de l’œuvre, le second est un commentaire de la partie intermédiaire, qui traite des indications méthodologiques de l’« exercice » effectué dans la deuxième partie. Le troisième, le quatrième et le cinquième chapitre portent sur les trois premières « skepseis » (chacun sur une skepsis différente), et le sixième porte à la fois sur les deux dernières skepseis de la première hypothèse (si l’un est). Le septième chapitre, en revanche, traite des quatre skepseis de la deuxième hypothèse (si l’un n’est pas), alors que le huitième est une conclusion générale qui entend explorer d’une façon plus détaillée, à la lumière des résultats obtenus, la question des rapports entre Dialectique et Mathématique dans le Parménide, et introduire à la question du lien entre le Parménide et les témoignages indirects concernant les « άγραφα δόγματα ». / This Ph.D.
Thesis, entitled « Dialectics and Mathematics in Plato’s Parmenides », is a new interpretation of Plato’s Parmenides, whose originality lies in the attempt to approach the problem of participation, namely the ultimate dialectical problem, from a mathematical perspective. The aim of this thesis is not only to provide a detailed and coherent interpretation of all parts of the dialogue, but also to approach the problem of the relation between Plato’s dialectics and mathematics via the Parmenides, and therefore in an atypical way. The Thesis contains an introduction and eight chapters: the introduction deals mainly with the current state of the research and provides an overview of the subjects treated in the chapters. The first chapter is a commentary on the first part of the dialogue; the second is a commentary on the middle part, which is concerned with the methodological pattern of the “exercise” performed in the second part of the dialogue. The third, fourth and fifth chapters examine the first three “skepseis” (one skepsis for each chapter) of the first “hypothesis” (if the one is), while the sixth deals with the last two skepseis. The seventh chapter comments on the four skepseis of the second hypothesis (if the one is not) and, finally, the eighth is a general conclusion whose aim is to explore more extensively, in light of the results obtained, the relation between Dialectics and Mathematics in Plato’s Parmenides, also taking into account some connections between the Parmenides and the so-called άγραφα δόγματα. Platon Parménide Un Multiples Participation Idées Mathématiques Dialectique Plato Parmenides One Many Participation Ideas Mathematics Dialectics
120	Self-adaptive QOS at communication and computation levels for many-core system-on-chip Ruaro, Marcelo 16 March 2018 Sistemas multi-núcleos intra-chip são o estado-da-arte em termos de poder computacional, alcançando de dúzias a milhares de elementos de processamentos (PE) em um único circuito integrado. Sistemas multi-núcleos de propósito geral assumem uma admissão dinâmica de aplicações, onde o conjunto de aplicações não é conhecido em tempo de projeto e as aplicações podem iniciar sua execução a qualquer momento. Algumas aplicações podem ter requisitos de tempo real, requisitando níveis de qualidade de serviço (QoS) do sistema. Devido ao alto grau de imprevisibilidade do uso dos recursos e o grande número de componentes para se gerenciar, propriedades autoadaptativas tornam-se fundamentais para dar suporte a QoS em tempo de execução. A literatura fornece diversas propostas de QoS autoadaptativo, focado em recursos de comunicação (ex., redes intra-chip), ou computação (ex., CPU). Contudo, para fornecer um suporte de QoS completo, é fundamental uma autoconsciência abrangente dos recursos do sistema, e assumir técnicas adaptativas que permitem agir em ambos os níveis de comunicação e computação para atender os requisitos das aplicações. 
Para suprir essas demandas, essa Tese propõe uma infraestrutura e técnicas de gerenciamento de QoS autoadaptativo, cobrindo ambos os níveis de computação e comunicação. No nível de computação, a infraestrutura para QoS consiste em um escalonador dinâmico de tarefas de tempo real e um protocolo de migração de tarefas de baixo custo. Estas técnicas fornecem QoS de computação, devido ao gerenciamento da utilização e alocação da CPU. A novidade do escalonador de tarefas é o suporte a requisitos de tempo real dinâmicos, o que gera mais flexibilidade para as tarefas em explorar a CPU de acordo com uma carga de trabalho variável. A novidade do protocolo de migração de tarefas é o baixo custo no tempo de execução comparado a trabalhos do estado-da-arte. No nível de comunicação, a técnica proposta é um chaveamento por circuito (CS) baseado em redes definidas por software (SDN). O paradigma SDN para NoCs é uma inovação desta Tese, e é alcançado através de uma arquitetura genérica de software e hardware. Para QoS de comunicação, SDN é usado para definir caminhos CS em tempo de execução. Essas infraestruturas de QoS são gerenciadas de uma forma integrada por um gerenciamento de QoS autoadaptativo, o qual segue o paradigma ODA (Observar, Decidir, Agir), implementando um laço fechado de adaptações em tempo de execução. O gerenciamento de QoS é autoconsciente dos recursos do sistema e das aplicações em execução, e pode decidir por adaptações no nível de computação ou comunicação, baseado em notificações das tarefas, monitoramento do ambiente, e monitoramento de atendimento de QoS. A autoadaptação decide reativamente assim como proativamente. Uma técnica de aprendizagem do perfil das aplicações é proposta para traçar o comportamento das tarefas de tempo real, possibilitando ações proativas. Resultados gerais mostram que o gerenciamento de QoS autoadaptativo proposto pode restaurar os níveis de QoS para as aplicações com um baixo custo no tempo de execução das aplicações. 
Uma avalia??o abrangente, assumindo diversos benchmarks mostra que, mesmo sob diversas interfer?ncias de QoS nos n?veis de computa??o e comunica??o, o tempo de execu??o das aplica??es ? restaurado pr?ximo ao cen?rio ?timo, como 99,5% das viola??es de deadlines mitigadas. / Many-core systems-on-chip are the state-of-the-art in processing power, reaching from a dozen to thousands of processing elements (PE) in a single integrated circuit. General purpose many-cores assume a dynamic application admission, where the application set is unknown at design-time and applications may start their execution at any moment, inducing interference between them. Some applications may have real-time constraints to fulfill, requiring levels of quality of service (QoS) from the system. Due to the high degree of resource?s utilization unpredictability and the number of components to manage, self-adaptive properties become fundamental to support QoS at run-time. The literature provides several self-adaptive QoS proposals, targeting either communication (e.g., Network-on-Chip) or computation resources (e.g., CPU). However, to offer a complete QoS support, it is fundamental to provide a comprehensive self-awareness of the system?s resources, assuming adaptive techniques enabling to act simultaneously at the communication and computation levels to meet the applications' constraints. To cope with these requirements, this Thesis proposes a self-adaptive QoS infrastructure and management techniques, covering both the computation and communication levels. At the computation level, the QoS-driven infrastructure comprises a dynamic real-time task scheduler and a low overhead task migration protocol. These techniques ensure computation QoS by managing the CPU utilization and allocation. The novelty of the task scheduler is the support for dynamic real time constraints, which leverage more flexibility to tasks to explore the CPU according to a variable workload. 
The novelty of the task migration protocol is its low execution-time overhead compared with the state of the art. At the communication level, the proposed technique is a Circuit-Switching (CS) approach based on the Software-Defined Networking (SDN) paradigm. The SDN paradigm for NoCs is an innovation of this Thesis and is achieved through a generic software and hardware architecture. For communication QoS, SDN is used to define CS paths at run time. A self-adaptive QoS management following the ODA (Observe, Decide, Act) paradigm controls these QoS-driven infrastructures in an integrated way, implementing a closed loop for run-time adaptations. The QoS management is self-aware of the system and of the running applications, and can decide to adapt at the computation or communication level based on task feedback, environment monitoring, and QoS fulfillment monitoring. The self-adaptation acts reactively as well as proactively. An online application-profile learning technique is proposed to trace the behavior of the real-time (RT) tasks, enabling proactive actions. Results show that the proposed self-adaptive QoS management can restore the QoS level of the applications with a low overhead on their execution time. A broad evaluation using known benchmarks shows that, even under severe QoS disturbances at the computation and communication levels, the execution time of the applications is restored close to the optimal scenario, mitigating 99.5% of deadline misses. Keywords: System-on-Chip, Many-Core, Network-on-Chip, Quality-of-Service, Self-adaptation
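The Observe, Decide, Act loop named in the abstract above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `TaskReport` fields, the function names, the action labels, and the rule "adapt at the level that dominates the measured time" are hypothetical and are not taken from the thesis, whose actual heuristics the abstract does not state. Only the reactive path is shown; the proactive, profile-learning path is not modeled.

```python
from dataclasses import dataclass


@dataclass
class TaskReport:
    """Hypothetical per-task monitoring sample (not from the thesis)."""
    name: str
    deadline_us: int    # real-time constraint of the task
    exec_time_us: int   # measured time spent computing
    comm_time_us: int   # measured time spent waiting on the NoC


def observe(reports):
    """Observe: keep only the tasks whose total time exceeds the deadline."""
    return [r for r in reports
            if r.exec_time_us + r.comm_time_us > r.deadline_us]


def decide(report):
    """Decide: pick the adaptation level (assumed rule, see lead-in)."""
    if report.comm_time_us > report.exec_time_us:
        return "reserve_cs_path"  # communication-level adaptation
    return "migrate_task"         # computation-level adaptation


def act(report, action, log):
    """Act: a real system would reconfigure hardware; here we only log."""
    log.append((report.name, action))


def oda_step(reports):
    """Run one closed-loop iteration and return the adaptations taken."""
    log = []
    for report in observe(reports):
        act(report, decide(report), log)
    return log


if __name__ == "__main__":
    samples = [
        TaskReport("decoder", deadline_us=100, exec_time_us=40, comm_time_us=80),
        TaskReport("filter", deadline_us=100, exec_time_us=90, comm_time_us=30),
        TaskReport("sensor", deadline_us=100, exec_time_us=20, comm_time_us=10),
    ]
    print(oda_step(samples))
```

In this sketch the task that meets its deadline triggers no adaptation, which mirrors the idea that the manager intervenes only when monitoring reports a QoS violation.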
