61

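Projeto de Sistemas Integrados de Propósito Geral Baseados em Redes em Chip Expandindo as Funcionalidades dos Roteadores para Execução de Operações: A plataforma IPNoSys / Design of general-purpose integrated systems based on networks-on-chip, extending router functionality to execute operations: the IPNoSys platform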

Araújo, Sílvio Roberto Fernandes de 30 March 2012
The next generation of computers is expected to be built on architectures with multiple processors and/or multicore processors. This raises challenges related to interconnection mechanisms, operating frequency, on-chip area, power dissipation, performance and programmability. Networks-on-chip are considered the ideal interconnection and communication mechanism for this type of architecture because of their scalability, reusability and intrinsic parallelism. Communication in a network-on-chip is accomplished by transmitting packets that carry data and instructions representing requests and responses between the processing elements interconnected by the network. Packets are transmitted as in a pipeline between the routers of the network, from the source to the destination of the communication, which even allows simultaneous communications between different source-destination pairs. Building on this fact, this work proposes to transform the entire communication infrastructure of a network-on-chip, using its routing, arbitration and storage mechanisms, into a high-performance parallel processing system. In this proposal, packets are formed by instructions and data that represent the applications, and they are executed in the routers while they are transmitted, taking advantage of the pipelined and parallel transmissions. In contrast, traditional processors are not used, only simple cores that control access to memory. An implementation of this idea is the architecture called IPNoSys (Integrated Processing NoC System), which has its own programming model and a routing algorithm that guarantees the execution of all instructions in the packets, preventing deadlock, livelock and starvation. The architecture provides mechanisms for input and output, interrupts and operating system support. As a proof of concept, a programming environment and a simulator for this architecture were developed in SystemC, allowing various architecture parameters to be configured and results to be obtained for its evaluation.
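The central idea of this abstract, packets whose instructions are executed by the routers they pass through, can be illustrated with a toy sketch. It is not the IPNoSys packet format, instruction set or routing algorithm: the tuple encoding, the `("r", i)` result references and the rule that a packet keeps circulating over a fixed route until it is empty are all assumptions made for illustration.

```python
import operator
from collections import deque

# Hypothetical three-operation instruction set (not the IPNoSys ISA).
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def resolve(operand, results):
    """Return an immediate value, or a previous result for ("r", index)."""
    if isinstance(operand, tuple):
        return results[operand[1]]
    return operand


def run_packet(instructions, route):
    """Execute one instruction per router hop while the packet travels.

    Each router removes the instruction at the head of the packet,
    executes it, and forwards the shortened packet to the next router.
    The packet keeps circulating over `route` until no instruction is
    left, so every instruction is executed regardless of route length.
    """
    packet = deque(instructions)
    results = []  # results travel along with the packet
    trace = []    # (router, operation) pairs, for inspection
    hop = 0
    while packet:
        router = route[hop % len(route)]
        op, a, b = packet.popleft()
        results.append(OPS[op](resolve(a, results), resolve(b, results)))
        trace.append((router, op))
        hop += 1
    return results, trace


# (2 + 3) * 4 - 1, spread over three hops on a two-router route.
PROGRAM = [("add", 2, 3), ("mul", ("r", 0), 4), ("sub", ("r", 1), 1)]
RESULTS, TRACE = run_packet(PROGRAM, ["R0", "R1"])
```

Here the third instruction runs on `R0` again because the route is shorter than the program, a simplified stand-in for a routing rule that keeps a packet in the network until it has been fully executed.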
62

Hiérarchie mémoire dans les systèmes intégrés multiprocesseurs construits autour de réseaux sur puce / Memory hierarchy in embedded multiprocessor system built around networks on chip

Belhadj Amor, Hela 05 October 2017
Multi/many-core parallel systems that deliver high computing power at low energy cost are now a reality. However, exploiting the performance of these architectures depends on how efficiently the system manages data accesses. The aim of our work is to improve the efficiency of these accesses by exploiting the characteristics of the hardware architecture. In the first part, we propose a new organization of the cache hierarchy that maximizes the use of the storage space available at each level. This solution, based on non-uniform cache access (NUCA) architectures, supports inter-level and intra-level transfers within the hierarchy. It requires a cache coherence protocol adapted to its specifications. The transfer of data within the hierarchy is also a determinant of system performance. In the second part, we take into account the specific communication needs of the protocol. We propose a virtualized network as an ad-hoc communication medium to manage coherence traffic at lower cost. This network links the caches of a given level to support intra-level transfers, which are specific to our protocol, in order to reduce the average access latency.
63

Compilation d'applications flot de données paramétriques pour MPSoC dédiés à la radio logicielle / Compilation of Parametric Dataflow Applications for Software-Defined-Radio-Dedicated MPSoCs

Dardaillon, Mickaël 19 November 2014
The emergence of software-defined radio follows the rapid evolution of the telecommunications domain. Requirements in both performance and dynamicity have given rise to MPSoCs dedicated to software-defined radio. The specialization of these MPSoCs, however, makes them difficult to program and verify. Dataflow models of computation have been suggested as a way to mitigate this complexity. In parallel, the need for flexible yet verifiable models has led to the development of new parametric dataflow models. In this thesis, I study the compilation of parametric dataflow applications targeting software-defined-radio platforms. After a survey of the hardware and software state of the art in this field, I propose a new refinement of dataflow scheduling and outline its application to buffer size verification. Then, I introduce a new high-level format to define the dataflow graph and actors, together with the associated compilation flow. I apply these concepts to optimised code generation for the Magali software-defined-radio platform. Compiling parts of the LTE protocol makes it possible to evaluate the performance of the proposed compilation flow.
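To make the link between dataflow scheduling and buffer-size verification concrete, here is a minimal sketch for the classic static (non-parametric) synchronous dataflow case. It is not the refinement proposed in the thesis: the edge encoding `(src, dst, produced, consumed)`, the three-actor example graph and the simple sweep schedule are assumptions chosen for illustration, and the graph is assumed to be connected.

```python
from fractions import Fraction
from math import lcm  # multi-argument lcm requires Python 3.9+


def repetition_vector(actors, edges):
    """Solve the balance equations r[src] * produced == r[dst] * consumed."""
    rate = {actors[0]: Fraction(1)}
    pending = [actors[0]]
    while pending:
        a = pending.pop()
        for src, dst, prod, cons in edges:
            if src == a and dst not in rate:
                rate[dst] = rate[a] * prod / cons
                pending.append(dst)
            elif dst == a and src not in rate:
                rate[src] = rate[a] * cons / prod
                pending.append(src)
    for src, dst, prod, cons in edges:
        if rate[src] * prod != rate[dst] * cons:
            raise ValueError("inconsistent rates: no periodic schedule exists")
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(r * scale) for a, r in rate.items()}


def peak_buffers(actors, edges, reps):
    """Run one iteration with a sweep schedule and record peak tokens per edge."""
    tokens = [0] * len(edges)
    peak = [0] * len(edges)
    left = dict(reps)
    progress = True
    while progress and any(left.values()):
        progress = False
        for a in actors:
            if left[a] == 0:
                continue
            inputs_ready = all(
                tokens[i] >= cons
                for i, (_, dst, _, cons) in enumerate(edges) if dst == a
            )
            if not inputs_ready:
                continue
            for i, (src, dst, prod, cons) in enumerate(edges):
                if dst == a:
                    tokens[i] -= cons
                if src == a:
                    tokens[i] += prod
                    peak[i] = max(peak[i], tokens[i])
            left[a] -= 1
            progress = True
    if any(left.values()):
        raise RuntimeError("deadlock: the iteration could not complete")
    return peak


# Hypothetical chain A -> B -> C with rates (2, 3) and (1, 2).
ACTORS = ["A", "B", "C"]
EDGES = [("A", "B", 2, 3), ("B", "C", 1, 2)]
REPS = repetition_vector(ACTORS, EDGES)
PEAK = peak_buffers(ACTORS, EDGES, REPS)
```

The peaks depend on the schedule chosen, so checking them against the memory available on the target platform only verifies that particular schedule. A parametric model would additionally have to repeat or generalise this check over the allowed parameter values.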
64

Estimation à haut-niveau des dégradations temporelles dans les processeurs : méthodologie et mise en oeuvre logicielle / Aging and IC timing estimation at high level : methodology and simulation

Bertolini, Clément 13 December 2013
Digital circuits are expected to deliver ever higher performance, and products must be designed as quickly as possible in order to win market share. Fast design methods and the use of MPSoCs have met these requirements, but without accounting precisely for the impact of circuit aging on the design. Yet MPSoCs use the most recent manufacturing technologies and are increasingly subject to hardware failures. Today, the main failure mechanisms observed in MPSoC transistors are HCI and NBTI. Margins are therefore added so that the circuit remains functional throughout its use, considering the worst case for each mechanism. These margins are becoming larger and larger and reduce the expected performance. Future design methods therefore need to account for hardware degradation as a function of how the circuit is used. This thesis proposes an original method to simulate MPSoC aging at a high level of abstraction. The method applies during system design, that is, between the specification phase and production. An empirical model is used to estimate timing degradation (slack time) at the end of a circuit's life. A use case is presented for an embedded processor, and degradation results are reported for a set of applications. The proposed solution makes it possible to explore different configurations of an MPSoC architecture and compare their aging. The application with the most severe impact on aging can also be identified.
65

Conception et développement d'un circuit multiprocesseurs en ASIC dédié à une caméra intelligente / Design of a multiprocessor ASIC dedicated to smart camera

Boussadi, Mohamed Amine 25 February 2015
Smart sensors today require processing components with sufficient power to run algorithms at the rate of high-performance image sensors while maintaining low power consumption. Single-processor systems can no longer meet the requirements of this field. Thanks to technological advances, and building on previous work on parallel machines, multiprocessor systems-on-chip (MPSoC) represent an interesting and promising solution. Work preceding this thesis used FPGAs as the technological target. However, the results showed the limits of this target in terms of hardware resources and performance (speed in particular). This observation leads us to change the target from FPGA to ASIC, which requires deeply reworking the architecture and the IPs that existed around the existing method (called HNCP, for Homogeneous Network of Communicating Processors). To benefit from the performance offered by the ASIC target, the proposed multiprocessor systems rely on the flexibility of their architecture. Combined with parallel skeletons that ease the programmability of the architecture, the proposed circuits provide systems that support real-time execution of various classes of image processing algorithms. This work led to the fabrication of an integrated circuit based on a single processor and its peripherals in ST CMOS 65 nm technology, with an area of about 1 mm², and to the definition of two flexible multiprocessor architectures based on the concept of parallel skeletons (a 16-core architecture in ST CMOS 65 nm technology and a 64-core architecture in ST CMOS FD-SOI 28 nm technology).
66

Evaluation of Software Architectures in the Automotive Domain for Multicore Targets in regard to Architectural Estimation Decisions at Design Time

Roßbach, André Christian 05 November 2014
In this decade, emerging multicore technology will reach the automotive industry. The increasing complexity of multicore systems will make manual verification of safety and real-time constraints impossible. Dedicated methods and tools are therefore essential to deal with the upcoming multicore issues. Many research projects on new hardware platforms and software frameworks for the automotive industry are currently under way, because the paradigms of high-performance computing and the server/desktop domain cannot easily be adapted to embedded systems. One of the difficulties is estimating early whether a hardware platform is suitable for a software architecture design, yet hardly any research work tackles this. This thesis presents a procedure to evaluate the plausibility of software architecture estimations and decisions at the design stage. It includes an analysis technique for multicore systems, an underlying graph model to represent the multicore system, and an evaluation of simulation tools. This can guide the software architect in designing a multicore system with full consideration of all relevant parameters and issues.

Contents
List of Figures
List of Tables
List of Abbreviations
1. Introduction
  1.1. Motivation
  1.2. Scope
  1.3. Goal and Tasks
  1.4. Structure of the Thesis
I. Multicore Technology
2. Fundamentals
  2.1. Automotive Domains
  2.2. Embedded System
    2.2.1. Realtime
    2.2.2. Runtime Predictions
    2.2.3. Multicore Processor Architectures
  2.3. Development of Automotive Embedded Systems
    2.3.1. Applied V-Model
    2.3.2. System Description and System Implementation
  2.4. Software Architecture
  2.5. Model Description of Software Structures
    2.5.1. Design Domains of Multicore Systems
    2.5.2. Software Structure Components
3. Trend and State of the Art of Multicore Research, Technology and Market
  3.1. The Importance of Multicore Technology
  3.2. Multicore Technology for the Automotive Industry
    3.2.1. High-Performance Computing versus Embedded Systems
    3.2.2. The Trend for the Automotive Industry
    3.2.3. Examples of Multicore Hardware Platforms
  3.3. Approaches for Upcoming Multicore Problems
    3.3.1. Migration from Single-Core to Multicore
    3.3.2. Correctness-by-Construction
    3.3.3. AUTOSAR Multicore System
  3.4. Software Architecture Simulators
    3.4.1. Justification for Simulation Tools
    3.4.2. System Model Simulation Software
  3.5. Current Software Architecture Research Projects
  3.6. Portrait of the current Situation
  3.7. Summary of the Multicore Trend
II. Identification of Multicore System Parameters
4. Project Analysis to Identify Crucial Parameters
  4.1. Analysis Procedure
    4.1.1. Question Catalogue
    4.1.2. Three Domains of Investigation
  4.2. Analysed Projects
    4.2.1. Project 1: Online Camera Calibration
    4.2.2. Project 2: Power Management
    4.2.3. Project 3: Battery Management
  4.3. Results of Project Analysis
    4.3.1. Ratio of Parameter Influence
    4.3.2. General Influences of Parameters
5. Abstract System Model
  5.1. Requirements for the System-Model
  5.2. Simulation Tool Model Evaluation
    5.2.1. System Model of PRECISION PRO
    5.2.2. System Model of INCHRON
    5.2.3. System Model of SymTA/S
    5.2.4. System Model of Timing Architects
    5.2.5. System Model of AMALTHEA
  5.3. Concept of Abstract System Model
    5.3.1. Components of the System Model
    5.3.2. Software Function-Graph
    5.3.3. Hardware Architecture-Graph
    5.3.4. Specification-Graph for Mapping
6. Testcase Implementation
  6.1. Example Test-System
    6.1.1. Simulated Test-System
    6.1.2. Testcases
  6.2. Result of Tests
    6.2.1. Processor Core Runtime Execution
    6.2.2. Communication
    6.2.3. Memory Access
  6.3. Summary of Multicore System Parameters Identification
III. Evaluation of Software Architectures
7. Estimation-Procedure
  7.1. Estimation-Procedure in a Nutshell
  7.2. Steps of Estimation-Procedure
    7.2.1. Project Analysis
    7.2.2. Timing and Memory Requirements
    7.2.3. System Modelling
    7.2.4. Software Architecture Simulation
    7.2.5. Results of a Validated Software Architecture
    7.2.6. Feedback of Partly Implemented System
8. Implementation and Simulation
  8.1. Example Project Analysis – Online Camera Calibration
    8.1.1. Example Project Choice
    8.1.2. OCC Timing Requirements Analysis
  8.2. OCC Modelling
    8.2.1. OCC Software Function-Graph
    8.2.2. OCC Hardware Architecture
    8.2.3. OCC Mapping – Specification-Graph
  8.3. Simulation of the OCC Model with Tool Support
    8.3.1. Tasks for Tool Setup
    8.3.2. PRECISION PRO
    8.3.3. INCHRON
    8.3.4. SymTA/S
    8.3.5. Timing Architects
    8.3.6. AMALTHEA
  8.4. System Optimisation Possibilities
  8.5. OCC Implementation Results
9. Results of the Estimation-Procedure Evaluation
  9.1. Tool-Evaluation Results
  9.2. Findings of Estimation, Simulation and ECU-Behavior
    9.2.1. System-Specific Issues
    9.2.2. Communication Issues
    9.2.3. Memory Issues
    9.2.4. Timing Issues
  9.3. Summary of the Software Architecture Evaluation
10. Summary and Outlook
  10.1. Summary
  10.2. Usability of the Estimation-Procedure
  10.3. Outlook and Future Work
11. Bibliography
IV. Appendices
A. Appendices
  A.1. Embedded Multicore Technology Research Projects
    A.1.1. Simulation Software
    A.1.2. Multicore Software Research Projects
  A.2. Testcase Implementation Results
    A.2.1. Function Block Processor Core Executions
    A.2.2. Memory Access Mechanism
    A.2.3. Memory Access Timings of Different Datatypes
    A.2.4. Inter-Processor Communication
  A.3. Further OCC System Description
    A.3.1. OCC Timing Requirements of the FB
    A.3.2. INCHRON Validation Results
  A.4. Detailed System Optimisation
    A.4.1. Optimisation through Hardware Alternation
    A.4.2. Optimisation through Mapping Alternation
    A.4.3. Optimisation of Execution Timings
B. Estimation-Procedure Engineering Paper
  B.1. Components and Scope of Software Architecture
  B.2. Estimation-Procedure in a Nutshell
  B.3. Project Analysis
    B.3.1. System level analysis
    B.3.2. Communication Domain
    B.3.3. Processor Core Domain
    B.3.4. Memory Domain
    B.3.5. Timing and Memory Requirements
  B.4. System Modelling
    B.4.1. Function Model
    B.4.2. Function-Graph
    B.4.3. Possible ECU Target
    B.4.4. Architecture-Graph
    B.4.5. Software Architecture Mapping
    B.4.6. Domain Specific Decision Guide
  B.5. Software Architecture Simulation
  B.6. Results of a Simulated Software Architecture
  B.7. Feedback of Partly Implemented System for Software Architecture Improvement
  B.8. Benefits of the Estimation-Procedure
67

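Modèles de simulation pour la validation logicielle et l'exploration d'architectures des systèmes multiprocesseurs sur puce / Simulation models for software validation and architecture exploration of multiprocessor systems-on-chip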

Gerin, P. 30 November 2009
Current systems-on-chip rely on multiprocessor architectures (MPSoC) to meet performance and power consumption requirements. The resulting dominance of software forces validation and integration with the hardware to begin in the earliest stages of the design flow. The main contributions of this thesis are (1) a methodology for designing simulation platforms based on native execution of the software, and (2) an instrumentation technique that allows annotation of the software running on this simulation platform. The simulation platforms developed in this way can execute almost all of the final software (including the operating system) on realistic models of the system's hardware architecture. Combined with the instrumentation technique, these platforms make it possible to account precisely for physical quantities, such as time, related to software execution.
68

Gestion dynamique locale de la variabilité et de la consommation dans les architectures MPSoCs / Local dynamic management of variability and power consumption in MPSoCs architectures

Vincent, Lionel 12 December 2013 (has links)
In the development of embedded systems that combine high performance with low power consumption, achieving optimal processor energy efficiency has become a major challenge. Over recent decades, architectural solutions have become important contributors to meeting it. These solutions, which manage the trade-off between computing performance and power consumption, were first developed for single-processor circuits. They are now evolving to fit the constraints of MPSoC circuits that are increasingly complex and sensitive to manufacturing process deviations and to voltage and temperature variations. This process, voltage and temperature (PVT) variability drastically limits the energy efficiency of each processing unit in an MPSoC architecture, because pessimistic operating margins are generally applied. Significant improvements can be expected from reducing these margins by monitoring the variability of each processing unit dynamically and locally and readjusting its voltage/frequency operating point accordingly. This work is part of a low-cost architectural solution called AVFS, based on an optimization of local DVFS management techniques, which reduces design margins to improve the energy efficiency of MPSoCs while minimizing the solution's impact on silicon area and energy consumption.
A system for monitoring local, dynamic variations in voltage and temperature using a low-cost sensor is proposed. A first method jointly estimates voltage and temperature using statistical tests. A second speeds up the estimation of the voltage. Finally, a calibration method associated with the two previous methods was developed. The monitoring system was validated on a hardware platform to demonstrate that it is operational. Using the voltage and temperature estimates, policies are proposed that dynamically readjust the set points of the local voltage and frequency actuators. Finally, the additional power consumption caused by integrating the components of the AVFS architecture was evaluated and compared with the consumption reductions achievable through reduced operating margins. The results show that the AVFS solution achieves substantial power savings compared with a conventional DVFS solution.
69

Gerenciamento térmico e energético em MPSoCs / Thermal and energy management in MPSoCs

Castilhos, Guilherme Machado de 10 August 2017 (has links)
Thermal cycles and high temperatures can significantly affect system performance, power consumption and reliability, an increasingly critical design metric in emerging multiprocessor embedded systems. Existing thermal management techniques rely on physical sensors to supply the temperature values used to regulate the system's operating temperature and thermal variation at runtime. However, on-chip thermal sensors have limitations (e.g., extra power and area cost) that may restrict their use in large-scale systems. In this context, this thesis proposes a lightweight, software-based runtime temperature model that captures detailed temperature distribution information for multiprocessor systems with negligible execution-time overhead.
The temperature model is embedded in a distributed-memory MPSoC platform described at the RTL level. Results show that the average absolute temperature estimation error, compared with the HotSpot tool, is below 4% in systems with up to 36 processing elements. Task mapping, the process of assigning a processing element to execute a given task, is the mechanism chosen to act on the system using the temperature information generated by the proposed model. The number of cores in many-core systems increases the complexity of task mapping. The main concerns for task mapping in large systems are (i) scalability, (ii) dynamic workload, and (iii) reliability. Mapping decisions must be distributed across the system to ensure scalability. The workload of emerging many-core systems may be dynamic, i.e., new applications may start at any moment, leading to different mapping scenarios, so the mapping process must execute at runtime. Workload assignment plays a major role in many-core system reliability: load imbalance may create hotspot zones, with thermal consequences. Task mapping techniques that aim to improve system reliability have recently been proposed in the literature, but they rely on centralized mapping decisions, which do not scale. To address these challenges, the main goal of this thesis is to propose a hierarchical runtime mapping heuristic that provides scalability and a fair thermal distribution. An even thermal distribution increases long-term system reliability by reducing thermal variations and hotspot regions. The proposed mapping heuristic considers the temperature of each processing element (PE) as a cost function. The proposal adopts a hierarchical thermal monitoring scheme that estimates, at runtime, the instantaneous temperature of each processing element.
The mapping uses the temperature estimated by the monitoring scheme to guide the mapping decision. Results compare the proposal against a mapping heuristic whose main cost function minimizes communication energy. Results obtained in large systems show a decrease in the maximum temperature (best case, 8%) and an improvement in the thermal distribution (best case, 50% lower standard deviation of processor temperatures), demonstrating the effectiveness of the proposal. In the best case, the proposed mapping also increased the lifetime of the system by 45%.
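The idea of a hierarchical mapping decision that uses estimated temperature as its cost can be sketched as follows. This is a minimal illustration under stated assumptions, not the heuristic from the thesis: the `PE` and `Cluster` classes, the two-level choice (coolest cluster first, then coolest free processing element) and the temperature values are all hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional


@dataclass
class PE:
    """Processing element with an estimated temperature (hypothetical model)."""
    pe_id: int
    temperature: float          # estimated temperature, e.g. in kelvin
    task: Optional[str] = None  # None means the PE is free


@dataclass
class Cluster:
    """Group of PEs handled by one local manager (hypothetical structure)."""
    pes: List[PE] = field(default_factory=list)

    def free_pes(self) -> List[PE]:
        return [pe for pe in self.pes if pe.task is None]

    def mean_temperature(self) -> float:
        return mean(pe.temperature for pe in self.pes)


def map_task(task: str, clusters: List[Cluster]) -> Optional[PE]:
    """Pick the coolest cluster that has a free PE, then its coolest free PE."""
    candidates = [c for c in clusters if c.free_pes()]
    if not candidates:
        return None  # no free PE anywhere in the system
    cluster = min(candidates, key=Cluster.mean_temperature)
    pe = min(cluster.free_pes(), key=lambda p: p.temperature)
    pe.task = task
    return pe


clusters = [
    Cluster([PE(0, 330.0), PE(1, 340.0)]),
    Cluster([PE(2, 320.0), PE(3, 325.0)]),
]
chosen = map_task("t0", clusters)
print(chosen.pe_id if chosen else None)
```

Splitting the decision into a cluster-level step and a PE-level step means no single manager has to inspect every processing element, which is the scalability argument made in the abstract. The sketch does not update temperatures after a mapping; in a running system the monitoring scheme would refresh them.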
70

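Stratégies de simulation rapides et algorithme adaptatif de contrôle de la tension et de la fréquence pour les MPSoCs basse consommation / Fast simulation strategies and an adaptive voltage and frequency control algorithm for low-power MPSoCs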

Gligor, M. 09 September 2010 (has links) (PDF)
Systems-on-chip (SoCs) have seen their capabilities grow constantly, and the integration power of the technology allows both them and the applications running on them to become increasingly complex. Many of these devices run on batteries, but because battery technology has not kept pace with integration, both their software and their hardware must be energy-efficient. In this thesis we propose a software algorithm that seeks to reduce energy consumption by changing processor frequency and voltage when system utilization allows. The algorithm needs no information about the applications. To test the proposed energy-saving algorithm and determine its effectiveness, we need fast and accurate simulation platforms that support frequency changes for each processor or subsystem. The right level of abstraction for estimating energy consumption by simulation is not obvious. We first defined a high-level simulation strategy that combines the accuracy of hardware-oriented simulators with the speed of behavior-oriented simulators. When more precise estimates are needed, cycle-accurate/bit-accurate simulation must be used. To speed up such simulation, however, static scheduling strategies that are incompatible with DVFS are used. We defined two new approaches that support DVFS in this context.
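A utilization-driven frequency choice that needs no knowledge of the applications can be sketched as below. This is a generic illustration, not the adaptive algorithm of the thesis: the function `next_frequency`, the 0.8 target load and the frequency levels are assumptions made for the example.

```python
from typing import Sequence


def next_frequency(utilization: float,
                   current_mhz: float,
                   levels_mhz: Sequence[float],
                   target: float = 0.8) -> float:
    """Return the lowest level that keeps the projected load at or below target.

    utilization is the busy fraction (0.0 to 1.0) measured at current_mhz.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    demand_mhz = utilization * current_mhz  # work actually requested
    for level in sorted(levels_mhz):
        if demand_mhz <= target * level:
            return level
    return max(levels_mhz)  # saturated: run at the highest level


levels = [200.0, 400.0, 800.0]
print(next_frequency(0.1, 800.0, levels))
print(next_frequency(0.5, 800.0, levels))
```

The governor only observes how busy the processor was over the last interval, so it works for any workload. A practical implementation would lower the supply voltage together with the frequency and would add hysteresis to avoid oscillating between levels.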
