Global ETD Search

91	Scheduling of tasks in multiprocessor system using hybrid genetic algorithms Varghese, B., Hossain, M. Alamgir, Dahal, Keshav P. January 2007 This paper presents an investigation into the optimal scheduling of real-time tasks on a multiprocessor system using hybrid genetic algorithms (GAs). A comparative study of heuristic approaches such as "Earliest Deadline First" (EDF) and "Shortest Computation Time First" (SCTF) and a genetic algorithm is explored and demonstrated. The results of the simulation study using MATLAB are presented and discussed. Finally, the results lead to the conclusion that a genetic algorithm can be used to schedule real-time tasks so that they meet their deadlines and, in turn, obtain high processor utilization. Optimal scheduling Hard real-time tasks Multiprocessor system Heuristics Genetic algorithms
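As a rough illustration of the two heuristics named in entry 91, the sketch below orders tasks either by deadline (EDF) or by computation time (SCTF) and assigns each one to the processor that becomes free first. The `Task` class, the non-preemptive list-scheduling policy, and the sample task set are assumptions made for this example; the abstract does not describe the paper's simulation model, and the genetic algorithm itself is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    compute: int   # computation time, in hypothetical time units
    deadline: int  # absolute deadline, same units


def schedule(tasks, m, key):
    """Non-preemptive list scheduling on m processors.

    Tasks are sorted by `key`; each is placed on the processor that
    becomes free earliest. Returns the names of tasks that finish by
    their deadline, in scheduling order.
    """
    free_at = [0] * m
    met = []
    for task in sorted(tasks, key=key):
        p = min(range(m), key=lambda i: free_at[i])
        free_at[p] += task.compute
        if free_at[p] <= task.deadline:
            met.append(task.name)
    return met


def edf(tasks, m):
    """Earliest Deadline First: smallest deadline goes first."""
    return schedule(tasks, m, key=lambda t: t.deadline)


def sctf(tasks, m):
    """Shortest Computation Time First: smallest compute time goes first."""
    return schedule(tasks, m, key=lambda t: t.compute)


# Hypothetical task set, chosen only to show the two orderings diverging.
tasks = [Task("A", 4, 4), Task("B", 1, 10), Task("C", 2, 6), Task("D", 3, 5)]
print("EDF meets: ", edf(tasks, 2))
print("SCTF meets:", sctf(tasks, 2))
```

On this particular task set and two processors, EDF meets every deadline while SCTF misses task A, which has the tightest deadline but the longest computation time.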
92	Impact of Webpage Access on the Design of Single-Chip Heterogeneous Multiprocessors Somers, Marc Steven 25 May 2007 Mobile devices are currently designed much like embedded systems, where performance is derived from a specification that allows the device to interact in a periodic manner with the environment. However, as mobile devices increasingly interact with the Internet, they exhibit a different style of computing that does not fit the embedded system model. At the same time, a mobile device designer needs to consider many different issues such as the number and types of processors, scheduling strategies, applications, power consumption, and dimensions of the device, which increase the total number of design decisions at an alarming rate. This research shows that, with a more realistic model of mobile devices built on webpage-based benchmarks, customization allows specialized architectures to improve performance by up to 70 percent over a homogeneous multiprocessor composed of general-purpose processors, with a further 25 percent improvement over the next best architecture when individual user preferences are also considered. Webpage access, including user profiling for individual utilization, is clearly a significant factor in the design of mobile devices and thus should be included in future benchmarks based upon webpage content and webpage access patterns. When new evaluation techniques are developed, new design strategies can be discovered and employed. / Master of Science webpage profiling custom scheduling single chip heterogeneous multiprocessor mobile system benchmarks mobile computer architecture
93	On Best-Effort Utility Accrual Real-Time Scheduling on Multiprocessors Garyali, Piyush 09 August 2010 We consider the problem of scheduling real-time tasks on a multiprocessor system. Our primary focus is scheduling on multiprocessor systems where the total task utilization demand, U, is greater than m, the number of processors (i.e., the total available processing capacity of the system). When U > m, the system is said to be overloaded; otherwise, the system is said to be underloaded. While significant literature exists on multiprocessor real-time scheduling during underloads, little is known about scheduling during overloads, in particular in the presence of task dependencies (e.g., due to synchronization constraints). We consider real-time tasks that are subject to time/utility function (or TUF) time constraints, which allow task urgency to be expressed independently of task importance (e.g., the most urgent task being the least important). The urgency/importance decoupling allowed by TUFs is especially important during overloads, when not all tasks can be optimally completed. We consider the timeliness optimization objective of maximizing the total accrued utility and the number of deadlines satisfied during overloads, while ensuring task mutual exclusion constraints and freedom from deadlocks. This problem is NP-hard. We develop a class of polynomial-time heuristic algorithms, called the Global Utility Accrual (or GUA) class of algorithms. The algorithms construct a directed acyclic graph representation of the task dependency relationship, and build a global multiprocessor schedule of the zero in-degree tasks to heuristically maximize the total accrued utility and ensure mutual exclusion. Potential deadlocks are detected through a cycle-detection algorithm, and resolved by aborting a task in the deadlock cycle.
The GUA class of algorithms includes two algorithms, namely, the Non-Greedy Global Utility Accrual (or NG-GUA) and Greedy Global Utility Accrual (or G-GUA) algorithms. NG-GUA and G-GUA differ in the way schedules are constructed towards meeting all task deadlines, when possible to do so. We establish several properties of the algorithms including conditions under which all task deadlines are met, satisfaction of mutual exclusion constraints, and deadlock-freedom. We create a Linux-based real-time kernel called ChronOS for multiprocessors. ChronOS is extended from the PREEMPT_RT real-time Linux patch, which provides optimized interrupt service latencies and real-time locking primitives. ChronOS provides a scheduling framework for the implementation of a broad range of real-time scheduling algorithms, including utility accrual, non-utility accrual, global, and partitioned scheduling algorithms. We implement the GUA class of algorithms and their competitors in ChronOS and conduct experimental studies. The competitors include G-EDF, G-NP-EDF, G-FIFO, gMUA, P-EDF and P-DASA. Our study reveals that the GUA class of algorithms accrues higher utility and satisfies a greater number of deadlines than the deadline-based scheduling algorithms by as much as 750% and 600%, respectively. In addition, we observe that G-GUA accrues higher utility than NG-GUA during overloads by as much as 25%, while NG-GUA satisfies a greater number of deadlines than G-GUA by as much as 5% during underloads. / Master of Science Time/Utility Functions Utility Accrual Scheduling Multiprocessor Real-Time Scheduling Real-Time Linux
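Three ingredients mentioned in entry 93 can be sketched in a few lines: the overload test U > m, the selection of zero in-degree tasks from the dependency graph, and cycle detection for potential deadlocks. This is only an illustrative sketch, not the GUA algorithms themselves. The function names, the `deps` dictionary layout (each task mapped to the set of tasks it waits on), and the depth-first search are assumptions made for this example. How a task in the cycle is chosen for abortion is not stated in the abstract and is not modeled here.

```python
def is_overloaded(utilizations, m):
    """True when the total utilization demand U exceeds m processors."""
    return sum(utilizations) > m


def zero_in_degree(deps):
    """Tasks that wait on nothing and are therefore ready to schedule.

    `deps` maps each task to the set of tasks it is waiting on.
    """
    return sorted(t for t, waits in deps.items() if not waits)


def find_cycle(deps):
    """Depth-first search for a dependency cycle (a potential deadlock).

    Returns the tasks on one cycle as a list, or None if the graph is acyclic.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = {t: WHITE for t in deps}
    path = []

    def visit(t):
        color[t] = GREY
        path.append(t)
        for u in sorted(deps.get(t, ())):
            state = color.get(u, WHITE)
            if state == GREY:                  # back edge closes a cycle
                return path[path.index(u):]
            if state == WHITE:
                found = visit(u)
                if found:
                    return found
        path.pop()
        color[t] = BLACK
        return None

    for t in sorted(deps):
        if color[t] == WHITE:
            found = visit(t)
            if found:
                return found
    return None


# Hypothetical examples: a dependency chain, and a two-task deadlock.
chain = {"T1": set(), "T2": {"T1"}, "T3": {"T2"}}
deadlock = {"T1": {"T2"}, "T2": {"T1"}, "T3": set()}
print(zero_in_degree(chain), find_cycle(chain))
print(zero_in_degree(deadlock), find_cycle(deadlock))
print(is_overloaded([0.9, 0.8, 0.7], m=2))
```

In the deadlock example, T1 and T2 wait on each other, so only T3 has zero in-degree. One of the two tasks in the reported cycle would have to be aborted before the other could run.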
94	Elliptic Curve Cryptography on Heterogeneous Multicore Platform Morozov, Sergey Victorovich 15 September 2010 Elliptic curve cryptography (ECC) is becoming the algorithm of choice for digital signature generation and authentication in embedded contexts. However, performance of ECC and the underlying modular arithmetic on embedded processors remains a concern. At the same time, more complex system-on-chip platforms with multiple heterogeneous cores are commonly available in mobile phones and other embedded devices. In this work we investigate the design space for ECC on TI's OMAP 3530 platform, with a focus on utilizing the on-chip DSP core to improve the performance and efficiency of ECC point multiplication on the target platform. We examine multiple aspects of ECC and heterogeneous design such as algorithm-level choices for elliptic curve operations and the effect of interprocessor communication overhead on the design partitioning. We observe how the limitations of the platform constrain the design space of ECC. However, by closely studying the platform and efficiently partitioning the design between the general-purpose ARM core and the DSP, we demonstrate a significant speed-up of the resulting ECC implementation. Our system-focused approach allows us to accurately measure the performance and power profiles of the resulting implementation. We conclude that heterogeneous multiprocessor design can significantly improve the performance and power consumption of ECC operations, but that the integration cost and the overhead of interprocessor communication cannot be ignored in any actual system. / Master of Science Binary Field DSP ARM Cryptography Elliptic Curve Prime Field Multiprocessor Point Multiplication Multicore
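For readers unfamiliar with the point multiplication that entry 94 sets out to accelerate, the sketch below shows the textbook double-and-add method on a deliberately tiny prime-field curve. The curve y^2 = x^3 + 2x + 2 over GF(17) with base point (5, 1) is a toy example picked for illustration. It is not the curve, field, or algorithm-level choice used in the thesis, which the abstract does not specify, and it offers no security.

```python
P, A = 17, 2      # toy curve y^2 = x^3 + 2x + 2 over GF(17); illustration only
G = (5, 1)        # base point on the toy curve
INF = None        # the point at infinity (group identity)


def add(p1, p2):
    """Add two points on the curve using affine coordinates."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INF                                  # p2 is the negative of p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P   # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P          # chord slope
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def multiply(k, point):
    """Compute k * point by scanning the bits of k (double-and-add)."""
    result, addend = INF, point
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


print(multiply(2, G))    # doubling the base point
print(multiply(19, G))   # the base point has order 19 on this toy curve
```

The modular inverse uses the three-argument `pow` with exponent -1, which requires Python 3.8 or later. The loop runs once per bit of the scalar, doubling each time and adding only when the bit is set.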
95	Compilation d'applications flot de données paramétriques pour MPSoC dédiés à la radio logicielle / Compilation of Parametric Dataflow Applications for Software-Defined-Radio-Dedicated MPSoCs Dardaillon, Mickaël 19 November 2014 The emergence of software-defined radio follows the rapid evolution of the telecommunications domain. Requirements for both performance and dynamicity have given rise to MPSoCs dedicated to software-defined radio. The specialization of these MPSoCs makes them difficult to program and verify. Dataflow models of computation have been suggested as a way to mitigate this complexity.
Moreover, the need for flexible yet verifiable models has led to the development of new parametric dataflow models. In this thesis, I study the compilation of parametric dataflow applications targeting software-defined-radio platforms. After surveying the state of the art in hardware and software for this field, I propose a new refinement of dataflow scheduling, and outline its application to the verification of buffer sizes. Then, I introduce a new high-level format to define dataflow actors and graphs, with the associated compilation flow. I apply these concepts to optimised code generation for the Magali software-defined-radio platform. Compilation of parts of the LTE protocol is used to evaluate the performance of the proposed compilation flow. Telecommunications Software radio Data Flow Embedded System Multiprocessor System-On-Chip - MPSoC Data compilation LTE Protocol 621.384 028 507 2
96	Design and Multi-Technology Multi-objective Comparative Analysis of Families of MPSOC. Wang, Zhoukun 12 November 2009 Multiprocessor systems on chip (MPSOCs) have emerged strongly over the past decade in communication, multimedia, networking and other embedded domains. MPSOC has become a new paradigm for high-performance embedded application design. This thesis addresses the design and the physical implementation of a Network on Chip (NoC) based Multiprocessor System on Chip. We studied several aspects at different design stages: high-level synthesis, architecture design, FPGA implementation, application evaluation and ASIC physical implementation. We analyze the impact of these aspects on the MPSOC's final performance, power consumption and area cost. We implemented a NoC-based 16-processor embedded system on an FPGA prototype. Three NoCs provide different functionalities for the sixteen PE tiles. We also demonstrated the use of our performance monitoring system for software debugging and tuning. With the bi-synchronous FIFO method, our GALS architecture successfully solves the long clock signal distribution problem and allows each clock domain to run at its own clock frequency. We also successfully implemented the AES and TDES block cipher cryptographic algorithms on this platform, and the results show linear speedup in computation time. The network part of our architecture has been implemented in ASIC technology and has been explored with different timing constraints and different library categories of STMicroelectronics' 65nm/45nm technologies. The experimental ASIC and FPGA results are compared, and we open a discussion of the impact of technology change on parallel programming. Network-on-Chip Multiprocessor system on chip High level synthesis FPGA ASIC
97	Exploration d'architectures et allocation/affectation mémoire dans les systèmes multiprocesseurs mono puce = Architectures exploration and memory allocation/assignment in multiprocessor SoC Meftali, S. 06 September 2002 Recent years have seen major advances in integrated circuit manufacturing technology. Integrated circuits are increasingly complex: they integrate so-called software parts (processors plus programs) and dedicated or application-specific hardware parts for computation or storage. Many applications have emerged in the multimedia and telecommunications domains. They require memories of different types and sizes to be integrated into these multiprocessor architecture models. In these embedded applications, system performance is closely tied to that of the memory subsystem. Memory occupies more than 90% of the system's area, and the system's energy consumption and timing performance are essentially determined by the storage and exchange of data between the different components. Despite this growing presence of memory in single-chip systems, there is currently no systematic, optimized methodology for designing such systems with an application-specific memory architecture. In this thesis we propose a design flow for an application-specific memory architecture for single-chip systems. The memory architecture is obtained with an exact method based on an integer linear programming model. This model yields a distributed shared memory architecture that is optimal for the application, minimizing the overall cost of accesses to shared data and the cost of the memory. The transformations of the architecture and of the application code are then carried out automatically according to the chosen memory architecture. This new system specification (architecture plus application code) remains simulable. The feasibility and performance of this flow were tested on a VDSL application. [INFO:INFO_OH] Computer Science/Other high-level synthesis
98	Energy-Aware Real-Time Scheduling in Embedded Multiprocessor Systems/Ordonnancement temps réel dans les systèmes embarqués multiprocesseurs contraints par l'énergie Nélis, Vincent M.P. 18 October 2010 Nowadays, computer systems are everywhere. From simple portable devices such as watches and MP3 players to large stationary installations that control nuclear power plants, computer systems are now present in all aspects of our modern, everyday life. In only about 70 years, they have completely transformed our way of life and have reached such a high degree of sophistication that they will soon be capable of driving our cars and cleaning our houses without any human intervention. As computer systems gain in responsibilities, it becomes essential that they provide both safety and reliability. Indeed, a failure in systems such as the anti-lock braking system (ABS) in cars could threaten human lives and generate catastrophic and irreversible consequences. Hence, for many years, researchers have addressed the emerging problems of system safety and reliability that come with this lightning-fast evolution. This thesis provides a general overview of embedded real-time computer systems, i.e., a particular kind of computer system whose number grows daily. We provide the reader with some preliminary knowledge and a good understanding of the concepts that underlie this emerging technology. We focus especially on the theoretical problems related to the real-time issue and briefly summarize the main solutions, together with their advantages and drawbacks. This brings the reader through all the conceptual layers constituting a computer system, from the software level (the logical part), which specifies both the system behavior and requirements, to the hardware level (the physical part), which actually performs the expected processing and reacts to the environment.
Along the way, we introduce the theoretical models that allow researchers to carry out theoretical analyses ensuring that all the system requirements are fulfilled. Finally, we address the energy consumption problem in embedded systems. We describe the various factors of power dissipation in modern technologies and we introduce different solutions to reduce this consumption. / This thesis focuses on a very specific type of computer system called “embedded real-time systems”. A system is said to be “embedded” when it is developed to serve a very specific purpose. A mobile phone is a perfect example of an embedded system, since all of its functionality is rigorously defined before it is even designed. In contrast, a personal computer is generally not considered an embedded system, because its designers do not know in advance what it will be used for. Many of these embedded systems have very strong timing constraints, which distinguishes them even further from consumer computers. For example, when a driver brakes suddenly, the on-board computer triggers the ABS application, and it is essential that this application be processed within a short deadline. In other words, the ABS functionality must be given priority over the vehicle's other functions. This type of embedded system is then called “real-time”, because of these notions of time and of priorities between applications. The problem posed by real-time systems is the following: how can one determine, at any moment, an execution order for the different functions such that all of them are executed completely within their deadlines?
Moreover, with the recent emergence of multiprocessor systems, this problem has become considerably more complex, since the system must now determine which function executes at which moment on which processor so that all timing constraints are met. Finally, these multiprocessor embedded real-time systems quickly ran into an energy consumption problem. Their demand for performance (and therefore for energy) has grown much faster than the capacity of the batteries that power them. This problem is currently encountered by many systems, such as mobile phones. The objective of this thesis is to go through the different components of such embedded systems and to propose solutions for reducing their energy consumption. energy-aware scheduling multiprocessor scheduling embedded systems real-time scheduling
99	Scheduling Algorithms for Instruction Set Extended Symmetrical Homogeneous Multiprocessor Systems-on-Chip Montcalm, Michael R. 10 June 2011 Embedded system designers face multiple challenges in fulfilling the runtime requirements of programs. Effective scheduling of programs is required to extract as much parallelism as possible. These scheduling algorithms must also improve speedup after instruction-set extensions have been applied. Scheduling of dynamic code at run time is made more difficult when the static components of the program are scheduled inefficiently. This research aims to optimize a program’s static code at compile time. This is achieved with four algorithms designed to schedule code at the task and instruction level. Additionally, the algorithms improve scheduling using instruction set extended code on symmetrical homogeneous multiprocessor systems. Using these algorithms, we achieve speedups of up to 3.86X over sequential execution for a 4-issue 2-processor system, and show better performance than recent heuristic techniques for small programs. Finally, the algorithms generate speedup values for a 64-point FFT that are similar to the test runs. Scheduling ILP System on Chip SoC Instruction level parallelism Integer Linear Program Custom Instruction Instruction Set Extension Multiprocessor
100	An Interconnection Network for a Cache Coherent System on FPGAs Mirian, Vincent 12 January 2011 Field-Programmable Gate Array (FPGA) systems now comprise many processing elements, both processors running software and hardware engines used to accelerate specific functions. To make the programming of such a system simpler, it is easiest to think of a shared-memory environment, much like in current multi-core processor systems. This thesis introduces a novel, shared-memory, cache-coherent infrastructure for heterogeneous systems implemented on FPGAs that can then form the basis of a shared-memory programming model for heterogeneous systems. Simulation results show that the cache-coherent infrastructure outperforms the infrastructure of Woods [1] with a speedup of 1.10. The thesis explores the various configurations of the cache interconnection network and the benefit of cache-to-cache transfer of cache line data, together with its impact on main memory access. Finally, the thesis shows that the cache-coherent infrastructure incurs very little overhead when using its cache coherence implementation. FPGA Cache Coherence Interconnection Network Protocols Multiprocessor multi-thread Pthread (Posix Thread) programming model parallel programming shared-memory model 0544

Search results