Global ETD Search

131	Machine virtuelle universelle pour codage vidéo reconfigurable / A universal virtual machine for reconfigurable video coding Gorin, Jérôme 22 November 2011 (has links) Cette thèse propose un nouveau paradigme de représentation d’applications pour les machines virtuelles, capable d’abstraire l’architecture des systèmes informatiques. Les machines virtuelles actuelles reposent sur un modèle unique de représentation d’application qui abstrait les instructions des machines et sur un modèle d’exécution qui traduit le fonctionnement de ces instructions vers les machines cibles. S’ils sont capables de rendre les applications portables sur une vaste gamme de systèmes, ces deux modèles ne permettent pas en revanche d’exprimer la concurrence sur les instructions. Or, celle-ci est indispensable pour optimiser le traitement des applications selon les ressources disponibles de la plate-forme cible. Nous avons tout d’abord développé une représentation « universelle » d’applications pour machine virtuelle fondée sur la modélisation par graphe flux de données. Une application est ainsi modélisée par un graphe orienté dont les sommets sont des unités de calcul (les acteurs) et dont les arcs représentent le flux de données passant au travers de ces sommets. Chaque unité de calcul peut être traitée indépendamment des autres sur des ressources distinctes. La concurrence sur les instructions dans l’application est alors explicite. Exploiter ce nouveau formalisme de description d'applications nécessite de modifier les règles de programmation. A cette fin, nous avons introduit et défini le concept de « Représentation Canonique et Minimale » d’acteur. Il se fonde à la fois sur le langage de programmation orienté acteur CAL et sur les modèles d’abstraction d’instructions des machines virtuelles existantes. Notre contribution majeure qui intègre les deux nouvelles représentations proposées, est le développement d’une « Machine Virtuelle Universelle » (MVU) dont la spécificité est de gérer les mécanismes d’adaptation, d’optimisation et d’ordonnancement à partir de l’infrastructure de compilation Low-Level Virtual Machine. La pertinence de cette MVU est démontrée dans le contexte normatif du codage vidéo reconfigurable (RVC). En effet, MPEG RVC fournit des applications de référence de décodeurs conformes à la norme MPEG-4 partie 2 Simple Profile sous la forme de graphe flux de données. L’une des applications de cette thèse est la modélisation par graphe flux de données d’un décodeur conforme à la norme MPEG-4 partie 10 Constrained Baseline Profile qui est deux fois plus complexe que les applications de référence MPEG RVC. Les résultats expérimentaux montrent un gain en performance en exécution de deux pour des plates-formes dotées de deux cœurs par rapport à une exécution mono-cœur. Les optimisations développées aboutissent à un gain de 25% sur ces performances pour des temps de compilation diminués de moitié. Les travaux effectués démontrent le caractère opérationnel et universel de cette norme dont le cadre d’utilisation dépasse le domaine vidéo pour s’appliquer à d’autres domaine de traitement du signal (3D, son, photo…) / This thesis proposes a new paradigm that abstracts the architecture of computer systems for representing virtual machines’ applications. Current applications are based on abstraction of machine’s instructions and on an execution model that reflects operations of these instructions on the target machine. While these two models are efficient to make applications portable across a wide range of systems, they do not express concurrency between instructions. Expressing concurrency is yet essential to optimize processing of application as the number of processing units is increasing in computer systems. We first develop a “universal” representation of applications for virtual machines based on dataflow graph modeling. Thus, an application is modeled by a directed graph where vertices are computation units (the actors) and edges represent the flow of data between vertices. Each processing units can be treated apart independently on separate resources. Concurrency in the instructions is then made explicitly. Exploit this new description formalism of applications requires a change in programming rules. To that purpose, we introduce and define a “Minimal and Canonical Representation” of actors. It is both based on actor-oriented programming and on instructions ‘abstraction used in existing Virtual Machines. Our major contribution, which incorporates the two new representations proposed, is the development of a “Universal Virtual Machine” (UVM) for managing specific mechanisms of adaptation, optimization and scheduling based on the Low-Level Virtual Machine (LLVM) infrastructure. The relevance of the MVU is demonstrated on the MPEG Reconfigurable Video Coding standard. In fact, MPEG RVC provides decoder’s reference application compliant with the MPEG-4 part 2 Simple Profile in the form of dataflow graph. One application of this thesis is a new dataflow description of a decoder compliant with the MPEG-4 part 10 Constrained Baseline Profile, which is twice as complex as the reference MPEG RVC application. Experimental results show a gain in performance close to double on a two cores compare to a single core execution. Developed optimizations result in a gain on performance of 25% for compile times reduced by half. The work developed demonstrates the operational nature of this standard and offers a universal framework which exceeds the field of video domain (3D, sound, picture...) Machine virtuelle Modèle flux de données Traitement parallèle Codage vidéo reconfigurable MPEG LLVM Décodeur vidéo DDF, SDF, PSDF Virtual machine Dataflow model Parallel treatment Reconfigurable video coding MPEG LLVM Video decoder DDF, SDF, PSDF
132	The Global Interconnection Scheme of Silago : RTL Design and Verification / Den globala sammankopplingsväven av Silago : RTL Design och Verifiering Lou, Tong January 2023 (has links) The Silago concept introduces a hardware-centric platform that is based on coarse-grained reconfigurable fabrics and networks on chips(NoCs). With the intra-region and inter-region NoC, the Silago platform can form resource clusters to host various applications. The conventional global interconnection is implemented with a two-level NoC, which potentially results in heavyweight hardware and unpredictable behavior. Targeting optimizing the global inter-region data transfer, we propose a mathematical model that clarifies the scheduling mechanism, and present a software-defined interconnection solution that exploits the awareness of access pattern. The solution requires a executor which is expected to be a programmable lightweight transmitter. Considering that existing instruction set architectures(ISAs) lack direct support for single-cycle loop instruction, we propose a self-defined instruction set, which reduces the program size and enhances the schedulability. Based on the instruction set, we implemented the transmitter in the abstraction level of register transfer level(RTL). We also established a constraint random stimulus-based verification environment. The design is verified by regression test and synthesized. The results show that the design is functionally correct and synthesizable. Overall, the programmable transmitter helps to enable a composable interconnect scheme to connect hard IPs. / Silago-konceptet introducerar en hårdvarucentrerad plattform som är baserad på grovkorniga omkonfigurerbara tyger och nätverk på chips. Med intra-region och interregion NoC kan Silago-plattformen bilda resurskluster för att vara värd för olika applikationer. Den konventionella globala sammankopplingen är implementerad med en tvånivås NoC, vilket potentiellt resulterar i tung hårdvara och oförutsägbart beteende. Med inriktning på att optimera den globala dataöverföringen mellan regioner, föreslår vi en matematisk modell som klargör schemaläggningsmekanismen och presenterar en mjukvarudefinierad sammankopplingslösning som utnyttjar medvetenheten om åtkomstmönster. Lösningen kräver en executor som förväntas till en programmerbar lättviktssändare. Med tanke på att befintliga instruktionsuppsättningsarkitekturer (ISA) saknar direkt stöd för enkelcykelslinginstruktioner, föreslår vi en självdefinierad instruktionsuppsättning, som minskar programstorleken och förbättrar schemaläggningsbarheten. Baserat på instruktionsuppsättningen implementerade vi sändaren i abstraktionsnivån för registeröverföringsnivå (RTL). Vi etablerade också en slumpmässig stimulansbaserad verifieringsmiljö. Designen verifieras genom regressionstest och syntetiseras. Resultaten visar att designen är funktionellt korrekt och syntetiserbar. Silago global interconnection synchronous dataflow network on chip Silago global sammankoppling synkront dataflöde nätverk på chip Computer Sciences Datavetenskap (datalogi) Computer Engineering Datorteknik Elektroteknik och elektronik
133	Supporting Distributed Fault Tolerance In A Real-Time Micro-Kernel Menon, Suraj S. 04 December 2006 (has links) Research into modular approaches for constructing power electronics control systems has provided a number of benefits, as well as new opportunities. Control systems composed of an interconnected collection of standardized parts makes distributed processing a realistic possibility. Unfortunately, current strategies to supporting software on such systems have a number of critical drawbacks. Many existing approaches rely on centralized control strategies, fail to support fault tolerance in the face of failures among processing nodes or communications links, and fail to robustly support live addition or removal of nodes from a running network. In this context, failure of a single element means failure of the entire system. This thesis describes research to extend the Dataflow Architecture Real-time Kernel (DARK) to support distributed, fault-tolerant execution of control algorithms for power electronics control systems. An appropriate scheme for fault-tolerant scheduling of processes on distributed processing nodes is described, added to DARK, and evaluated. Literature indicates that fault-tolerant multiprocessor scheduling for hard real-time tasks with task precedence constraints is an NP-hard problem. The new system is based on an off-line fault-tolerant scheduling strategy that generates a static schedule of tasks for each processing unit to follow. This algorithm handles both the task precedence constraints and the constraints imposed by the underlying network protocol(DRPESNET). Modifications to the underlying daisy-chained, packet-switched, time-triggered ring network protocol to support communications fault tolerance and plug-and-play addition or removal of live nodes from an existing control system are also described. / Master of Science fault tolerance power electronics control system dual ring fault tolerant protocol power electronics power converter fault tolerance fault tolerant micro-kernel dataflow architecture real-time offline precedence constraints
134	A model-based design approach for heterogeneous NoC-based MPSoCs on FPGA Robino, Francesco January 2014 (has links) Network-on-chip (NoC) based multi-processor systems-on-chip (MPSoCs) are promising candidates for future multi-processor embedded platforms, which are expected to be composed of hundreds of heterogeneous processing elements (PEs) to potentially provide high performances. However, together with the performances, the systems complexity will increase, and new high level design techniques will be needed to efficiently model, simulate, debug and synthesize them. System-level design (SLD) is considered to be the next frontier in electronic design automation (EDA). It enables the description of embedded systems in terms of abstract functions and interconnected blocks. A promising complementary approach to SLD is the use of models of computation (MoCs) to formally describe the execution semantics of functions and blocks through a set of rules. However, also when this formalization is used, there is no clear way to synthesize system-level models into software (SW) and hardware (HW) towards a NoC-based MPSoC implementation, i.e., there is a lack of system design automation (SDA) techniques to rapidly synthesize and prototype system-level models onto heterogeneous NoC-based MPSoCs. In addition, many of the proposed solutions require large overhead in terms of SW components and memory requirements, resulting in complex and customized multi-processor platforms. In order to tackle the problem, a novel model-based SDA flow has been developed as part of the thesis. It starts from a system-level specification, where functions execute according to the synchronous MoC, and then it can rapidly prototype the system onto an FPGA configured as an heterogeneous NoC-based MPSoC. In the first part of the thesis the HeartBeat model is proposed as a model-based technique which fills the abstraction gap between the abstract system-level representation and its implementation on the multiprocessor prototype. Then details are provided to describe how this technique is automated to rapidly prototype the modeled system on a flexible platform, permitting to adjust the system specification until the designer is satisfied with the results. Finally, the proposed SDA technique is improved defining a methodology to automatically explore possible design alternatives for the modeled system to be implemented on a heterogeneous NoC-based MPSoC. The goal of the exploration is to find an implementation satisfying the designer's requirements, which can be integrated in the proposed SDA flow. Through the proposed SDA flow, the designer is relieved from implementation details and the design time of systems targeting heterogeneous NoC-based MPSoCs on FPGA is significantly reduced. In addition, it reduces possible design errors proposing a completely automated technique for fast prototyping. Compared to other SDA flows, the proposed technique targets a bare-metal solution, avoiding the use of an operating system (OS). This reduces the memory requirements on the FPGA platform comparing to related work targeting MPSoC on FPGA. At the same time, the performance (throughput) of the modeled applications can be increased when the number of processors of the target platform is increased. This is shown through a wide set of case studies implemented on FPGA. / <p>QC 20140609</p> Engineering and Technology Teknik och teknologier

Page generated in 0.0493 seconds