  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

A Generalized Framework for Automatic Code Partitioning and Generation in Distributed Systems

Sairaman, Viswanath 05 February 2010 (has links)
In distributed heterogeneous systems, partitioning the application software to be executed in a distributed fashion is a challenge in itself. Code partitioning for distributed processing involves dividing the code into clusters and mapping those clusters to individual processing elements interconnected through a high-speed network. Code generation is the process of converting the code partitions into individually executable code clusters and satisfying the code dependencies by adding communication primitives to send and receive data between dependent clusters. In this work, we describe a generalized framework for automatic code partitioning and code generation for distributed heterogeneous systems. A model for system-level design and synthesis using transaction-level models has also been developed and is presented. The application programs, along with the partition primitives, are converted into independently executable concrete implementations. The process consists of two steps: first, translating the primitives of the application program into equivalent code clusters, and second, scheduling the implementations of these code clusters according to their inherent data dependencies. Further, the original source code is reverse engineered in order to create a meta-data table describing the program elements and dependency trees. The data gathered is used, along with Parallel Virtual Machine (PVM) primitives, to enable communication between the partitioned programs in the distributed environment. The framework consists of profiling tools, a partitioning methodology, and architectural exploration and cost analysis tools. The partitioning algorithm is based on clustering, in which code clusters are formed to minimize communication overhead, represented as data transfers in the task graph of the code. The proposed approach has been implemented, tested on different applications, and compared with simulated-annealing- and tabu-search-based partitioning algorithms. While the proposed approach performs comparably to simulated annealing and better than tabu search in most cases in terms of communication overhead reduction, simulation results indicate that it is faster than both by an order of magnitude. The proposed framework for system-level design and synthesis provides an end-to-end rapid prototyping approach that aids architectural exploration and design optimization. The level of abstraction in the design phase can be fine-tuned using transaction-level models.
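The abstract gives the partitioning objective (cluster the task graph to minimize communication) but not the algorithm's details; as a rough, hypothetical sketch of the idea only, a greedy pass that merges clusters across the heaviest communication edges might look as follows (all names and data are made up):

```python
# Greedy clustering sketch: merge task clusters across the heaviest
# communication edges until the target cluster count is reached.
# Purely illustrative; not the thesis's actual algorithm.

def partition(edges, num_nodes, target_clusters):
    # edges: list of (src, dst, bytes_transferred) from the task graph
    cluster = list(range(num_nodes))          # node -> cluster id

    def merge(a, b):
        for n, c in enumerate(cluster):
            if c == b:
                cluster[n] = a

    while len(set(cluster)) > target_clusters:
        # Heaviest edge that still crosses two clusters.
        crossing = [(w, s, d) for s, d, w in edges
                    if cluster[s] != cluster[d]]
        if not crossing:
            break
        w, s, d = max(crossing)
        merge(cluster[s], cluster[d])         # absorb d's cluster into s's

    return cluster

# Example: 4 tasks, communication volumes in bytes.
edges = [(0, 1, 400), (1, 2, 50), (2, 3, 300), (0, 3, 10)]
print(partition(edges, 4, 2))  # -> [0, 0, 2, 2]: heavy edges kept internal
```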
12

Optimization of Heterogeneous Parallel Computing Systems using Machine Learning

Adurti, Devi Abhiseshu, Battu, Mohit January 2021 (has links)
Background: Heterogeneous parallel computing systems combine different resources, such as CPUs and GPUs, to achieve high performance along with reduced latency and energy consumption. Programming applications that target various processing units requires employing different tools and programming models/languages. Furthermore, selecting the optimal implementation, which may target different processing units (i.e., CPU or GPU) or implement different algorithms, is not trivial for a given context. In this thesis, we investigate the use of machine learning to address the problem of selecting among implementation variants for an application running on a heterogeneous system. Objectives: This study focuses on an approach for optimizing heterogeneous parallel computing systems at runtime by building the most efficient machine learning model to predict the optimal implementation variant of an application. Methods: Six machine learning models, namely KNN, XGBoost, a decision tree classifier (DTC), a random forest classifier, LightGBM, and SVM, are trained and tested using stratified k-fold cross-validation on a dataset generated from a matrix multiplication application, with square matrix input dimensions ranging from 16x16 to 10992x10992. Results: Each algorithm's findings are presented through accuracy, a confusion matrix, and a classification report covering precision, recall, and F1 score, and the models are compared in terms of accuracy, training run-time, and prediction run-time to determine the best model. Conclusions: The XGBoost, DTC, and SVM algorithms achieved 100% accuracy. In comparison to the other machine learning models, the DTC is found to be the most suitable due to the low time it requires for training and prediction when predicting the optimal implementation variant of a heterogeneous system application. Hence, the DTC is the most suitable algorithm for the optimization of heterogeneous parallel computing.
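A minimal sketch of the kind of model comparison described in the Methods section, using scikit-learn with synthetic stand-in data (the real dataset of matrix-multiplication runs is not available here; XGBoost and LightGBM are omitted for self-containment but would plug in the same way via their scikit-learn-compatible wrappers):

```python
# Compare classifiers with stratified k-fold cross-validation.
# Features and labels are synthetic placeholders for the thesis's data
# (run characteristics labeled with the fastest implementation variant).
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((500, 4))          # e.g. matrix size, tiling, ... (made up)
y = rng.integers(0, 3, 500)       # optimal variant: 0=CPU, 1=GPU, 2=hybrid

models = {
    "KNN": KNeighborsClassifier(),
    "DTC": DecisionTreeClassifier(),
    "RandomForest": RandomForestClassifier(),
    "SVM": SVC(),
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```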
13

Methods and Algorithms for Efficient Programming of FPGA-based Heterogeneous Systems for Object Detection

Kalms, Lester 14 March 2023 (has links)
Nowadays, there is high demand for computer vision applications in numerous areas, such as autonomous driving or unmanned aerial vehicles. However, the application areas and scenarios are becoming increasingly complex, and their data requirements are growing. Meeting these requirements takes increasingly powerful computing systems. FPGA-based heterogeneous systems offer an excellent solution in terms of energy efficiency, flexibility, and performance, especially in the field of computer vision. Due to complex applications and the use of FPGAs in combination with other architectures, efficient programming is becoming increasingly difficult. Thus, developers need a comprehensive framework with efficient automation, good usability, reasonable abstraction, and seamless integration of tools. It should provide an easy entry point and reduce the effort of learning new concepts, programming languages, and tools. Additionally, it needs optimized libraries so that the user can focus on developing applications without getting involved in the underlying details. These libraries should be well integrated, easy to use, and cover a wide range of possible use cases. The framework needs efficient algorithms to execute applications on heterogeneous architectures with maximum performance. These algorithms should distribute applications across various nodes with low fragmentation and communication overhead and find a near-optimal solution in a reasonable amount of time. This thesis addresses the research problem of efficiently implementing object detection applications, distributing them across FPGA-based heterogeneous systems, and providing methods for automation and integration using toolchains. Within this, the three contributions are the HiFlipVX object detection library, the DECISION framework, and the APARMAP application distribution algorithm. HiFlipVX is an open-source HLS-based FPGA library optimized for performance and resource efficiency. It contains 66 highly parameterizable computer vision functions, including neural networks, making it well suited for design space exploration. It extends the OpenVX standard for feature extraction, which is challenging because element sizes are unknown at design time. All functions are streaming capable to achieve maximum performance by increasing parallelism and reducing off-chip memory access. The library does not require external or vendor libraries, which eases project integration, device coverage, and vendor portability, as shown for Intel. It consumed on average 0.39% FFs and 0.32% LUTs for a set of image processing functions compared to a vendor library. A HiFlipVX implementation of the AKAZE feature detector computes between 3.56 and 4.13 times more pixels per second than the related work, while its resource consumption is comparable to optimized VHDL designs. Its neural network extension achieved a speedup of 3.23 for an AlexNet layer compared to a related work, while consuming 73% less on-chip memory. Furthermore, this thesis proposes an improved feature extraction implementation that achieves a repeatability of 72.57% when weighting complex cases, while the next best algorithm achieves only 62.99%. DECISION is a framework consisting of two toolchains for the efficient programming of FPGA-based heterogeneous systems. Both integrate HiFlipVX and use a joint OpenVX-based frontend to implement computer vision applications. It abstracts the underlying hardware and algorithm details while covering a wide range of architectures and applications.
The first toolchain targets x86-based systems consisting of CPUs, GPUs, and FPGAs using OpenCL (Open Computing Language). To create a heterogeneous schedule, it considers device profiles, kernel profiles and estimates, and FPGA dataflow characteristics. It manages synchronization, memory transfers, and data coherence at design time. It creates a runtime-optimized program that excels through its high parallelism and low overhead. Additionally, this thesis looks at the integration of OpenCL-based libraries, automatic OpenCL kernel generation, and OpenCL kernel optimization and comparison across different architectures. The second toolchain creates an application-specific and adaptive NoC-based architecture. The streaming-optimized architecture enables vision functions to be reused by multiple applications, improving resource efficiency while maintaining high performance. For a set of example applications, the resource consumption was more than halved, while the performance overhead was only 0.015%. APARMAP is an application distribution algorithm for partition-based and mesh-like FPGA topologies. It uses a NoC (Network-on-Chip) as communication infrastructure to connect reconfigurable regions and generates an application-specific hardware architecture. The algorithm uses load balancing techniques to find reasonable solutions within a predictable and scalable amount of time. It optimizes solutions using various heuristics, such as simulated annealing and tabu search. It uses a multithreaded grid-based approach to prevent threads from calculating the same solution and getting stuck in local minima. Its constraints and objectives are the FPGA resource utilization, NoC bandwidth consumption, NoC hop count, and the execution time of the algorithm itself. The evaluation showed that the algorithm can deal with heterogeneous and irregular host graph topologies. The algorithm showed good scalability in terms of computation time for an increasing number of nodes and partitions. It achieved an optimal placement for a set of example graphs of up to 196 nodes on host graphs of up to 49 partitions. For a real application with 271 nodes and 441 edges, it achieved a distribution with low resource fragmentation in an average time of 149 ms.
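APARMAP's heuristics are only named in the abstract (simulated annealing, tabu search), not specified; as an illustrative sketch under assumed simplifications only, a simulated-annealing pass over a node-to-partition mapping with the stated objectives (resource utilization, NoC hop count weighted by bandwidth) might look like this, where the cost model and all names are hypothetical:

```python
# Simulated annealing over a placement of application nodes onto mesh
# partitions. Hypothetical simplification of an APARMAP-style search.
import math, random

def anneal(nodes, edges, capacity, coords, steps=10000):
    # nodes: dict app-node -> resource demand
    # edges: list of (a, b, bandwidth) between app nodes
    # coords: list of (x, y) mesh positions, one per partition
    place = {n: random.randrange(len(coords)) for n in nodes}

    def cost(p):
        # Bandwidth-weighted Manhattan hop count on the mesh NoC.
        hops = sum(bw * (abs(coords[p[a]][0] - coords[p[b]][0]) +
                         abs(coords[p[a]][1] - coords[p[b]][1]))
                   for a, b, bw in edges)
        load = [0.0] * len(coords)
        for n, part in p.items():
            load[part] += nodes[n]
        overflow = sum(max(0.0, l - capacity) for l in load)
        return hops + 1000.0 * overflow    # heavy penalty for overuse

    temp = 1.0
    for _ in range(steps):
        cand = dict(place)                 # move one random node
        cand[random.choice(list(nodes))] = random.randrange(len(coords))
        delta = cost(cand) - cost(place)
        if delta < 0 or random.random() < math.exp(-delta / temp):
            place = cand
        temp *= 0.999                      # geometric cooling schedule
    return place

nodes = {"n0": 2.0, "n1": 1.5, "n2": 3.0}
edges = [("n0", "n1", 10.0), ("n1", "n2", 4.0)]
print(anneal(nodes, edges, capacity=4.0,
             coords=[(0, 0), (0, 1), (1, 0), (1, 1)]))
```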
14

Application et assurance autonomes de propriétés de sécurité dans un environnement d’informatique en nuage / Autonomic enforcement and assurance of security properties in a Cloud

Bousquet, Aline 02 December 2015 (has links)
Cloud environments are heterogeneous and dynamic, which makes them difficult to protect. In this thesis, we introduce a language and an architecture that can be used to express and enforce security properties in a Cloud. The language allows a Cloud user to express security requirements without specifying how they will be enforced. The language is based on contexts (to abstract the resources) and properties (to express the security requirements). The properties are then enforced through an autonomic architecture using existing and available security mechanisms (such as SELinux, PAM, iptables, or firewalld). This architecture abstracts and reuses the security capabilities of existing mechanisms. A security property is thus defined by a combination of capabilities and can be enforced through the collaboration of several mechanisms. The mechanisms are then automatically configured according to the user-defined properties. Moreover, the architecture offers an assurance system to detect the failure of a mechanism or an enforcement error. The architecture can thus respond to the problems it encounters, for instance by re-applying a property using different mechanisms. Lastly, the assurance system provides an evaluation of the enforcement of the properties. This thesis hence offers an autonomic architecture to enforce and assure security in heterogeneous Cloud environments.
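As a toy illustration of the capability abstraction described above (the thesis's actual language and architecture are richer; all names below are hypothetical), a property can be planned as a combination of capabilities, each backed by whichever mechanism is available:

```python
# Map abstract security capabilities to the concrete mechanisms that can
# provide them, then plan the enforcement of a property. Illustrative only.
CAPABILITIES = {
    "filter_port": ["iptables", "firewalld"],
    "confine_process": ["SELinux"],
    "authenticate_user": ["PAM"],
}

PROPERTIES = {
    # property name -> required capabilities
    "restrict_ssh_access": ["filter_port", "authenticate_user"],
    "isolate_web_server": ["confine_process", "filter_port"],
}

def plan_enforcement(prop, available):
    """Pick one available mechanism per required capability, or fail."""
    plan = {}
    for cap in PROPERTIES[prop]:
        candidates = [m for m in CAPABILITIES[cap] if m in available]
        if not candidates:
            raise RuntimeError(f"no mechanism for capability {cap!r}")
        plan[cap] = candidates[0]
    return plan

# If iptables fails at runtime, an assurance loop could re-plan with the
# remaining mechanisms, e.g. falling back to firewalld:
print(plan_enforcement("restrict_ssh_access", {"firewalld", "PAM"}))
```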
15

Principes et réalisation d'un environnement de prototypage virtuel de systèmes hétérogènes composables / Design of a virtual prototyping framework for composable heterogeneous systems

Ben Aoun, Cédric 12 July 2017 (has links)
Current and future microelectronic systems are more and more complex. In an effort to bridge the gap between the cyber world and the physical world, we observe the emergence of multi-disciplinary systems that interact more and more with their close surrounding environment. The design of such systems requires knowledge of multiple scientific disciplines, which tends to define them as heterogeneous systems. Designers of the upcoming digital-centric systems lack a common design and simulation environment able to manage all the multi-disciplinary aspects of their components of various natures, which closely interact with each other. We explore the possibilities of developing and deploying a unified SystemC-based design environment for virtual prototyping of heterogeneous systems. To overcome the challenges related to their specification and dimensioning, this environment must be able to simulate a heterogeneous system as a whole, in which each component is described and solved using the most appropriate model of computation (MoC). We propose a simulator prototype called SystemC MDVP, implemented as an extension of SystemC. It follows a correct-by-construction approach and relies on a hierarchical representation of heterogeneity and on interaction mechanisms with master-slave semantics in order to model heterogeneous systems. Generic algorithms allow for the elaboration, simulation, and monitoring of such systems. A methodology for incorporating new MoCs within SystemC MDVP is defined and then followed to add an SPH MoC, which enables the description of fluidic networks. We modeled a passive RFID reading system using several MoCs and compared the results with measurements acquired on a real physical prototype.
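As a loose conceptual sketch of the master-slave interaction semantics mentioned above (not SystemC MDVP's actual API, which is C++/SystemC), a discrete-event master can advance a continuous-time slave solver up to each synchronization point:

```python
# Toy master-slave co-simulation: a discrete-event "master" drives a
# continuous "slave" solver to each synchronization time. Illustrative only.
class ContinuousSlave:
    """Forward-Euler solver for dx/dt = -x, stepped on demand."""
    def __init__(self, x0, dt=1e-3):
        self.t, self.x, self.dt = 0.0, x0, dt

    def advance_to(self, t_sync):
        while self.t < t_sync:
            self.x += self.dt * (-self.x)   # solver-specific integration
            self.t += self.dt
        return self.x

slave = ContinuousSlave(x0=1.0)
for t_event in [0.5, 1.0, 2.0]:             # master's event timestamps
    value = slave.advance_to(t_event)       # slave resolves its own MoC
    print(f"t={t_event}: x={value:.4f}")    # master reads the synced value
```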
16

Specifying Safety-Critical Heterogeneous Systems Using Contracts Theory

Westman, Jonas January 2016 (has links)
Requirements engineering (RE) is a well-established practice that is also emphasized in safety standards such as IEC 61508 and ISO 26262. Safety standards advocate a particularly stringent RE in which requirements must be structured in a hierarchical manner in accordance with the system architecture; at each level, requirements must be allocated to heterogeneous (SW, HW, mechanical, electrical, etc.) architecture elements, and trace links must be established between requirements. In contrast to the stringent RE in safety standards, previous studies find that RE in industry is in general of poor quality. Considering a typical RE tool, other than basic impact analysis, the tool neither gives feedback nor guides a user when specifying, allocating, and structuring requirements. In practice, for industry to comply with the stringent RE in safety standards, better support for RE is needed, not only from tools, but also from principles and methods. Therefore, a foundation is presented consisting of an underlying theory for specifying heterogeneous systems and complementary principles and methods to specifically support the stringent RE in safety standards. This foundation is suitable as a base for implementing guidance- and feedback-driven tool support for such stringent RE; however, the proposed theory, principles, and methods provide essential support regardless of whether tools are used. The underlying theory is a formal compositional contracts theory for heterogeneous systems. This contracts theory embodies the essential RE property of separating requirements on a system from assumptions on its environment. Moreover, the contracts theory formalizes the stringent RE effort of structuring requirements hierarchically with respect to the system architecture. The proposed principles and methods for supporting the stringent RE in safety standards are thus well-rooted in formal concepts and conditions, and are therefore theoretically sound. The foundation is also tailored to be enforced by both existing and new tools, considering that the support is based on precise mathematical expressions that can be interpreted unambiguously by machines. Enforcing the foundation in a tool entails support that guides and gives feedback when specifying heterogeneous systems in general, and safety-critical ones in particular.
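The abstract does not reproduce the contracts theory itself; for orientation, a generic assume-guarantee formulation, standard in the contracts literature and not necessarily the thesis's exact definitions, reads:

```latex
% Generic assume-guarantee contracts (simplified, saturated form).
% A contract pairs an assumption A on the environment with a
% guarantee G on the system:
C = (A, G)
% An environment E and an implementation M satisfy C as follows:
E \models_{\mathrm{env}} C \iff E \subseteq A, \qquad
M \models C \iff M \cap A \subseteq G
% Refinement (for contracts in saturated form): C' refines C if it
% weakens the assumption and strengthens the guarantee, so any M
% satisfying C' also satisfies C:
C' \preceq C \iff A \subseteq A' \;\wedge\; G' \subseteq G
```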
17

Lastgetriebene Validierung Dienstbereitstellender Systeme / Load-Driven Validation of Service Providing Systems

Caspar, Mirko 07 January 2014 (has links) (PDF)
With the increasing complexity of heterogeneous, distributed systems, the demands on their validation also grow. This thesis presents a concept for validating a particular class of complex systems, so-called service-providing systems, through automated testing. The system functionality is tested with the help of heterogeneous clients, e.g. embedded systems. To this end, the system under test is reduced to its externally provided services, and the use of these services by clients is quantified as a load. A validation is defined by specifying time-varying loads for each service. These loads are purposefully assigned to the available clients, which generate them in the system under test. Practical application of this concept requires automating the validation process. The thesis presents the architecture of a test bench that, on the one hand, takes the heterogeneity of the clients into account and, on the other hand, compensates for the effects of client dynamics during the runtime of the validation. The algorithmic problem to be solved here, dynamic test partitioning, is defined, as is a model describing all the necessary parameters. The test partitioning can be solved in polynomial time using a purpose-built heuristic. To determine the performance of the developed method, the heuristic is subjected to extensive evaluation. The described test bench is implemented for the example of a mobile radio network under test, and key parameters are determined by simulation. The result of this work is a concept for system validation that can be applied generically to any kind of service-providing system and thus contributes to improving the development process of complex distributed systems.
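As a rough sketch of the load-assignment problem behind dynamic test partitioning (the thesis's polynomial-time heuristic is more elaborate; everything below is a hypothetical simplification), time-varying per-service target loads must be split across heterogeneous clients with limited capacity:

```python
# Greedy assignment of per-service target loads to heterogeneous clients.
# Illustrative stand-in for a dynamic test partitioning heuristic.
def assign(targets, capacity):
    # targets:  service -> requests/s to generate in this interval
    # capacity: client  -> remaining requests/s it can drive
    plan = {c: {} for c in capacity}
    for service, load in targets.items():
        # Fill the currently most capable clients first.
        for client in sorted(capacity, key=capacity.get, reverse=True):
            if load <= 0:
                break
            share = min(load, capacity[client])
            if share > 0:
                plan[client][service] = share
                capacity[client] -= share
                load -= share
        if load > 0:
            raise RuntimeError(f"clients saturated, {load} req/s unassigned")
    return plan

print(assign({"search": 120, "checkout": 40},
             {"embedded-1": 50, "pc-1": 100, "pc-2": 60}))
```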
18

Simulation de fautes pour l'évaluation du test en ligne de systèmes RFID / Test and diagnostic of RFID Systems

Fritz, Gilles 10 December 2012 (has links)
RFID (Radio Frequency Identification) systems are able to identify objects or persons without contact or direct line of sight. For this reason, their use is growing exponentially in many different fields: nuclear, avionics, railways, medical, warehouse inventories, access control… However, they are complex heterogeneous systems, consisting of analog and digital hardware components and software components: the tag, attached to the object to be identified, which contains its identifier; the reader, which is able to read the identifiers on tags; and finally the IT infrastructure that manages the data. RFID technologies are often used in critical domains or within harsh environments. But as RFID systems are based only on low-cost, low-performance equipment, they do not always ensure robust communications. All these points make the on-line testing of RFID systems a very complex task. This thesis focuses on the dependability of RFID systems: how can we be sure that the system works correctly when we need to use it? First, failures and their causes were studied using a classical method, FMEA (Failure Modes and Effects Analysis). This study identified the weak points of RFID systems. Building on this analysis, a new simulator was designed and implemented. This simulator, called SERFID (Simulation and Evaluation of RFID systems), is able to simulate various multi-device RFID systems (HF or UHF; currently implemented standards: ISO 15693 and EPC Class 1 Generation 2), from tag to reader, together with the RF channel between them and the physical layer that lets tags and readers communicate. SERFID also makes it possible to connect existing or new middleware to the simulated readers in order to evaluate new software approaches. To analyze the dependability of RFID systems, SERFID allows faults to be injected dynamically into tags, readers, or the channel, simulating various failures that can appear: degraded communication quality or tag power supply, memory errors in tags, noise… SERFID was used in particular to simulate HF and UHF RFID systems and observe their behavior in the presence of noise and disturbances in the communication between tag and reader. Finally, a new method to detect faulty or aging tags or readers in traceability applications is proposed. This non-intrusive, on-line method is based on observing the performance of the system during operation: the managing software analyzes the results of each identification round. From the read error rate per tag of an inventory, compared with previously observed read error rates per tag, the method is able to determine which group of tags is faulty. The method has been evaluated in two ways: by experiment and by simulation using SERFID. This evaluation brings out the strengths and weaknesses of the method.
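The detection method lends itself to a compact sketch: compare each tag group's current read error rate against its own history and flag outliers. The window size and threshold factor below are hypothetical parameters, not values from the thesis:

```python
# On-line detection of suspicious tag groups from read error rates.
# Sketch of the idea only; thresholds and windowing are assumptions.
from collections import deque

class TagMonitor:
    def __init__(self, window=20, factor=3.0):
        self.history = {}                  # group -> recent error rates
        self.window, self.factor = window, factor

    def report_round(self, group, reads, errors):
        rate = errors / reads
        hist = self.history.setdefault(group, deque(maxlen=self.window))
        suspicious = False
        if len(hist) >= 5:                 # need some baseline first
            baseline = sum(hist) / len(hist)
            suspicious = rate > self.factor * max(baseline, 0.01)
        hist.append(rate)
        return suspicious

mon = TagMonitor()
for _ in range(10):
    mon.report_round("pallet-A", reads=100, errors=2)      # healthy baseline
print(mon.report_round("pallet-A", reads=100, errors=30))  # True: flagged
```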
19

Scalable Applications on Heterogeneous System Architectures: A Systematic Performance Analysis Framework

Dietrich, Robert 15 November 2019 (has links)
The efficient parallel execution of scientific applications is a key challenge in high-performance computing (HPC). With growing parallelism and heterogeneity of compute resources as well as increasingly complex software, performance analysis has become an indispensable tool in the development and optimization of parallel programs. This thesis presents a framework for the systematic performance analysis of scalable, heterogeneous applications. Based on event traces, it automatically detects the critical path and inefficiencies that result in waiting or idle time, e.g. due to load imbalances between parallel execution streams. As a prerequisite for the analysis of heterogeneous programs, this thesis specifies inefficiency patterns for computation offloading. Furthermore, an essential contribution was made to the development of tool interfaces for OpenACC and OpenMP, which enable portable data acquisition and subsequent analysis for programs with offload directives. At present, these interfaces are already part of the latest OpenACC and OpenMP API specifications. The aforementioned work, existing preliminary work, and established analysis methods are combined into a generic analysis process, which can be applied across programming models. Based on the detection of wait or idle states, which can propagate over several levels of parallelism, the analysis identifies wasted computing resources and their root cause, as well as the critical-path share of each program region. Thus, it determines the influence of program regions on the load balancing between execution streams and on the program runtime. The analysis results include a summary of the detected inefficiency patterns and a program trace enhanced with information about wait states, their causes, and the critical path. In addition, a ranking based on the amount of waiting time a program region caused on the critical path highlights program regions that are relevant for program optimization. The scalability of the proposed performance analysis and its implementation is demonstrated using High-Performance Linpack (HPL), while the analysis results are validated with synthetic programs. A scientific application that uses MPI, OpenMP, and CUDA simultaneously is investigated in order to show the applicability of the analysis.
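As a toy illustration of the wait-state reasoning (real event traces and the framework's analysis are far richer; the data below are made up), the execution stream that reaches a synchronization point last lies on the critical path, while the others accumulate waiting time:

```python
# Derive per-stream waiting time at a barrier from a tiny synthetic trace.
regions = {
    # stream -> list of (region, start, end) before a barrier at t=10
    "cpu-0": [("compute", 0, 9)],
    "cpu-1": [("compute", 0, 6)],
    "gpu-0": [("kernel",  0, 4)],
}
barrier = 10
last_end = max(end for recs in regions.values() for _, _, end in recs)

for stream, recs in regions.items():
    end = max(e for _, _, e in recs)
    wait = barrier - end
    on_cp = end == last_end   # the latest arriver delays everyone else
    print(f"{stream}: wait={wait}s critical_path={on_cp}")
# Root cause of the waiting on cpu-1 and gpu-0: load imbalance vs. cpu-0.
```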
20

A New System Architecture for Heterogeneous Compute Units

Asmussen, Nils 09 August 2019 (has links)
The ongoing trend towards more heterogeneous systems forces us to rethink system design. In this work, I study a new system design that considers heterogeneous compute units (general-purpose cores with different instruction sets, DSPs, FPGAs, fixed-function accelerators, etc.) from the beginning instead of as an afterthought. The goal is to treat all compute units (CUs) as first-class citizens, enabling (1) isolation and secure communication between all types of CUs, (2) direct interaction of all CUs, removing the conventional CPU from the critical path, and (3) access to operating system (OS) services such as file systems and network stacks for all CUs. To study this system design, I am using a hardware/software co-design based on two key ideas: 1) introduce a new hardware component next to each CU that the OS uses as the CUs' common interface, and 2) let the OS kernel control applications remotely from a different CU. The hardware component is called data transfer unit (DTU) and offers the minimal set of features to reach the stated goals: secure message passing and memory access. The OS is called M³; it runs its kernel on a dedicated CU and runs the OS services and applications on the remaining CUs. The kernel is responsible for establishing DTU-based communication channels between services and applications. After a channel has been set up, services and applications communicate directly without involving the kernel. This approach makes it possible to support arbitrary CUs as the aforementioned first-class citizens, ranging from fixed-function accelerators to complex general-purpose cores.
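A conceptual sketch of this interaction model, with Python queues standing in for DTU endpoints (a mental model under assumed simplifications, not the real hardware/software interface): the kernel configures the endpoints once, after which application and service exchange messages directly:

```python
# Kernel-established channel; afterwards the CUs communicate directly.
from queue import Queue

class DTU:
    def __init__(self):
        self.endpoints = {}                 # endpoint id -> target queue

    def configure(self, ep, target_queue):  # done only by the kernel
        self.endpoints[ep] = target_queue

    def send(self, ep, msg):                # done directly by the CU
        self.endpoints[ep].put(msg)

# The "kernel" on its own CU wires up app <-> service endpoints once.
app_dtu, svc_dtu = DTU(), DTU()
svc_inbox, app_inbox = Queue(), Queue()
app_dtu.configure(ep=0, target_queue=svc_inbox)
svc_dtu.configure(ep=0, target_queue=app_inbox)

# From here on, app and file-system service talk without the kernel.
app_dtu.send(0, ("open", "/data/log.txt"))
print(svc_inbox.get())                      # service receives the request
```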
