61

Design space exploration using HLS in relation to code structuring / Utforskning av design space med HLS i förhållande till kodstrukturering

Das, Debraj January 2022 (has links)
High Level Synthesis (HLS) is a methodology to translate a model developed in a high abstraction layer, e.g. C/C++/SystemC, that describes the algorithm into a Register-Transfer Level (RTL) description such as Verilog or VHDL. The resulting RTL description is subject to multiple user-controlled directives and an internal design space exploration algorithm specific to the toolchain used. HLS allows designers to focus on the behaviour of the design at a higher abstraction than the behavioural modelling available within a Hardware Description Language (HDL), as the compiler decides the movement of data and the timing in the resulting design. Ericsson uses a legacy Advanced Peripheral Bus (APB)-like interface called the Memory/Register Interface (MIRI) for data movement in a subsystem of one of their Application-Specific Integrated Circuits (ASICs). The thesis attempts to upgrade the protocol to the more performant ARM Advanced Microcontroller Bus Architecture (AMBA) protocols, Advanced High-performance Bus (AHB) or Advanced eXtensible Interface (AXI). SystemC provides a host of functionalities to define the complete behaviour of a circuit at a high level of abstraction. This thesis will explore the effect of structuring SystemC models on their synthesis, perform design space exploration to identify the best design methodology to adopt when designing a SystemC model, and compare the models based on final synthesis metrics such as area, timing, and register count. The toolchain for the thesis will be the Stratus HLS compiler developed by Cadence. Stratus supports all synthesizable constructs of SystemC. Most HLS research focuses on improving the design space exploration algorithms used internally in HLS tools. However, designers can use algorithm structuring to provide the HLS engine with a better starting point. In this thesis, the Stratus toolchain will be used to experiment with different models with equivalent behaviour and performance, and thereafter to extract which constructs used in the models are optimal for allowing the internal design space exploration algorithm to perform as well as possible. / HLS är en metod för att översätta en modell utvecklad på hög abstraktionsnivå t.ex. C/C++/SystemC som beskriver algoritmen på registeröverföringsnivå (RTL) som Verilog eller VHDL. Den resulterande RTL-beskrivningen utsätts för flera användarkontrollerade direktiv och en intern Design Space Exploration (DSE) algoritm, vilken är specifik för den verktygskedja som används. Detta gör det möjligt för en designer att fokusera på konstruktion beteende på en högre abstraktionsnivå jämfört med den beteendemodellering som finns tillgänglig inom det hårdvarubeskrivande språket (HDL:en) när kompilatorn bestämmer tidpunkten för utbytet av data i den resulterande designen. Ericsson använder ett äldre gränssnitt för Advanced Peripheral Bus (APB) som kallas Memory/Register Interface (MIRI), vilket är ett gränssnitt för utbyte av data i ett delsystem i en av deras Application-Specific Integrated Circuit (ASIC:ar). Avhandlingen försöker uppgradera protokollet till ett av de det mer högpresterande ARM Advanced Microcontroller Bus Architecture – protokollen Advanced High-Performance Bus (AHB) eller Advanced eXtensible Interface (AXI). SystemC tillhandahåller en mängd funktioner för att definiera kretsens fullständiga beteende vid en hög abstraktionsnivå.
Denna avhandling utforskar effekten av strukturerade SystemC-modeller och deras syntesresultat samt konstruktionsrymden, för att förstå den bästa designmetodiken i ett SystemC-modelleringsdesignflöde och jämföra modellerna baserade på de slutliga syntesmätvärdena som storlek, timing, etc. Verktygskedjan för avhandlingen kommer att vara Stratus HLS -kompilatorn som utvecklats av Cadence. Stratus stöder alla syntetiserbara konstruktioner av SystemC. HLS-forskningen fokuserar främst på att förbättra Design Space Exploration, dvs de algoritmer som används internt i HLS-verktygen för att komma fram till lösningar. För att ge HLS -motorerna en bättre utgångspunkt. I denna avhandling kommer Stratus att användas för att utvärdera olika modeller med ekvivalent beteende och nästan samma prestanda efter Syntes, för att komma fram till vilka konstruktioner är optimala för att den interna DSE-algoritmen skall fungera bäst.
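To illustrate what "structuring a SystemC model" means in practice, the sketch below shows a small synthesizable clocked-thread module of the kind a tool such as Stratus consumes. The module name, port widths and the rolled four-tap loop are invented for illustration and are not taken from the thesis; rewriting the loop (for example, fully unrolling it) keeps the behaviour identical but gives the internal design space exploration a different starting point, typically trading multipliers for cycles.

    #include <systemc.h>
    #include <iostream>

    // Hypothetical example: a 4-tap dot product written as a rolled loop.
    // An HLS tool may keep the loop rolled (one shared multiplier, more cycles)
    // or unroll it (four multipliers, fewer cycles) depending on directives,
    // so restructuring this loop changes the RTL that gets generated.
    SC_MODULE(Dot4) {
        sc_in<bool>          clk, rst;
        sc_in<sc_uint<16> >  a0, a1, a2, a3;
        sc_out<sc_uint<34> > y;

        void run() {
            y.write(0);
            wait();                              // leave reset
            while (true) {
                sc_uint<16> a[4] = { a0.read(), a1.read(), a2.read(), a3.read() };
                const sc_uint<16> coef[4] = { 3, 5, 7, 11 };
                sc_uint<34> acc = 0;
                for (int i = 0; i < 4; ++i)      // rolled loop: candidate for unrolling
                    acc += a[i] * coef[i];
                y.write(acc);
                wait();                          // cycle boundary seen by the scheduler
            }
        }

        SC_CTOR(Dot4) {
            SC_CTHREAD(run, clk.pos());
            reset_signal_is(rst, true);
        }
    };

    int sc_main(int, char*[]) {
        sc_clock clk("clk", 10, SC_NS);
        sc_signal<bool> rst;
        sc_signal<sc_uint<16> > a0, a1, a2, a3;
        sc_signal<sc_uint<34> > y;
        Dot4 dut("dut");
        dut.clk(clk); dut.rst(rst);
        dut.a0(a0); dut.a1(a1); dut.a2(a2); dut.a3(a3); dut.y(y);
        rst = true;  sc_start(20, SC_NS);        // hold reset for two cycles
        rst = false; a0 = 1; a1 = 2; a2 = 3; a3 = 4;
        sc_start(50, SC_NS);
        std::cout << "y = " << y.read() << std::endl;
        return 0;
    }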
62

High Level Power Estimation and Reduction Techniques for Power Aware Hardware Design

Ahuja, Sumit 14 June 2010 (has links)
The unabated continuation of Moore's law has allowed the number of transistors per unit area of a silicon die to double every two years or so. At the same time, an increasing demand on consumer electronics and computing equipment to run sophisticated applications has led to an unprecedented complexity of hardware designs. These factors have necessitated raising the abstraction level of design entry for hardware systems beyond the Register-Transfer Level (RTL) to the Electronic System Level (ESL). However, the power envelope imposed on designs by packaging and other thermal limitations, and the energy envelope imposed by battery-lifetime considerations, have also created a need for power/energy-efficient design. The confluence of these two technological issues has created an urgent need to solve two problems: (i) How do we enable a power-aware design flow with a design entry point at the Electronic System Level? (ii) How do we enable power-aware High Level Synthesis to automatically synthesize an RTL implementation from the ESL? This dissertation distinguishes itself by addressing the following two issues: (i) Since the power/energy consumption of electronic systems largely depends on implementation details, and high-level models abstract away such details, power/energy estimation at these levels has not been addressed thoroughly. (ii) A lot of work has been done in applying various techniques to control-data-flow graphs (CDFGs) to find power/area/latency Pareto points during behavioral synthesis. However, high-level C-based functional models of various compute-intensive components, which could easily be synthesized as co-processors, offer many opportunities to reduce power. Some of these saving opportunities are traditional, such as clock-gating and operand isolation. Exploring alternate granularities of these techniques with target applications in mind opens the door to traditional power-reduction opportunities at the high level. This work therefore concentrates on the aforementioned two areas of inadequacy of hardware design methodologies. Our proposed solutions include utilizing ESL simulation traces and mapping them to lower abstraction levels for power estimation, and deriving statistical power models using regression-based learning for power estimation at early design stages. On the HLS front, techniques are proposed that insert power-saving features during the synthesis process by exploring the granularity and scope of clock-gating and sequential clock-gating. Finally, this work shows how to marry the two domains, estimation and reduction. In this regard, a power model is proposed that helps predict the power savings obtained from clock-gating and further guides HLS to insert clock-gating selectively. / Ph. D.
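As a rough illustration of one of the ideas above, here is a minimal sketch of a regression-based power model: fit a linear model from a few characterisation points relating switching activity (as could be extracted from ESL simulation traces) to measured power, then reuse the fitted line for early estimates. The single-feature model and all data points are invented; the dissertation's actual models are not reproduced here.

    #include <cstdio>
    #include <vector>

    // Fit power = a * activity + b by ordinary least squares, then predict the
    // power of a new trace. Sample data are made up for illustration.
    int main() {
        std::vector<double> activity = { 0.10, 0.25, 0.40, 0.60, 0.80 };  // toggle rates
        std::vector<double> power_mw = { 1.9, 3.1, 4.3, 5.8, 7.4 };       // measured power (mW)

        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        const double n = activity.size();
        for (size_t i = 0; i < activity.size(); ++i) {
            sx  += activity[i];
            sy  += power_mw[i];
            sxx += activity[i] * activity[i];
            sxy += activity[i] * power_mw[i];
        }
        double a = (n * sxy - sx * sy) / (n * sxx - sx * sx);  // dynamic-power slope
        double b = (sy - a * sx) / n;                          // static/leakage offset

        double new_activity = 0.5;                             // activity from a new ESL trace
        std::printf("model: P = %.3f * act + %.3f mW\n", a, b);
        std::printf("predicted power at activity %.2f: %.3f mW\n",
                    new_activity, a * new_activity + b);
        return 0;
    }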
63

FPGA Reservoir Computing Networks for Dynamic Spectrum Sensing

Shears, Osaze Yahya 14 June 2022 (has links)
The rise of 5G and beyond systems has fuelled research in merging machine learning with wireless communications to achieve cognitive radios. However, the portability and limited power supply of radio frequency devices limit engineers' ability to combine them with powerful predictive models. This hinders the ability to support advanced 5G applications such as device-to-device (D2D) communication and dynamic spectrum sharing (DSS). This challenge has inspired a wave of research in energy-efficient machine learning hardware with low computational and area overhead. In particular, hardware implementations of the delayed feedback reservoir (DFR) model show promising results for meeting these constraints while achieving high accuracy in cognitive radio applications. This thesis answers two research questions surrounding the applicability of FPGA DFR systems for DSS. First, can a DFR network implemented on an FPGA run faster and with lower power than a purely software approach? Second, can the system be implemented efficiently on an edge device running at less than 10 watts? Two systems are proposed that prove FPGA DFRs can achieve these feats: a mixed-signal circuit, followed by a high-level synthesis circuit. The implementations execute up to 58 times faster and operate at more than 90% lower power than the software models. Furthermore, the lowest recorded average power of 0.130 watts proves that these approaches meet typical edge device constraints. When validated on the NARMA10 benchmark, the systems achieve a normalized error of 0.21 compared to state-of-the-art error values of 0.15. In a DSS task, the systems are able to predict spectrum occupancy with up to 0.87 AUC in high-noise, multiple-input, multiple-output (MIMO) antenna configurations, compared to 0.99 AUC in other works. At the end of this thesis, the trade-offs between the approaches are analyzed, and future directions for advancing this study are proposed. / Master of Science / The rise of 5G and beyond systems has fuelled research in merging machine learning with wireless communications to achieve cognitive radios. However, the portability and limited power supply of radio frequency devices limit engineers' ability to combine them with powerful predictive models. This hinders the ability to support advanced 5G and internet-of-things (IoT) applications. This challenge has inspired a wave of research in energy-efficient machine learning hardware with low computational and area overhead. In particular, hardware implementations of a low-complexity neural network model, called the delayed feedback reservoir, show promising results for meeting these constraints while achieving high accuracy in cognitive radio applications. This thesis answers two research questions surrounding the applicability of field-programmable gate array (FPGA) delayed feedback reservoir systems for wireless communication applications. First, can this network implemented on an FPGA run faster and with lower power than a purely software approach? Second, can the network be implemented efficiently on an edge device running at less than 10 watts? Two systems are proposed that prove the FPGA networks can achieve these feats. The systems demonstrate lower power consumption and latency than the software models. Additionally, the systems maintain high accuracy on traditional neural network benchmarks and wireless communications tasks. The second implementation is further demonstrated in a software-defined radio architecture.
At the end of this thesis, the trade-offs between the approaches are analyzed, and future directions for advancing this study are proposed.
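For readers unfamiliar with the delayed feedback reservoir, the following is a simplified, discrete-time software sketch of the general idea: one nonlinear node plus a delay line of virtual nodes, driven by a masked input. The mask, the scaling parameters eta and gamma, the tanh nonlinearity and the toy input are common choices in the DFR literature and are not taken from this thesis or its hardware implementations.

    #include <cmath>
    #include <cstdio>
    #include <vector>

    // Simplified delayed feedback reservoir: each "virtual node" combines the
    // masked input sample with its own value from one full trip around the
    // delay loop, passed through a nonlinearity. A trained linear readout
    // (not shown) would map the state vector to the prediction target.
    int main() {
        const int N = 50;                         // virtual nodes along the delay line
        const double eta = 0.5, gamma = 0.8;      // input and feedback scaling
        std::vector<double> mask(N), state(N, 0.0);
        for (int i = 0; i < N; ++i)
            mask[i] = (i % 2 == 0) ? 1.0 : -1.0;  // simple binary input mask

        for (int t = 0; t < 5; ++t) {
            double u = 0.1 * t;                   // placeholder input sample
            for (int i = 0; i < N; ++i) {
                double fed_back = state[i];       // value delayed by one loop period
                state[i] = std::tanh(eta * mask[i] * u + gamma * fed_back);
            }
            std::printf("t=%d  state[0]=%+.4f  state[N-1]=%+.4f\n",
                        t, state[0], state[N - 1]);
        }
        return 0;
    }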
64

Modélisation, exploration et estimation de la consommation pour les architectures hétérogènes reconfigurables dynamiquement / Model, exploration and estimation of consumption in dynamically reconfigurable heterogeneous architectures

Bonamy, Robin 12 July 2013 (has links)
L'utilisation des accélérateurs reconfigurables, pour la conception de system-on-chip hétérogènes, offre des possibilités intéressantes d'augmentation des performances et de réduction de la consommation d'énergie. En effet, ces accélérateurs sont couramment utilisés en complément d'un (ou de plusieurs) processeur(s) pour permettre de décharger celui-ci (ceux-ci) des calculs intensifs et des traitements de flots de données. Le concept de reconfiguration dynamique, supporté par certains constructeurs de FPGA, permet d'envisager des systèmes beaucoup plus flexibles en offrant notamment la possibilité de séquencer temporellement l'exécution de blocs de calcul sur la même surface de silicium, réduisant alors les besoins en ressources d'exécution. Cependant, la reconfiguration dynamique n'est pas sans impact sur les performances globales du système et il est difficile d'estimer la répercussion des décisions de configuration sur la consommation d'énergie. L'objectif principal de cette thèse consiste à proposer une méthodologie d'exploration permettant d'évaluer l'impact des choix d'implémentation des différentes tâches d'une application sur un system-on-chip contenant une ressource reconfigurable dynamiquement, en vue d'optimiser la consommation d'énergie ou le temps d'exécution. Pour cela, nous avons établi des modèles de consommation des composants reconfigurables, en particulier les FPGAs, qui permettent d'aider le concepteur dans son design. À l'aide d'une méthodologie de mesure sur Virtex-5, nous montrons dans un premier temps qu'il est possible de générer des accélérateurs matériels de tailles variées ayant des performances temporelles et énergétiques diverses. Puis, afin de quantifier les coûts d'implémentation de ces accélérateurs, nous construisons trois modèles de consommation de la reconfiguration dynamique partielle. Finalement, à partir des modèles définis et des accélérateurs produits, nous développons un algorithme d'exploration des solutions d'implémentation pour un système complet. En s'appuyant sur une plate-forme de modélisation à haut niveau, celui-ci analyse les coûts d'implémentation des tâches et leur exécution sur les différentes ressources disponibles (processeur ou région configurable). Les solutions offrant les meilleures performances en fonction des contraintes de conception sont retenues pour être exploitées. / The use of reconfigurable accelerators when designing heterogeneous systems-on-chip has the potential to increase performance and reduce energy consumption. Indeed, these accelerators are commonly used as an adjunct to one (or more) processor(s) to offload intensive computations and data-flow processing. The concept of dynamic reconfiguration, supported by some FPGA vendors, makes it possible to consider much more flexible systems, including the ability to sequence the execution of accelerators on the same silicon area while reducing resource requirements. However, dynamic reconfiguration may impact overall system performance, and it is hard to estimate the impact of configuration decisions on energy consumption. The main objective of this thesis is to provide an exploration methodology to assess the impact of the implementation choices of the tasks of an application on a system-on-chip containing a dynamically reconfigurable resource, in order to optimize energy consumption or processing time. Therefore, we have established consumption models of reconfigurable components, particularly FPGAs, which assist the designer.
Using a measurement methodology on a Virtex-5, we first show that it is possible to generate hardware accelerators of various sizes, execution times and energy consumptions. Then, in order to quantify the implementation costs of these accelerators, we build three power models of dynamic partial reconfiguration. Finally, from these models, we develop an algorithm for exploring the implementation and allocation possibilities for a complete system. Based on a high-level modelling platform, the implementation costs of the tasks and their execution on the various available resources (CPU or reconfigurable region) are analyzed. The solutions with the best characteristics with respect to the design constraints are extracted.
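The following toy sketch illustrates the kind of per-task cost comparison such an exploration performs: a software mapping on the processor versus a hardware mapping in a reconfigurable region, with the partial-reconfiguration overhead charged to the latter. All numbers are invented and do not come from the thesis models.

    #include <cstdio>

    // Compare two candidate mappings of one task: CPU execution versus a
    // hardware accelerator that must first be configured into an FPGA region.
    struct Mapping {
        double exec_time_ms;
        double exec_energy_mj;
        double reconf_time_ms;    // 0 for the CPU mapping
        double reconf_energy_mj;  // 0 for the CPU mapping
    };

    int main() {
        Mapping cpu  = { 40.0, 12.0, 0.0, 0.0 };
        Mapping fpga = {  5.0,  2.5, 3.0, 1.2 };   // fast, but pays for reconfiguration

        double cpu_e  = cpu.exec_energy_mj;
        double fpga_e = fpga.exec_energy_mj + fpga.reconf_energy_mj;
        double cpu_t  = cpu.exec_time_ms;
        double fpga_t = fpga.exec_time_ms + fpga.reconf_time_ms;

        std::printf("CPU : %.1f ms, %.1f mJ\n", cpu_t, cpu_e);
        std::printf("FPGA: %.1f ms, %.1f mJ (including reconfiguration)\n", fpga_t, fpga_e);
        std::printf("chosen for energy: %s\n", fpga_e < cpu_e ? "FPGA region" : "CPU");
        return 0;
    }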
65

Fast Code Exploration for Pipeline Processing in FPGA Accelerators / Exploração Rápida de Códigos para Processamento Pipeline em Aceleradores FPGA

Rosa, Leandro de Souza 31 May 2019 (has links)
The increasing demand for energy-efficient computing has endorsed the usage of Field-Programmable Gate Arrays to create hardware accelerators for large and complex codes. However, implementing such accelerators involves two complex decisions. The first lies in deciding which code snippet is the best candidate for an accelerator, and the second lies in how to implement that accelerator. When both decisions are considered concomitantly, the problem becomes more complicated, since the code snippet implementation affects the code snippet choice, creating a combined design space to be explored. As such, a fast design space exploration for the accelerator implementation is crucial to allow the exploration of different code snippets. However, such design space exploration suffers from several time-consuming tasks during the compilation and evaluation steps, making it an unviable option for snippet exploration. In this work, we focus on the efficient implementation of pipelined hardware accelerators and present our contributions on speeding up pipeline creation and its design space exploration. Towards loop pipelining, the proposed approaches achieve up to 100× speed-up compared to state-of-the-art methods, saving 164 hours in a full design space exploration with less than 1% impact on the quality of the final results. Towards design space exploration, the proposed methods achieve up to 9.5× speed-up, with less than 1% impact on the quality of the results. / A demanda crescente por computação energeticamente eficiente tem endossado o uso de Field-Programmable Gate Arrays para a criação de aceleradores de hardware para códigos grandes e complexos. Entretanto, a implementação de tais aceleradores envolve duas decisões complexas. O primeiro reside em decidir qual trecho de código é o melhor para se criar o acelerador, e o segundo reside em como implementar tal acelerador. Quando ambas decisões são consideradas concomitantemente, o problema se torna ainda mais complicado dado que a implementação do trecho de código afeta a seleção dos trechos de código, criando um espaço de projeto combinatorial a ser explorado. Dessa forma, uma exploração do espaço de projeto rápida para a implementação de aceleradores é crucial para habilitar a exploração de diferentes trechos de código. Contudo, tal exploração do espaço de projeto é impedida por várias tarefas que consumem tempo durante os passos de compilação a análise, o que faz da exploração de trechos de códigos inviável. Neste trabalho, focamos na implementação eficiente de aceleradores pipeline em hardware e apresentamos nossas contribuições para o aceleramento da criações de pipelines e de sua exploração do espaço de projeto. Referente à criação de pipelines, as abordagens propostas alcançam uma aceleração de até 100× quando comparadas às abordagens do estado-da-arte, levando à economia de 164 horas em uma exploração de espaço de projeto completa com menos de 1% de impacto na qualidade dos resultados. Referente à exploração do espaço de projeto, as abordagens propostas alcançam uma aceleração de até 9,5×, mantendo menos de 1% de impacto na qualidade dos resultados.
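As background for the loop-pipelining contributions, the sketch below computes the two classic lower bounds on a pipeline's initiation interval (the resource-constrained bound and the recurrence-constrained bound) that pipelining algorithms reason about before scheduling. The operation counts and latencies are invented; the thesis' own algorithms are not reproduced here.

    #include <algorithm>
    #include <cstdio>

    // Minimum initiation interval (MII) from two lower bounds:
    //  - ResMII: operations of one type competing for a limited number of units,
    //  - RecMII: a loop-carried dependence cycle (latency over iteration distance).
    int main() {
        int mults_per_iter = 7, mult_units = 2;
        int res_mii = (mults_per_iter + mult_units - 1) / mult_units;        // ceil(7/2) = 4

        int cycle_latency = 6, cycle_distance = 2;
        int rec_mii = (cycle_latency + cycle_distance - 1) / cycle_distance; // ceil(6/2) = 3

        int mii = std::max(res_mii, rec_mii);
        std::printf("ResMII = %d, RecMII = %d, scheduling starts at II = %d\n",
                    res_mii, rec_mii, mii);
        return 0;
    }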
66

High-Level Synthesis Framework for Crosstalk Minimization in VLSI ASICs

Sankaran, Hariharan 31 October 2008 (has links)
Capacitive crosstalk noise can affect the delay of a switching signal or induce a glitch on a static signal, causing timing violations or chip failure. Crosstalk noise depends on coupling parasitics, driver strength, signal timing characteristics, and signal transition patterns. Layout-level crosstalk analysis techniques are generally pessimistic and computationally expensive for large designs due to the lack of design flexibility at the lower levels of the design hierarchy. Architectural decisions such as the type of interconnect architecture, the number of storage and execution units, the network of communicating units, the data bus width, etc., have a major impact on the quality of design attributes such as area, speed, power, and noise. To address all these concerns, we propose a high-level synthesis framework to optimize for worst-case crosstalk patterns on coupled nets, a floorplan-driven high-level synthesis framework to minimize coupling capacitance, and an on-chip technique to dynamically detect and eliminate worst-case crosstalk patterns in bus-based macro-cell designs. Due to the Miller coupling effect, the switching activity pattern on adjacent nets may increase the effective capacitance seen by a victim net and thereby cause a worst-case signal delay on the victim net. However, signal activity patterns on coupled nets depend on data correlations, which in turn depend on resource sharing. Resource sharing in turn depends on scheduling, allocation, and binding during the high-level synthesis flow. Therefore, we propose a Simulated Annealing (SA) based design space exploration of the HLS design subspace, bus line re-ordering, and encoding subspaces to optimize for worst-case crosstalk patterns in bus-based macro-cell designs. We demonstrate that the proposed framework aids layout-level techniques in eliminating false positive violations. We also propose an SA-based algorithm to explore the floorplan and HLS subspaces to optimize coupling capacitances in bus-based macro-cell designs. We have integrated an RTL floorplanner into the HLS flow to estimate coupling capacitances between bus lines. Crosstalk analysis using Cadence Celtic shows that the designs generated by the proposed framework result in fewer crosstalk violations than designs generated through a traditional ASIC design flow. We also propose an on-chip crosstalk detection and elimination technique that dynamically detects and eliminates worst-case crosstalk patterns with a minimal area penalty compared to other layout-level techniques reported in the literature.
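To make the notion of a worst-case crosstalk pattern concrete, here is a small sketch that scans one bus transition and flags victim bits whose two neighbours switch in the opposite direction, the case in which the Miller effect maximizes the effective coupling capacitance. The bus values are invented; the thesis' frameworks operate on full designs and signal statistics, not on such toy traces.

    #include <cstdio>
    #include <vector>

    // Flag worst-case crosstalk victims on a bus: a bit that switches while both
    // neighbours switch in the opposite direction across one clock transition.
    int main() {
        std::vector<int> prev = { 0, 1, 0, 1, 0 };   // bus state at cycle t
        std::vector<int> next = { 1, 0, 1, 0, 1 };   // bus state at cycle t+1

        int worst = 0;
        for (size_t i = 1; i + 1 < prev.size(); ++i) {
            int dv = next[i]     - prev[i];          // victim transition: -1, 0, +1
            int dl = next[i - 1] - prev[i - 1];      // left aggressor
            int dr = next[i + 1] - prev[i + 1];      // right aggressor
            if (dv != 0 && dl == -dv && dr == -dv) { // both neighbours switch opposite
                ++worst;
                std::printf("worst-case pattern on bit %zu\n", i);
            }
        }
        std::printf("%d worst-case victim transitions in this cycle\n", worst);
        return 0;
    }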
67

Parallel Hardware- and Software Threads in a Dynamically Reconfigurable System on a Programmable Chip

Rößler, Marko 06 December 2013 (has links) (PDF)
Today’s embedded systems depend on the availability of hybrid platforms that contain heterogeneous computing resources such as programmable processor units (CPUs or DSPs) and highly specialized hardware cores. These platforms have been scaled down to integrated embedded systems-on-chip. Modern platform FPGAs enhance such systems with the flexibility of runtime-configurable silicon. One of the major advantages that arises is the ability to use hardware (HW) and software (SW) resources in a time-shared manner. Thus, the ability to dynamically assign computing resources based on decisions taken at runtime is given.
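A minimal sketch of the runtime decision alluded to above, assuming a fixed number of reconfigurable hardware slots and a simple first-come policy; the task names, slot count and dispatch policy are invented for illustration and are not the thesis' mechanism.

    #include <cstdio>
    #include <queue>
    #include <string>

    // Dispatch each queued task either to a hardware slot on the FPGA (while one
    // is free) or to a software thread on the CPU.
    int main() {
        int free_hw_slots = 2;                      // reconfigurable regions available
        std::queue<std::string> tasks;
        const char* names[] = { "fft", "fir", "crc", "aes" };
        for (const char* n : names)
            tasks.push(n);

        while (!tasks.empty()) {
            std::string t = tasks.front();
            tasks.pop();
            if (free_hw_slots > 0) {
                --free_hw_slots;                    // configure an accelerator into a slot
                std::printf("%s -> hardware thread (FPGA slot)\n", t.c_str());
            } else {
                std::printf("%s -> software thread (CPU)\n", t.c_str());
            }
        }
        return 0;
    }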
68

Implantation matérielle de chiffrements homomorphiques / Hardware implementation of homomorphic encryption

Mkhinini, Asma 14 December 2017 (has links)
Une des avancées les plus notables de ces dernières années en cryptographie est sans contredit l’introduction du premier schéma de chiffrement complètement homomorphe par Craig Gentry. Ce type de système permet de réaliser des calculs arbitraires sur des données chiffrées, sans les déchiffrer. Cette particularité permet de répondre aux exigences de sécurité et de protection des données, par exemple dans le cadre en plein développement de l'informatique en nuage et de l'internet des objets. Les algorithmes mis en œuvre sont actuellement très coûteux en temps de calcul, et généralement implantés sous forme logicielle. Les travaux de cette thèse portent sur l’accélération matérielle de schémas de chiffrement homomorphes. Une étude des primitives utilisées par ces schémas et la possibilité de leur implantation matérielle est présentée. Ensuite, une nouvelle approche permettant l’implantation des deux fonctions les plus coûteuses est proposée. Notre approche exploite les capacités offertes par la synthèse de haut niveau. Elle a la particularité d’être très flexible et générique et permet de traiter des opérandes de tailles arbitraires très grandes. Cette particularité lui permet de viser un large domaine d’applications et lui autorise d’appliquer des optimisations telles que le batching. Les performances de notre architecture de type co-conception ont été évaluées sur l’un des cryptosystèmes homomorphes les plus récents et les plus efficaces. Notre approche peut être adaptée aux autres schémas homomorphes ou plus généralement dans le cadre de la cryptographie à base de réseaux. / One of the most significant advances in cryptography in recent years is certainly the introduction of the first fully homomorphic encryption scheme by Craig Gentry. This type of cryptosystem allows arbitrarily complex computations to be performed on encrypted data without decrypting it. This property makes it possible to meet security and data-protection requirements, for example in the context of the rapid development of cloud computing and the internet of things. The algorithms involved are currently very time-consuming, and most of them are implemented in software. This thesis deals with the hardware acceleration of homomorphic encryption schemes. A study of the primitives used by these schemes and the possibility of their hardware implementation is presented. Then, a new approach allowing the implementation of the two most expensive functions is proposed. Our approach exploits high-level synthesis. It has the particularity of being very flexible and generic, and makes it possible to process operands of arbitrarily large size. This feature allows it to target a wide range of applications and to apply optimizations such as batching. The performance of our co-design architecture was evaluated on one of the most recent and efficient homomorphic cryptosystems. It can be adapted to other homomorphic schemes or, more generally, to lattice-based cryptography.
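The abstract does not name the two most expensive functions that were accelerated. As a representative example of the kind of large-operand primitive that dominates lattice-based homomorphic schemes, here is a schoolbook sketch of polynomial multiplication modulo (X^n + 1, q) with toy parameters; real schemes use far larger n and q, and faster algorithms such as the NTT.

    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Negacyclic (mod X^n + 1) schoolbook product of two polynomials with
    // coefficients mod q. Wrap-around terms pick up a minus sign.
    static std::vector<uint64_t> negacyclic_mul(const std::vector<uint64_t>& a,
                                                const std::vector<uint64_t>& b,
                                                uint64_t q) {
        size_t n = a.size();
        std::vector<uint64_t> c(n, 0);
        for (size_t i = 0; i < n; ++i) {
            for (size_t j = 0; j < n; ++j) {
                uint64_t prod = (a[i] * b[j]) % q;
                size_t k = i + j;
                if (k < n)
                    c[k] = (c[k] + prod) % q;             // regular term
                else
                    c[k - n] = (c[k - n] + q - prod) % q; // wrapped term, negated
            }
        }
        return c;
    }

    int main() {
        uint64_t q = 97;                                  // toy modulus
        std::vector<uint64_t> a = { 1, 2, 3, 4 };         // degree-3 polynomials (n = 4)
        std::vector<uint64_t> b = { 5, 6, 7, 8 };
        std::vector<uint64_t> c = negacyclic_mul(a, b, q);
        for (size_t i = 0; i < c.size(); ++i)
            std::printf("c[%zu] = %llu\n", i, (unsigned long long)c[i]);
        return 0;
    }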
69

Génération rapide d'accélerateurs matériels par synthèse d'architecture sous contraintes de ressources / High-level synthesis for fast generation of hardware accelerators under resource constraints

Prost-Boucle, Adrien 08 January 2014 (has links)
Dans le domaine du calcul générique, les circuits FPGA sont très attrayants pour leur performance et leur faible consommation. Cependant, leur présence reste marginale, notamment à cause des limitations des logiciels de développement actuels. En effet, ces limitations obligent les utilisateurs à bien maîtriser de nombreux concepts techniques. Ils obligent à diriger manuellement les processus de synthèse, de façon à obtenir une solution à la fois rapide et conforme aux contraintes des cibles matérielles visées. Une nouvelle méthodologie de génération basée sur la synthèse d'architecture est proposée afin de repousser ces limites. L'exploration des solutions consiste en l'application de transformations itératives à un circuit initial, ce qui accroît progressivement sa rapidité et sa consommation en ressources. La rapidité de ce processus, ainsi que sa convergence sous contraintes de ressources, sont ainsi garanties. L'exploration est également guidée vers les solutions les plus pertinentes grâce à la détection, dans les applications à synthétiser, des sections les plus critiques pour le contexte d'utilisation réel. Cette information peut être affinée à travers un scénario d'exécution transmis par l'utilisateur. Un logiciel démonstrateur pour cette méthodologie, AUGH, est construit. Des expérimentations sont menées sur plusieurs applications reconnues dans le domaine de la synthèse d'architecture. De tailles très différentes, ces applications confirment la pertinence de la méthodologie proposée pour la génération rapide et autonome d'accélérateurs matériels complexes, sous des contraintes de ressources strictes. La méthodologie proposée est très proche du processus de compilation pour les microprocesseurs, ce qui permet son utilisation même par des utilisateurs non spécialistes de la conception de circuits numériques. Ces travaux constituent donc une avancée significative pour une plus large adoption des FPGA comme accélérateurs matériels génériques, afin de rendre les machines de calcul simultanément plus rapides et plus économes en énergie. / In the field of high-performance computing, FPGA circuits are very attractive for their performance and low power consumption. However, their presence is still marginal, mainly because of the limitations of current development tools. These limitations force users to have expert knowledge of numerous technical concepts. Users also have to manually control the synthesis processes in order to obtain solutions that are both fast and compliant with the hardware constraints of the targeted platforms. A novel generation methodology based on high-level synthesis is proposed in order to push these limits back. The design space exploration consists of the iterative application of transformations to an initial circuit, which progressively increases its speed and its resource consumption. The rapidity of this process, along with its convergence under resource constraints, is thus guaranteed. The exploration is also guided towards the most pertinent solutions thanks to the detection of the sections of the applications to be synthesized that are most critical for the targeted execution context. This information can be refined with an execution scenario specified by the user. A demonstration tool for this methodology, AUGH, has been built. Experiments have been conducted on several applications well known in the field of high-level synthesis.
These applications, of very different sizes, confirm the pertinence of the proposed methodology for the fast and automatic generation of complex hardware accelerators under strict resource constraints. The proposed methodology is very close to the compilation process for microprocessors, which enables it to be used even by users who are not experts in digital circuit design. This work therefore constitutes a significant step towards a broader adoption of FPGAs as general-purpose hardware accelerators, in order to make computing machines both faster and more energy-efficient.
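The sketch below caricatures the exploration style described above: starting from a small initial circuit, transformations are applied iteratively while they fit the resource budget, each step picking the candidate with the best latency gain per unit of resources, which is what guarantees convergence under resource constraints. The transformation names, costs and gains are invented and do not correspond to AUGH's actual transformation set.

    #include <cstdio>
    #include <vector>

    // Greedy, convergent exploration: apply the best-ratio transformation that
    // still fits the remaining LUT budget, until nothing fits.
    struct Transform {
        const char* name;
        int    lut_cost;          // extra LUTs if applied
        double cycles_saved;      // latency improvement if applied
    };

    int main() {
        int lut_budget = 1000;
        double latency = 500.0;   // cycles of the initial, fully sequential circuit
        std::vector<Transform> candidates = {
            { "unroll inner loop x2",  300, 180.0 },
            { "add second multiplier", 450, 120.0 },
            { "widen memory port",     200,  60.0 },
        };

        std::vector<bool> applied(candidates.size(), false);
        while (true) {
            int best = -1;
            double best_ratio = 0.0;
            for (size_t i = 0; i < candidates.size(); ++i) {
                if (applied[i] || candidates[i].lut_cost > lut_budget) continue;
                double ratio = candidates[i].cycles_saved / candidates[i].lut_cost;
                if (ratio > best_ratio) { best_ratio = ratio; best = (int)i; }
            }
            if (best < 0) break;                  // nothing fits: exploration has converged
            applied[best] = true;
            lut_budget -= candidates[best].lut_cost;
            latency    -= candidates[best].cycles_saved;
            std::printf("applied %-24s latency=%.0f cycles, %d LUTs left\n",
                        candidates[best].name, latency, lut_budget);
        }
        return 0;
    }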
70

Implémentation sur SoC des réseaux Bayésiens pour l'état de santé et la décision dans le cadre de missions de véhicules autonomes / SoC implementation of Bayesian networks for health management and decision making for autonomous vehicles missions

Zermani, Sara 21 November 2017 (has links)
Les véhicules autonomes, tels que les drones, sont utilisés dans différents domaines d'application pour exécuter des missions simples ou complexes. D’un côté, ils opèrent généralement dans des conditions environnementales incertaines, pouvant conduire à des conséquences désastreuses pour l'humain et l'environnement. Il est donc nécessaire de surveiller continuellement l’état de santé du système afin de pouvoir détecter et localiser les défaillances, et prendre la décision en temps réel. Cette décision doit maximiser les capacités à répondre aux objectifs de la mission, tout en maintenant les exigences de sécurité. D’un autre côté, ils sont amenés à exécuter des tâches avec des demandes de calcul important sous contraintes de performance. Il est donc nécessaire de penser aux accélérateurs matériels dédiés pour décharger le processeur et répondre aux exigences de la rapidité de calcul. C’est ce que nous cherchons à démontrer dans cette thèse à double objectif. Le premier objectif consiste à définir un modèle pour l’état de santé et la décision. Pour cela, nous utilisons les réseaux Bayésiens, qui sont des modèles graphiques probabilistes efficaces pour le diagnostic et la décision sous incertitude. Nous avons proposé un modèle générique en nous basant sur une analyse de défaillance de type FMEA (Analyse des Modes de Défaillance et de leurs Effets). Cette analyse prend en compte les différentes observations sur les capteurs moniteurs et contextes d’apparition des erreurs. Le deuxième objectif était la conception et la réalisation d’accélérateurs matériels des réseaux Bayésiens d’une manière générale et plus particulièrement de nos modèles d’état de santé et de décision. N’ayant pas d’outil pour l’implémentation embarqué du calcul par réseaux Bayésiens, nous proposons tout un atelier logiciel, allant d’un réseau Bayésien graphique ou textuel jusqu’à la génération du bitstream prêt pour l’implémentation logicielle ou matérielle sur FPGA. Finalement, nous testons et validons nos implémentations sur la ZedBoard de Xilinx, incorporant un processeur ARM Cortex-A9 et un FPGA. / Autonomous vehicles, such as drones, are used in different application areas to perform simple or complex missions. On the one hand, they generally operate in uncertain environmental conditions, which can lead to disastrous consequences for humans and the environment. It is therefore necessary to continuously monitor the health of the system in order to detect and locate failures and to make decisions in real time. These decisions must maximize the ability to meet the mission objectives while maintaining the safety requirements. On the other hand, such vehicles are required to perform tasks with large computational demands under performance constraints. It is therefore necessary to consider dedicated hardware accelerators to offload the processor and meet the computational speed requirements. This is what we seek to demonstrate in this dual-objective thesis. The first objective is to define a model for health management and decision-making. To this end, we use Bayesian networks, which are efficient probabilistic graphical models for diagnosis and decision-making under uncertainty. We propose a generic model based on an FMEA (Failure Modes and Effects Analysis). This analysis takes into account the different observations from the monitoring sensors and the contexts in which errors appear.
The second objective is the design and realization of hardware accelerators for Bayesian networks in general, and more particularly for our health-management and decision-making models. Since no tool existed for the embedded implementation of Bayesian-network computation, we propose a complete software workbench, going from a graphical or textual Bayesian network to the generation of a bitstream ready for software or hardware implementation on an FPGA. Finally, we test and validate our implementations on the Xilinx ZedBoard, which incorporates an ARM Cortex-A9 processor and an FPGA.
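To give a concrete flavour of the health-management model, here is a minimal single-failure-mode sketch: a prior taken from an FMEA row, an imperfect monitor, and Bayes' rule producing the posterior used for the decision. All probabilities are invented for illustration; the real models contain many interacting nodes and observations.

    #include <cstdio>

    // One failure mode observed through an imperfect monitor: compute
    // P(failure | alarm) from the prior, the sensitivity and the false-alarm rate.
    int main() {
        double p_fail       = 0.02;   // prior from the FMEA analysis
        double p_alarm_fail = 0.95;   // monitor detects the failure (sensitivity)
        double p_alarm_ok   = 0.05;   // false-alarm rate

        // P(alarm) by total probability, then P(fail | alarm) by Bayes' rule.
        double p_alarm = p_alarm_fail * p_fail + p_alarm_ok * (1.0 - p_fail);
        double p_fail_given_alarm = p_alarm_fail * p_fail / p_alarm;

        std::printf("P(alarm)           = %.4f\n", p_alarm);
        std::printf("P(failure | alarm) = %.4f\n", p_fail_given_alarm);
        // A decision node would compare this posterior against a mission-specific
        // threshold, e.g. degrade or abort the mission if it is exceeded.
        return 0;
    }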
