Global ETD Search

81	Hardware/Software Co-Verification Using the SystemVerilog DPI Freitas, Arthur 08 June 2007 (has links) (PDF) During the design and verification of the Hyperstone S5 flash memory controller, we developed a highly effective way to use the SystemVerilog direct programming interface (DPI) to integrate an instruction set simulator (ISS) and a software debugger in logic simulation. The processor simulation was performed by the ISS, while all other hardware components were simulated in the logic simulator. The ISS integration allowed us to filter many of the bus accesses out of the logic simulation, accelerating runtime drastically. The software debugger integration freed both hardware and software engineers to work in their chosen development environments. Other benefits of this approach include testing and integrating code earlier in the design cycle and more easily reproducing, in simulation, problems found in FPGA prototypes. Hardware/Software-Co simulation Hyperstone S5 Flash memory controller logic simulation ddc:004 ddc:500 Systementwurf Verifikation
82	A microprocessor performance and reliability simulation framework using the speculative functional-first methodology Yuan, Yi 13 February 2012 (has links) With the high complexity of modern day microprocessors and the slow speed of cycle-accurate simulations, architects are often unable to adequately evaluate their designs during the architectural exploration phases of chip design. This thesis presents the design and implementation of the timing partition of the cycle-accurate, microarchitecture-level SFFSim-Bear simulator. SFFSim-Bear is an implementation of the speculative functional-first (SFF) methodology, and utilizes a hybrid software-FPGA platform to accelerate simulation throughput. The timing partition, implemented in FPGA, features throughput-oriented, latency-tolerant designs to cope with the challenges of the hybrid platform. Furthermore, a fault injection framework is added to this implementation that allows designers to study the reliability aspects of their processors. The result is a simulator that is fast, accurate, flexible, and extensible. / text Computer architecture Microarchitecture-level simulator Microprocessor simulation Multicore simulation FPGA acceleration Hardware-software co-design Speculative functional-first Timing model F-T divergence Fault injection Microprocessor reliability
83	Software Techniques for Distributed Shared Memory Radovic, Zoran January 2005 (has links) In large multiprocessors, access time to shared memory is often nonuniform and may vary by as much as a factor of ten for some distributed shared-memory architectures (DSMs). This dissertation identifies another important nonuniform property of DSM systems: nonuniform communication architecture, NUCA. High-end hardware-coherent machines built from large nodes, or from chip multiprocessors, are typical NUCA systems, since they have a lower penalty for reading recently written data from a neighbor's cache than from a remote cache. This dissertation identifies node affinity as an important property for scalable general-purpose locks. Several software-based hierarchical lock implementations exploiting NUCAs are presented and evaluated. NUCA-aware locks are shown to be almost twice as efficient for contended critical sections compared to traditional lock implementations. The shared-memory “illusion” provided by some large DSM systems may be implemented using hardware, software, or a combination thereof. A software-based implementation can enable cheap cluster hardware to be used, but typically suffers from poor and unpredictable performance characteristics. This dissertation advocates a new software-hardware trade-off design point based on a new combination of techniques. The two low-level techniques, fine-grain deterministic coherence and synchronous protocol execution, as well as profile-guided protocol flexibility, are evaluated in isolation as well as in a combined setting using all-software implementations. Finally, a minimum of hardware trap support is suggested to further improve the performance of coherence protocols across cluster nodes. It is shown that all these techniques combined could result in fairly stable performance on par with hardware-based coherence.
synchronization distributed shared memory write permission cache nonuniform communication architecture node affinity locality hardware-software trade-off profiling flexibility trap-based memory architecture Computer engineering Datorteknik
84	Power network in the loop : subsystem testing using a switching amplifier Goyal, Sachin January 2009 (has links) “Hardware in the Loop” (HIL) testing is widely used in the automotive industry. The sophisticated electronic control units used for vehicle control are usually tested and evaluated using HIL simulations. HIL makes the testing of any system more realistic. Moreover, it helps in designing the structure and control of the system under test so that it works effectively in the situations it will encounter. Due to the size and complexity of the interactions within a power network, most research is based on pure simulation. To validate the performance of a physical generator or protection system, most testing is constrained to very simple power networks. This research, however, examines a method to test power system hardware within a complex virtual environment using the concept of HIL. HIL testing for electronic control units and power system protection devices can easily be performed at signal level. But the performance of power system equipment, such as distributed generation systems, cannot be evaluated at signal level using HIL testing. HIL testing for power system equipment is termed here ‘Power Network in the Loop’ (PNIL). PNIL testing can only be performed at power level and requires a power amplifier that can amplify the simulation signal to the power level. A power network is divided into two parts. One part represents the Power Network Under Test (PNUT) and the other part represents the rest of the complex network. The complex network is simulated in a real-time simulator (RTS) while the PNUT is connected to the Voltage Source Converter (VSC) based power amplifier. Two-way interaction between the simulator and the amplifier is performed using analog-to-digital (A/D) and digital-to-analog (D/A) converters.
The power amplifier amplifies the current or voltage signal of the simulator to the power level and establishes the power-level interaction between the RTS and the PNUT. In the first part of this thesis, the design and control of a VSC-based power amplifier that can amplify a broadband voltage signal is presented. A new Hybrid Discontinuous Control method is proposed for the amplifier. This amplifier can be used for several power system applications. In the first part of the thesis, the use of this amplifier in DSTATCOM and UPS applications is presented. In the later part of this thesis, the solution of network-in-the-loop testing with the help of this amplifier is reported. The experimental setup for PNIL testing was built in the laboratory of Queensland University of Technology, and the feasibility of PNIL testing has been evaluated through experimental studies. In the last section of this thesis, a universal load with power regenerative capability is designed. This universal load is used to test the DG system using PNIL concepts. This thesis is composed of published/submitted papers that form the chapters of the dissertation. Each paper was published or submitted during the period of candidature. Chapter 1 integrates all the papers to provide a coherent view of the wide-bandwidth switching amplifier and its use in different power system applications, especially for power system testing using PNIL.
85	Integer Occupancy Grids : a probabilistic multi-sensor fusion framework for embedded perception / Grille d'occupation entière : une méthode probabiliste de fusion multi-capteurs pour la perception embarquée Rakotovao Andriamahefa, Tiana 21 February 2017 (has links) Pour les voitures autonomes, la perception est une fonction principale où la sécurité est de la plus haute importance. Un système de perception construit un modèle de l'environnement de conduite en fusionnant plusieurs capteurs de perception incluant les LIDARs, les radars, les capteurs de vision, etc. La fusion basée sur les grilles d'occupation construit un modèle probabiliste de l'environnement en prenant en compte l'incertitude des capteurs. Cette thèse vise à intégrer le calcul des grilles d'occupation dans des systèmes embarqués à bas-coût et à basse-consommation. Cependant, les grilles d'occupation effectuent des calculs de probabilité intenses et difficilement calculables en temps-réel par les plateformes matérielles embarquées.Comme solution, cette thèse introduit une nouvelle méthode de fusion probabiliste appelée Grille d'Occupation Entière. Les Grilles d'Occupation Entières se reposent sur des principes mathématiques qui permettent de calculer la fusion de capteurs grâce à des simple addition de nombre entiers. L'intégration matérielle et logicielle des Grilles d'Occupation Entière est sûre et fiable. Les erreurs numériques engendrées par les calculs sont connues, majorées et paramétrées par l'utilisateur. Les Grilles d'Occupation Entière permettent de calculer en temps-réel la fusion de multiple capteurs sur un système embarqué bas-coût et à faible consommation dédié pour les applications pour l'automobile. / Perception is a primary task for an autonomous car where safety is of utmost importance. A perception system builds a model of the driving environment by fusing measurements from multiple perceptual sensors including LIDARs, radars, vision sensors, etc. 
Fusion based on occupancy grids builds a probabilistic environment model by taking sensor uncertainties into account. This thesis aims to integrate the computation of occupancy grids into embedded low-cost and low-power platforms. However, occupancy grids require intensive probability calculations that can hardly be processed in real time on embedded hardware. As a solution, this thesis introduces the Integer Occupancy Grid framework. Integer Occupancy Grids rely on a proven mathematical foundation that allows probabilistic fusion to be computed through simple addition of integers. The hardware/software integration of Integer Occupancy Grids is safe and reliable. The numerical errors involved are bounded and parametrized by the user. Integer Occupancy Grids enable real-time computation of multi-sensor fusion on embedded low-cost and low-power processing platforms dedicated to automotive applications. Grille d'occupation entière Fusion de capteur Grille d'occupation Modèle d'environnement Perception Intégration matérielle/logicielle Integer occupancy grid Sensor fusion Occupancy grid Environment model Perception Hardware/software integration 004 510
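The idea of fusing sensor evidence by integer addition can be illustrated with a minimal sketch. It does not reproduce the Integer Occupancy Grid construction from the thesis; it swaps in the standard log-odds form of Bayesian fusion, quantized to integers, as a stand-in. The names `SCALE`, `prob_to_index`, `index_to_prob`, and `fuse` are hypothetical, and a uniform prior of 0.5 is assumed.

```python
import math

# Hypothetical quantization step: integer units per unit of log-odds.
# A larger SCALE gives a smaller rounding error per measurement.
SCALE = 16


def prob_to_index(p: float) -> int:
    """Map an occupancy probability in (0, 1) to a quantized integer log-odds."""
    return round(SCALE * math.log(p / (1.0 - p)))


def index_to_prob(k: int) -> float:
    """Map a quantized integer log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-k / SCALE))


def fuse(indices) -> int:
    """Fuse independent measurements of one cell (uniform prior assumed).

    In log-odds form, Bayesian fusion is a sum, so only integer additions
    are needed once each measurement has been quantized.
    """
    return sum(indices)


# Two sensors that each report 0.7 occupancy for the same cell.
cell = fuse([prob_to_index(0.7), prob_to_index(0.7)])
fused_probability = index_to_prob(cell)
```

In this sketch the quantization error is controlled by `SCALE`, which loosely mirrors the abstract's point that the numerical error is bounded and set by the user.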
86	Uma metodologia para estimativa de área baseada em redes de Petri temporizadas para ambientes de sistemas de hardware/software co-design / A methodology for area estimation based on timed Petri nets for hardware/software co-design system environments Portela Machado, Albano January 2004 (has links) Most modern electronic systems consist of dedicated hardware and programmable components (called software components). Over recent years, the number of methodologies that simultaneously apply techniques from different areas to develop mixed hardware/software systems has grown considerably. Concurrent design of mixed hardware/software systems has proven advantageous when the system is considered as a whole rather than as a set of independent entities. Today, the electronics market demands high-performance, low-cost systems. These requirements are essential for market competitiveness. In addition, a short time-to-market is an important factor. A delay in launching a product causes serious reductions in profit, since it is easier to sell a product when there is little or no competition. This means that facilitating the reuse of previous designs, rapid design exploration, qualitative analysis/verification in the early design phases, prototyping, and a reduction of the time required for testing all reduce the overall time needed to go from a specification to the final product. When designing such mixed hardware/software systems, analyzing design alternatives and deciding where to implement each part of the system, that is, in hardware or in software, are very important tasks. Estimating quality metrics enables design space exploration and can guide the decision on how to implement parts of the system.
Such metrics are computed at the system level, that is, without an actual implementation. Consequently, such estimates also speed up system design and allow design constraints to be analyzed, providing feedback for design decisions. Petri nets are a formal specification technique that allows both a graphical and a mathematical representation. They offer powerful methods that let designers perform qualitative and quantitative analyses. Timed Petri nets are extensions of Petri nets in which timing information is expressed as durations (deterministic-time nets with a three-phase firing policy) associated with the transitions. For a high-level behavioral description, the hardware design is divided into classes of functional blocks: datapath and controllers. The datapath consists of three types of RT components: storage units (registers and latches), functional units (ALUs and comparators), and interconnection units (multiplexers and buses). Storage units are required to hold data values such as constants, variables, and arrays in the behavior. Functional units are needed to implement the operations in the behavior. After all variables and operations in the behavior have been mapped to storage and functional units, respectively, the number of interconnection units, such as buses and multiplexers, required to connect the storage and functional units can be estimated. This work proposes an approach for estimating hardware area from the number of storage, functional, and interconnection units, taking timing constraints and data dependences into account, and extends some earlier work with the aim of improving the accuracy of area estimation methods.
That is, the proposed method considers a data-flow net that captures data dependences and computes the datapath area from the number and type of its components, taking the temporal dependence relation into account. Petri nets Timed Petri nets Estimation Hardware/software co-design Data dependence Datapath Storage units Functional units Interconnection units Intermediate models
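The counting step described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the thesis's Petri-net-based method: the register count uses the common "maximum number of simultaneously live variables" bound, and the per-unit areas in `UNIT_AREA`, as well as the function names, are hypothetical values chosen for the example.

```python
# Hypothetical per-unit areas in arbitrary units; not taken from the thesis.
UNIT_AREA = {"register": 8, "alu": 120, "comparator": 30, "mux": 6, "bus": 15}


def min_registers(lifetimes):
    """Lower bound on registers: peak number of simultaneously live variables.

    Each lifetime is a half-open interval (start, end) in control steps.
    """
    events = []
    for start, end in lifetimes:
        events.append((start, 1))   # variable becomes live
        events.append((end, -1))    # variable dies
    # At equal times, -1 sorts before +1, so a register freed at step t
    # can be reused by a variable that becomes live at step t.
    events.sort()
    live = peak = 0
    for _, delta in events:
        live += delta
        peak = max(peak, live)
    return peak


def estimate_datapath_area(unit_counts, unit_area=UNIT_AREA):
    """Sum (count x area) over storage, functional, and interconnection units."""
    return sum(unit_area[kind] * count for kind, count in unit_counts.items())


# Three variables with overlapping lifetimes, one ALU, three multiplexers.
registers = min_registers([(0, 3), (1, 4), (3, 5)])
area = estimate_datapath_area({"register": registers, "alu": 1, "mux": 3})
```

The lifetimes here stand in for the timing and data-dependence information that the abstract says is captured by the data-flow net.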
87	Field Programmable Gate Array Based Target Detection and Gesture Recognition Mekala, Priyanka 12 October 2012 (has links) The move from Standard Definition (SD) to High Definition (HD) represents a sixfold increase in the data that needs to be processed. With expanding resolutions and evolving compression, there is a need for high performance with flexible architectures that allow quick upgradeability. Technology is advancing in image display resolutions, compression techniques, and video intelligence. Software implementations of these systems can attain accuracy, with tradeoffs among processing performance (to achieve specified frame rates while working on large image data sets), power, and cost constraints. New architectures are needed to keep pace with the fast innovations in video and imaging. This dissertation contains a dedicated hardware implementation of the pixel and frame rate processes on a Field Programmable Gate Array (FPGA) to achieve real-time performance. The following outlines the contributions of the dissertation. (1) We develop a target detection system by applying a novel running average mean threshold (RAMT) approach to globalize the threshold required for background subtraction. This approach adapts the threshold automatically to different environments (indoor and outdoor) and different targets (humans and vehicles). For low power consumption and better performance, we design the complete system on an FPGA. (2) We introduce a safe distance factor and develop an algorithm for detecting occlusion during target tracking. A novel mean threshold is calculated by motion-position analysis. (3) A new strategy for gesture recognition is developed using Combinational Neural Networks (CNN) based on a tree structure. The method is analyzed on American Sign Language (ASL) gestures. We introduce a novel points-of-interest approach to reduce the feature vector size and a gradient threshold approach for accurate classification.
(4) We design a gesture recognition system using a hardware/software co-simulation neural network for the high speed and low memory storage requirements provided by the FPGA. We develop an innovative maximum distant algorithm which uses only 0.39% of the image as the feature vector to train and test the system design. The gesture database sets involved in different applications may vary. Therefore, it is essential to keep the feature vector as small as possible while maintaining the same accuracy and performance. TARGET DETECTION GESTURE RECOGNITION NEURAL NETWORKS BACK PROPAGATION AMERICAN SIGN LANGUAGE OCCLUSION DETECTION FIELD PROGRAMMABLE GATE ARRAY VERILOG HARDWARE SOFTWARE CO-SIMULATION PLATFORM
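The abstract does not spell out how RAMT works, so the following is only one plausible reading, offered as a sketch: the background is a running average of past frames, and a single global threshold is set to the mean absolute difference between the current frame and that background. The function names and the blending factor `alpha` are assumptions made for the example; frames are flat lists of pixel intensities.

```python
def update_background(background, frame, alpha=0.05):
    """Running-average background model: blend the new frame into the old model."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(background, frame)]


def detect_foreground(background, frame):
    """Mark pixels whose difference from the background exceeds a global threshold.

    The threshold is the mean absolute difference over the whole frame, so it
    adapts to the scene instead of being a fixed constant (an assumed reading
    of the 'mean threshold' idea).
    """
    diffs = [abs(f - b) for b, f in zip(background, frame)]
    threshold = sum(diffs) / len(diffs)
    mask = [d > threshold for d in diffs]
    return mask, threshold


background = [10.0, 10.0, 10.0, 10.0]
frame = [10.0, 10.0, 10.0, 50.0]      # one pixel changed by a moving target
mask, threshold = detect_foreground(background, frame)
background = update_background(background, frame)
```

Because every pixel needs only a subtraction, a comparison, and a multiply-accumulate, this kind of per-pixel pipeline is the sort of workload that maps naturally onto FPGA logic.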
88	Analýzy síťového provozu na procesoru NXP a FPGA / Network Traffic Analysis Using NXP Processor and FPGA Orsák, Michal January 2018 (has links) The primary goal of this thesis is to exploit the possibilities of an entirely new hardware platform based on the NXP LS2088 and an FPGA. The secondary goal is to create firmware for this processor that works out of the box and to optimize existing software for L7 analysis. This software was tightly bound to a previous hardware platform. The NXP LS2088 network processor contains many hardware accelerators and a virtual reconfigurable network. This thesis explores all hardware parts of this platform. Based on this analysis, many tweaks and optimizations were performed to achieve maximum efficiency of the software for L7 analysis. These included intensive optimizations such as rewriting the application for the DPDK library and the new hardware, and hardware synchronization of the application's worker threads. The main result of this thesis is a working platform with efficient L7 analysis software that actively uses the accelerators in the FPGA and the NXP network processor. An SDK for the new platform has also been prepared.
89	Kostenmodellierung mit SystemC/System-AMS / Cost modeling with SystemC/System-AMS Markert, Erik, Wang, Hailu, Herrmann, Göran, Heinkel, Ulrich 08 June 2007 (has links) This contribution presents a method for describing cost factors and linking them across hierarchy boundaries. It is suitable both for purely digital systems with software components and for mixed analog/digital systems. It can therefore be applied in hardware/software codesign and in analog/digital codesign to compare different system compositions. The implementation in C++ allows use with digital SystemC as well as with the analog SystemC extension SystemC-AMS, and it simplifies use compared with an existing VHDL implementation. Components of an inertial navigation system serve as the application example. info:eu-repo/classification/ddc/004 ddc:004 info:eu-repo/classification/ddc/500 ddc:500 Microsystems technology System design Analog/digital design Hardware/software codesign Inertial navigation SystemC/System-AMS
90	Souběžný evoluční návrh hardwaru a softwaru / Concurrent evolutionary design of hardware and software Minařík, Miloš January 2018 (has links) Genetic programming (GP) is, to some extent, able to generate required programs automatically without the user having to specify how the program should proceed. GP has been used successfully to solve a wide range of practical problems from various fields, with results often comparable to human-created solutions. However, it has not yet been established whether GP can generate a highly optimized computational model (platform) together with a program executable on that platform that solves a given problem while meeting all constraints (for example, on chip area and delay). In scenarios where multiple criteria are optimized, the output for the user should be a set of non-dominated solutions with different combinations of resource usage (area, power consumption) and performance (execution speed). This problem can be understood as the concurrent design of hardware and software, or HW/SW codesign for short. This thesis investigates ways to concurrently evolve a platform and programs when the problem is specified by a set of input vectors and their corresponding outputs. First, an architecture model was created, along with an evolutionary platform for processing and evolving these architectures. Candidate microprogrammed architectures were evolved together with programs using linear genetic programming. A series of simpler experiments was then carried out. The proposed platform achieved results comparable to state-of-the-art methods. Based on weaknesses discovered during the initial experiments, the platform was extended. The extended platform was then validated on several more complex experiments. One of them focused on an efficient implementation of an approximation of the sigmoid function. In this case, the platform found a number of different solutions implementing the sigmoid approximation, some of them sequential and others purely combinational.
During the experiment, known algorithms were also rediscovered by evolution, and some of them were even optimized by evolution for the subset of the domain chosen for the experiment. The last set of experiments focused on the evolutionary design of image filters for reducing salt-and-pepper noise. In this case, the platform rediscovered the concept of switching filters and found a variant of the switching median filter that was comparable, in terms of filtering results, to commonly used methods. This thesis demonstrated that genetic programming can be used to design and optimize small HW/SW systems. The automated evolutionary design of more complex HW/SW systems remains an open problem suitable for further research.
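The "set of non-dominated solutions" mentioned in this abstract is the Pareto front of the candidate designs. A minimal sketch of how such a set can be extracted is given below, assuming every objective is minimized and each candidate is a tuple such as (area, delay). The function names and the sample values are hypothetical and are not taken from the thesis.

```python
def dominates(a, b):
    """True if design a is no worse than b in every objective and better in one.

    All objectives are assumed to be minimized (e.g. area, delay, power).
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [
        p for p in candidates
        if not any(dominates(q, p) for q in candidates)
    ]


# Hypothetical (area, delay) pairs for four evolved HW/SW designs.
designs = [(10, 5), (8, 7), (12, 4), (11, 6)]
front = non_dominated(designs)
```

Here (11, 6) is dropped because (10, 5) is better in both objectives, while the other three remain as alternative trade-offs between area and delay.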
