Global ETD Search

1561	Wordlength inference in the Spade HDL : Seven implementations of wordlength inference and one implementation that actually works / Ordlängdsinferans i Spade HDL : Sju olika implementationer av ordlängdsinferens och en implementation som faktiskt fungerar Thörnros, Edvard January 2023 (has links) Compilers, complex programs with the potential to greatly facilitate software and hardware design. This thesis focuses on enhancing the Spade hardware description language, known for its user-friendly approach to hardware design. In the realm of hardware development data size - for numerical values data size is known as "wordlength" - plays a critical role for reducing the hardware resources. This study presents an innovative approach that seamlessly integrates wordlength inference directly into the Spade language, enabling the over-estimation of numeric data sizes solely from the program's source code. The methodology involves iterative development, incorporating various smaller implementations and evaluations, reminiscent of an agile approach. To assess the efficacy of the wordlength inference, multiple place and route operations are performed on identical Spade code using various versions of nextpnr. Surprisingly, no discernible impact on hardware resource utilization emerges from the modifications introduced in this thesis. Nonetheless, the true significance of this endeavor lies in its potential to unlock more advanced language features within the Spade compiler. It is important to note that while the wordlength inference proposed in this thesis shows promise, it necessitates further integration efforts to realize its full potential. FPGA spade wordlength inference word length word-length compiler hdl hardware description language LUT lookup tables resource usage compiler FPGA spade ordlängdsinferens kompilator hdl ordlängd Computer Engineering Datorteknik Embedded Systems Inbäddad systemteknik
1562	Characterization of Partial and Run-Time Reconfigurable FPGAs Fazzoletto, Emilio January 2016 (has links) FPGA based systems have been heavily used to prototype and test Application Specic Integrated Circuit (ASIC) designs with much lower costs and development time compared to hardwired prototypes. In recentyears, thanks to both the latest technology nodes and a change in the architecture of reconfigurable integrated circuits (from traditional Complex Programmable Logic Device (CPLD) to full-CMOS FPGA), FPGAs have become more popular in embedded systems, both as main computation resources and as hardware accelerators. A new era is beginning for FPGA based systems: the partial run-time reconguration of a FPGA is a feature now available in products already on the market and hardware designers and software developers have to exploit this capability. Previous works show that, when designed properly, a system can improve both its power efficiency and its performance taking advantage of a partial run-time reconfigurable architecture. Unfortunately, taking advantage of run-time reconfigurable hardware is very challenging and there are several problems to face: the reconfiguration overhead is not negligible compared to nowadays CPUs performance,the reconfiguration time is not easily predictable, and the software has to be re-though to work with a time-evolving platform. This thesis project aims to investigate the performance of a modern run-time reconfigurable SoC (a Xilinx Zynq 7020), focusing on the reconfiguration overhead and its predictability, on the achievable speedup, and the trade-off and limits of this kind of platform. Since it is not always obvious when an application (especially a real-time one) is really able to use at its own advantage a partial run-time reconfigurable platform, the data collected during this project could be a valid help for hardware designers that use reconfigurable computing. / FPGA-baserade system har tidigare främst använts för snabb och kostnadseffektiv konstruktion av prototyper vid framtagandet av applikationsspecika integrerade kretsar (ASIC). På senare år har användandet av FPGA:er i inbyggda system för implementation av hårdvaruacceleratorers såväl som huvudsaklig beräkningsenhet ökat. Denna ökning har möjliggjorts mycket tack vare den utveckling som har skett av rekonfigurerbara integrerade kretsar: från de mer traditionella Complex Programmable Logic Devices (CPLD) till helt CMOS-baserade FPGA:er. Nu inleds en ny era för FPGA-baserade system tack vare möjligheten att under körning rekonfigurera delar av FPGA:n genom så kallad partial run-time reconguration(RTR) - en teknik som redan idag finns tillgänglig i produkter på marknaden. Tidigare forskning visar att användandet av en RTR-baserad hårdvaruarkitektur kan ha en positiv effekt med avseende på prestanda såväl som strömförbrukning. Att använda RTR-baserad hårdvara innebär dock flera utmaningar: En ej försumbar rekonfigurationstid måste tas i beaktning, så även den icke-deterministiska exekveringstiden som en rekonfiguration kan innebära. Vidare måste anpassningar av mjukvaran göras för att fungera med en hårdvaruplattform som förändras över tid. Denna uppsats syftar till att undersöka prestandan hos ett modernt RTRbaserat SoC (Xilinx Zynq 7020) med fokus på rekonfigurationstider och dess förutsägbarhet, prestanda ökning, begränsningar samt nödvändiga kompromisser som denna arkitektur innebär. Huruvida en applikation kan dra nytta av en RTR-baserad arkitektur eller inte kan vara svårt att avgöra. Den insamlade datan som presenteras i denna rapport kan dock fungera som stöd för hårdvarukonstruktörer som önskar använda en RTR-baserad plattform. FPGA Reconfigurable Computing Partial Reconfiguration Hardware Architectures Embedded Systems Computation Efficiency Xilinx FPGA Rekonfigurerbar hårdvaruarkitektur Partiell rekonfiguration Hårdvaruarkitektur Inbyggda system Beräkningseffektivitet Xilinx Engineering and Technology Teknik och teknologier Elektroteknik och elektronik
1563	Enhancing Drone Spectra Classification : A Study on Data-Adaptive Pre-processing and Efficient Hardware Deployment Del Gaizo, Dario January 2023 (has links) Focusing on the problem of Drone vs. Unknown classification based on radar frequency-amplitude spectra using Deep Learning (DL), especially 1-Dimensional Convolutional Neural Networks (1D-CNNs), this thesis aims at reducing the current gap in the research related to adequate pre-processing techniques for hardware deployment. The primary challenge tackled in this work is determining a pipeline that facilitates industrial deployment while maintaining high classification metrics. After presenting a comprehensive review of existing research on radar signal classification and the application of DL techniques in this domain, the technical background of signal processing is described to provide a practical scenario where the solutions could be implemented. A thorough description of technical constraints, such as Field Programmable Gate Array (FPGA) data type requirements, follows the entire project justifying the necessity of a learning-based pre-processing technique for highly skewed distributions. The results demonstrate that data-adaptive preprocessing eases hardware deployment and maintains high classification metrics, while other techniques contribute to noise and information loss. In conclusion, this thesis contributes to the field of radar frequency-amplitude spectra classification by identifying effective methods to support efficient hardware deployment of 1D-CNNs, without sacrificing performance. This work lays the foundation for future studies in the field of DL for real-world signal processing applications. / Med fokus på problemet med klassificering av drönare kontra okänt baserat på radarfrekvens-amplitudspektra med Deep Learning (DL), särskilt 1-Dimensional Convolutional Neural Networks (1D-CNNs), syftar denna avhandling till att minska det nuvarande gapet i forskningen relaterad till adekvata förbehandlingstekniker för hårdvarudistribution. Den främsta utmaningen i detta arbete är att fastställa en pipeline som underlättar industriell driftsättning samtidigt som höga klassificeringsmått bibehålls. Efter en omfattande genomgång av befintlig forskning om klassificering av radarsignaler och tillämpningen av DL-tekniker inom detta område, beskrivs den tekniska bakgrunden för signalbehandling för att ge ett praktiskt scenario där lösningarna kan implementeras. En grundlig beskrivning av tekniska begränsningar, såsom krav på datatyper för FPGA (Field Programmable Gate Array), följer hela projektet och motiverar nödvändigheten av en inlärningsbaserad förbehandlingsteknik för mycket skeva fördelningar. Resultaten visar att dataanpassad förbehandling underlättar hårdvaruimplementering och bibehåller höga klassificeringsmått, medan andra tekniker bidrar till brus och informationsförlust. Sammanfattningsvis bidrar denna avhandling till området klassificering av radarfrekvens-amplitudspektra genom att identifiera effektiva metoder för att stödja effektiv hårdvarudistribution av 1D-CNN, utan att offra prestanda. Detta arbete lägger grunden för framtida studier inom området DL för verkliga signalbehandlingstillämpningar. Deep Learning Adaptive Pre-processing 1D-CNN Radar Spectrum micro-Doppler Signal Processing Hardware Deployment Drone Unmanned Aerial Vehicle FPGA Djupinlärning Adaptiv Förbehandling 1D-CNN Radar Spektrum mikro-Doppler Signalbehandling Hårdvarudistribution Drönare Obemannad Luftfarkost FPGA Computer and Information Sciences Data- och informationsvetenskap
1564	Hybrid Hardware/Software Architectures for Network Packet Processing in Security Applications Fießler, Andreas Christoph Kurt 14 June 2019 (has links) Die Menge an in Computernetzwerken verarbeiteten Daten steigt stetig, was Netzwerkgeräte wie Switches, Bridges, Router und Firewalls vor Herausfordungen stellt. Die Performance der verbreiteten, CPU/softwarebasierten Ansätze für die Implementierung dieser Aufgaben ist durch den inhärenten Overhead in der sequentiellen Datenverarbeitung limitiert, weshalb solche Funktionalitäten vermehrt auf dedizierten Hardwarebausteinen realisiert werden. Diese bieten eine schnelle, parallele Verarbeitung mit niedriger Latenz, sind allerdings aufwendiger in der Entwicklung und weniger flexibel. Nicht jede Anwendung kann zudem für parallele Verarbeitung optimiert werden. Diese Arbeit befasst sich mit hybriden Ansätzen, um eine bessere Ausnutzung der jeweiligen Stärken von Soft- und Hardwaresystemen zu ermöglichen, mit Schwerpunkt auf der Paketklassifikation. Es wird eine Firewall realisiert, die sowohl Flexibilität und Analysetiefe einer Software-Firewall als auch Durchsatz und Latenz einer Hardware-Firewall erreicht. Der Ansatz wird auf einem Standard-Rechnersystem, welches für die Hardware-Klassifikation mit einem rekonfigurierbaren Logikbaustein (FPGA) ergänzt wird, evaluiert. Eine wesentliche Herausforderung einer hybriden Firewall ist die Identifikation von Abhängigkeiten im Regelsatz. Es werden Ansätze vorgestellt, welche den redundanten Klassifikationsaufwand auf ein Minimum reduzieren, wie etwa die Wiederverwendung von Teilergebnissen der hybriden Klassifikatoren oder eine exakte Abhängigkeitsanalyse mittels Header Space Analysis. Für weitere Problemstellungen im Bereich der hardwarebasierten Paketklassifikation, wie dynamisch konfigurierbare Filterungsschaltkreise und schnelle, sichere Hashfunktionen für Lookups, werden Machbarkeit und Optimierungen evaluiert. Der hybride Ansatz wird im Weiteren auf ein System mit einer SDN-Komponente statt einer FPGA-Erweiterung übertragen. Auch hiermit können signifikante Performancegewinne erreicht werden. / Network devices like switches, bridges, routers, and firewalls are subject to a continuous development to keep up with ever-rising requirements. As the overhead of software network processing already became the performance-limiting factor for a variety of applications, also former software functions are shifted towards dedicated network processing hardware. Although such application-specific circuits allow fast, parallel, and low latency processing, they require expensive and time-consuming development with minimal possibilities for adaptions. Security can also be a major concern, as these circuits are virtually a black box for the user. Moreover, the highly parallel processing capabilities of specialized hardware are not necessarily an advantage for all kinds of tasks in network processing, where sometimes a classical CPU is better suited. This work introduces and evaluates concepts for building hybrid hardware-software-systems that exploit the advantages of both hardware and software approaches in order to achieve performant, flexible, and versatile network processing and packet classification systems. The approaches are evaluated on standard software systems, extended by a programmable hardware circuit (FPGA) to provide full control and flexibility. One key achievement of this work is the identification and mitigation of challenges inherent when a hybrid combination of multiple packet classification circuits with different characteristics is used. We introduce approaches to reduce redundant classification effort to a minimum, like re-usage of intermediate classification results and determination of dependencies by header space analysis. In addition, for some further challenges in hardware based packet classification like filtering circuits with dynamic updates and fast hash functions for lookups, we describe feasibility and optimizations. At last, the hybrid approach is evaluated using a standard SDN switch instead of the FPGA accelerator to prove portability. Paketklassifikation FPGA Netzwerk Firewall packet classification FPGA computer networks firewall security 004 Informatik ST 277 ST 200 ST 230 ST 150 ddc:004 ddc:000
1565	Graphical Support for the Design and Evaluation of Configurable Logic Blocks Erxleben, Fredo 15 January 2016 (has links) (PDF) Developing a tool supporting humans to design and evaluate CLB-based circuits requires a lot of know-how and research from different fields of computer science. In this work, the newly developed application q2d, especially its design and implementation will be introduced as a possible tool for approaching CLB circuit development with graphical UI support. Design decisions and implementation will be discussed and a workflow example will be given. rekonfigurierbare Schaltkreise Hardwareentwurf Werkzeuge CLB FPGA hardware design reconfigurable hardware tooling ddc:004 rvk:ST 190 rvk:ZN 4950
1566	FIELD PROGRAMMABLE GATE ARRAY BASED MINIATURISED REMOTE UNIT FOR A DECENTRALISED BASE-BAND TELEMETRY SYSTEM FOR SATELLITE LAUNCH VEHICLES M., Krishnakumar, G., Padma, S., Sreelal, V., Narayana T., P., Anguswamy, S., Singh U. 11 1900 (has links) International Telemetering Conference Proceedings / October 30-November 02, 1995 / Riviera Hotel, Las Vegas, Nevada / The Remote Unit (RU) for a decentralised on-board base-band telemetry system is designed for use in launch vehicle missions of the Indian Space Research Organisation (ISRO). This new design is a highly improved and miniaturised version of an earlier design. The major design highlights are as follows. Usage of CMOS Field Programmable Gate Array (FPGA) technology in place of LS TTL devices, the ability to acquire various types of data like high level single ended or differential analog, bi-level events and two channels of high speed asynchronous serial data from On-Board Computers (OBCs), usage of HMC technology for the reduction of discrete parts etc. The entire system is realised on a single 6 layer MLB and is packaged on a stackable modular frame. This paper discusses the design approach, tools used, simulations carried out, implementation details and the results of detailed qualification tests done on the realised qualification model. Remote Unit (RU) Field Programmable Gate Array (FPGA) Serial links Hybrid Micro Circuit (HMC) On-Board Computer (OBC)
1567	A novel parallel algorithm for surface editing and its FPGA implementation Liu, Yukun January 2013 (has links) Surface modelling and editing is one of important subjects in computer graphics. Decades of research in computer graphics has been carried out on both low-level, hardware-related algorithms and high-level, abstract software. Success of computer graphics has been seen in many application areas, such as multimedia, visualisation, virtual reality and the Internet. However, the hardware realisation of OpenGL architecture based on FPGA (field programmable gate array) is beyond the scope of most of computer graphics researches. It is an uncultivated research area where the OpenGL pipeline, from hardware through the whole embedded system (ES) up to applications, is implemented in an FPGA chip. This research proposes a hybrid approach to investigating both software and hardware methods. It aims at bridging the gap between methods of software and hardware, and enhancing the overall performance for computer graphics. It consists of four parts, the construction of an FPGA-based ES, Mesa-OpenGL implementation for FPGA-based ESs, parallel processing, and a novel algorithm for surface modelling and editing. The FPGA-based ES is built up. In addition to the Nios II soft processor and DDR SDRAM memory, it consists of the LCD display device, frame buffers, video pipeline, and algorithm-specified module to support the graphics processing. Since there is no implementation of OpenGL ES available for FPGA-based ESs, a specific OpenGL implementation based on Mesa is carried out. Because of the limited FPGA resources, the implementation adopts the fixed-point arithmetic, which can offer faster computing and lower storage than the floating point arithmetic, and the accuracy satisfying the needs of 3D rendering. Moreover, the implementation includes Bézier-spline curve and surface algorithms to support surface modelling and editing. The pipelined parallelism and co-processors are used to accelerate graphics processing in this research. These two parallelism methods extend the traditional computation parallelism in fine-grained parallel tasks in the FPGA-base ESs. The novel algorithm for surface modelling and editing, called Progressive and Mixing Algorithm (PAMA), is proposed and implemented on FPGA-based ES’s. Compared with two main surface editing methods, subdivision and deformation, the PAMA can eliminate the large storage requirement and computing cost of intermediated processes. With four independent shape parameters, the PAMA can be used to model and edit freely the shape of an open or closed surface that keeps globally the zero-order geometric continuity. The PAMA can be applied independently not only FPGA-based ESs but also other platforms. With the parallel processing, small size, and low costs of computing, storage and power, the FPGA-based ES provides an effective hybrid solution to surface modelling and editing. 006.6
1568	Approche multi-processeurs homogènes sur System-on-Chip pour le traitement d'image Damez, Lionel 17 December 2009 (has links) (PDF) La conception de prototypes de systèmes de vision en temps réel embarqué est sujet à de multiples contraintes sévères et fortement contradictoires. Dans le cas de capteurs dits "intelligents", il est nécessaire de fournir une puissance de traitement suffisante pour exécuter les algorithmes à la cadence des capteurs d'images avec un dispositif de taille minimale et consommant peu d'énergie. La conception d'un système monopuce (ou SoC) et l'implantation d'algorithmes de plus en plus complexes pose problème si on veut l'associer avec une approche de prototypage rapide d'applications scientifiques. Afin de réduire de manière significative le temps et les différents coûts de conception, le procédé de conception est fortement automatisé. La conception matérielle est basée sur la dérivation d'un modèle d'architecture multiprocesseur générique de manière à répondre aux besoins de capacité de traitement et de communication spécifiques à l'application visée. Les principales étapes manuelles se réduisent au choix et au paramétrage des différents composants matériels synthétisables disponibles. La conception logicielle consiste en la parallélisation des algorithmes, qui est facilitée par l'homogénéité et la régularité de l'architecture de traitement parallèle et la possibilité d'employer des outils d'aide à la parallélisation. Avec l'approche de conception sont présentés les premiers éléments constitutifs qui permettent de la mettre en oeuvre.Ceux ci portent essentiellement sur les aspects de conception matérielle. L'approche proposée est illustrée par l'implantation d'un traitement de stabilisation temps réel vidéo sur technologie SoPC Architectures de vision Système monopuce FPGA architecture parallèle mémoire distribuée passage de message prototypage rapide
1569	Design and Multi-Technology Multi-objective Comparative Analysis of Families of MPSOC. Wang, Zhoukun 12 November 2009 (has links) (PDF) Multiprocessor system on chip (MPSOC) have strongly emerged in the past decade in communication, multimedia, networking and other embedded domains. MPSOC became a new paradigm of high performance embedded application design. This thesis addresses the design and the physical implementation of a Network on Chip (NoC) based Multiprocessor System on Chip. We studied several aspects at different design stages: high level synthesis, architecture design, FPGA implementation, application evaluation and ASIC physical implementation. We try to analysis and find the impacts of these aspects for the MPSOC's final performance, power consumption and area cost. We implemented a NoC based 16 processors embedded system on FPGA prototyping. Three NoCs provide different functionalities for sixteen PE tiles. We also demonstrated the use of our performance monitoring system for software debugging and tuning. With the bi-synchronous FIFO method, our GALS architecture successfully solves the long clock signal distribution problem and allows that each clock domain can run at its own clock frequency. On the other hand we successfully implemented AES and TDES block cipher cryptographic algorithms on this platform and results show linear speedup in computation time. The network part of our architecture has been implemented on ASIC technology and has been explored with different timing constraints and different library categories of STmicroelectronics' 65nm/45nm technologies. The experimental results of ASIC and FPGA are compared, and we inducted the discussion of technology change impact on parallel programming. Network-on-Chip Multiprocessor system on chip High level synthesis Fpga Asic
1570	FPGA Prototyping of a Watermarking Algorithm for MPEG-4 Cai, Wei 05 1900 (has links) In the immediate future, multimedia product distribution through the Internet will become main stream. However, it can also have the side effect of unauthorized duplication and distribution of multimedia products. That effect could be a critical challenge to the legal ownership of copyright and intellectual property. Many schemes have been proposed to address these issues; one is digital watermarking which is appropriate for image and video copyright protection. Videos distributed via the Internet must be processed by compression for low bit rate, due to bandwidth limitations. The most widely adapted video compression standard is MPEG-4. Discrete cosine transform (DCT) domain watermarking is a secure algorithm which could survive video compression procedures and, most importantly, attacks attempting to remove the watermark, with a visibly degraded video quality result after the watermark attacks. For a commercial broadcasting video system, real-time response is always required. For this reason, an FPGA hardware implementation is studied in this work. This thesis deals with video compression, watermarking algorithms and their hardware implementation with FPGAs. A prototyping VLSI architecture will implement video compression and watermarking algorithms with the FPGA. The prototype is evaluated with video and watermarking quality metrics. Finally, it is seen that the video qualities of the watermarking at the uncompressed vs. the compressed domain are only 1dB of PSNR lower. However, the cost of compressed domain watermarking is the complexity of drift compensation for canceling the drifting effect. MPEG-4 watermarking FPGA MPEG (Video coding standard) Video compression. Digital watermarking. Field programmable gate arrays.

Search results