• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 5
  • 1
  • 1
  • Tagged with
  • 7
  • 4
  • 4
  • 4
  • 3
  • 3
  • 3
  • 3
  • 3
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Evaluation of AXI-Interfaces for Hardware Software Communication

Sharma, Ankit 01 February 2019 (has links)
A SoC design approach is implemented for the MERGE project which features Machine Learning (ML) interface for the hardware design. This setup deals with detection and localization of impact on a piezo metal composite. Development of the project is executed on Digilent ZYBO board. ZYBO incorporates Xilinx ZYNQ architecture. This architecture provides Processing System (PS) and Programmable Logic (PL) that communicate with each other via AMBA Standard AXI4 Interface. Communication cost have major inuence on the system performance. A optimized hardware software partitioning solution will reduce the communication costs. Therefore, best fitting interface for the provided design is needed to be evaluated to trade-off between cost and performance. High performance of AXI Interface will provide efficient localization of impact, especially for real-time scenario. In the thesis, the performance of three different AXI4 interface are evaluated. Evaluation is performed on the basis of the amount of data transferred and the time taken to process it. Evaluation of interfaces are done through implementation of test cases in Xilinx SDK. Hardware design for AXI4-Interfaces is implemented in Vivado and later tested on Digilent ZYBO board. To test the performance of interfaces, read and write operations are initiated by PS on interface design. Each operation is performed for multiple data lengths. Average execution time is calculated that highlights time taken to transfer the corresponding input data length. Through these tests, it is found that AXI4-Stream is the best choice for a continuous set of data. Preferably, it provides unlimited burst length which is useful for the current project. Among other two interfaces, AXI4-Full performed better in terms of execution time as compared to AXI4-Lite.
2

Bus System for Coresonic SIMT DSP

Svensk, Gustav January 2016 (has links)
This thesis consists of designing and implementing a bus system for a specific computersystem for MediaTek Sweden AB . The focus of the report is to show the considerations andchoices made in the design of a suitable bus system. Implementation details describe howthe system is constructed. The results show that it is possible to maintain a high bandwidthin many parts of the system if an appropriate topology is chosen. If all units in a bus systemare synchronous it is difficult to reach low latency in the communication.
3

SYSTEM ON CHIP : Fördelar i konstruktion med system on chip i förhållande till fristående FPGA och processor / SYSTEM ON CHIP : Advantages of the design of system-on-chip compared to independent FPGA and processor

Ljungberg, Jan January 2015 (has links)
In this exam project the investigation has been done to determine, which profits that can be made by switching an internal bus between two chips, one FPGA and a processor, to an internal bus implemented on only one chip, System on Chip. The work is based on measurements made in real time in Xilinx’s development tools on different buses, AXI4 and AXI4-Light connected to AXI3. The port that is used is FPGA’s own GP-port. Besides measuring the time of transactions also physical aspects have been investigated in this project: space, costs and time. Based on those criteria a comparison to the original construction was made to determine which benefits that can be achieved. The work has shown a number of results that are in comparison with the original construction. The System on Chip has turned out to be a better solution in most cases. When using the AXI4-Light-bus the benefits were not as obvious. Cosmic radiation, temperature or humidity are beyond the scope of this investigation. In the work the hypothetic deductive method has been used to prove that the System on Chip is faster than the original design. In this method three statements must be set up against each other; one statement that ought to be true, one statement that is a contradiction and a conclusion of what is proved. The pre-study pointed out that the System on Chip is a faster solution than the original construction. The method is useful since it proves that the pre-study is comparable to the measured results. / I detta examensarbete har undersökningar gjorts för att fastställa vilka vinster som går att göra genom att byta en internbuss mellan två chip, en FPGA och en processor, mot en intern buss implementerat på ett enda chip, System on Chip. Arbetet bygger på mätningar gjorda i realtid i Xilinx utvecklingsverktyg på olika bussar, AXI4 och AXI4‑Lite som är kopplade internt mot AXI3. Den port som används är FPGAs egen GP‑port. Förutom att mäta överföringshastigheterna, har även fysiska aspekter som utrymme, kostnader och utvecklingstid undersökts. Utifrån dessa kriterier har en jämförelse gjorts med den befintliga konstruktionen för att fastställa vilka vinster som går att uppnå. Arbetet har resulterat i ett antal resultat som är ställda mot de förutsättningar som fanns i den ursprungliga lösningen. I de flesta fall visar resultatet att ett System on Chip är en bättre lösning. De fall som var tveksamma var vid viss typ av överföring med AXI4‑Lite bussen. I arbetet har inte undersökning av kosmisk strålning, temperatur eller luftfuktighet betraktas. I arbetet med att försöka att bevisa att ett System on Chip är snabbare än den ursprungliga uppsättningen har utvecklingsmetoden hypotetisk deduktiv använts. Denna metod bygger på att man från början sätter upp ett påstående, som man förutsätter är sant, följt av en konjunktion, som inte får inträffa, för att slutligen dra en slutsats, som konstaterar fakta. Eftersom fakta som lästes in i början av arbetet pekade på att ett System on Chip var en snabbare och billigare lösning kändes metoden användbar. Under arbetets gång har det visat sig vara en bra metod som också ger ett resultat där sannolikheten för att det är en snabbare lösning ökar. Däremot säger inte metoden att det är helt säkert att den i alla situationer är bättre, vilket kan ändras om man använder andra förutsättningar eller tar med andra aspekter.
4

A Cycle-Accurate Simulator for Accelerating Convolution on AXI4-based Network-on-Chip Architecture / En cykelexakt simulator för att accelerera konvolution på AXI4-baserad nätverk-på-chip-arkitektur

Liu, Mingrui January 2024 (has links)
Artificial intelligence is probably one of the most prevalent research topics in computer science area, because the technology, if well developed and used properly, is promising to affect the daily lives of ordinaries or even reshape the structure of society. In the meantime, the end of Moore’s Law has promoted the development trend towards domain-specific architectures. The upsurge in researching specific architectures for artificial intelligence applications is unprecedented. Network-on-Chip (NoC) was proposed to address the scalability problem of multi-core system. Recently, NoC has gradually appeared in deep learning computing engines. NoC-based deep learning accelerator is an area worthy of research and currently understudied. Simulating a system is an important step in computer architecture research because it not only allows for rapid verification and measurement of design’s performance, but also provides guidance for subsequent hardware design. In this thesis, we present CNNoCaXiM, a flexible and cycle-accurate simulator for accelerating 2D convolution based on NoC interconnection and AXI4 protocol. We demonstrate its ability by simulating and measuring a convolution example with two different data flows. This simulator can be very useful for upcoming research, either as a baseline case or as a building block for further research. / Artificiell intelligens är förmodligen ett av de vanligaste forskningsämnena inom datavetenskap, eftersom tekniken, om den väl utvecklas och används på rätt sätt, lovar att påverka vanliga människors vardag eller till och med omforma samhällets struktur. Under tiden har slutet av Moores lag främjat utvecklingstrenden mot domänspecifika arkitekturer. Uppsvinget i forskning om specifika arkitekturer för tillämpningar av artificiell intelligens är utan motstycke. Network-on-Chip (NoC) föreslogs för att ta itu med skalbarhetsproblemet med flerkärniga system. Nyligen har NoC gradvis dykt upp i djuplärande datormotorer. NoC-baserad accelerator för djupinlärning är ett område som är värt forskning och för närvarande understuderat. Simulering av ett system är ett viktigt steg i forskning om datorarkitektur eftersom det inte bara möjliggör snabb verifiering och mätning av designens prestanda, utan också ger vägledning för efterföljande hårdvarudesign. I detta examensarbete presenterar vi CNNoCaXiM, en flexibel och cykelnoggrann simulator för att accelerera 2D-faltning baserad på NoC-interconnection och AXI4-protokoll. Vi visar dess förmåga genom att simulera och mäta ett faltningsexempel med två olika dataflöden. Denna simulator kan vara mycket användbar för kommande forskning, antingen som ett grundfall eller som en byggsten för vidare forskning.
5

Paměťový subsystém v SystemC / SystemC Memory Subsystem

Michl, Kamil January 2020 (has links)
This thesis deals with the design and implementation of a processor simulation memory subsystem. The memory subsystem is designed using the Transaction Level Modeling approach. The implementation is done in C++ language utilizing the SystemC library. The processor simulation is adopted from the Codasip company simulator. The objective is to create a functional connection between the processor and the memory inside the simulator. This connection supports communication protocols of AHB3-lite, AXI4-lite, CPB, and CPB-lite buses. The new implementation of the aforementioned connection and the memory is integrated into the original simulator. The resulting simulator is tested using unit tests.
6

High-Performance Network-on-Chip Design for Many-Core Processors

Wang, Boqian January 2020 (has links)
With the development of on-chip manufacturing technologies and the requirements of high-performance computing, the core count is growing quickly in Chip Multi/Many-core Processors (CMPs) and Multiprocessor System-on-Chip (MPSoC) to support larger scale parallel execution. Network-on-Chip (NoC) has become the de facto solution for CMPs and MPSoCs in addressing the communication challenge. In the thesis, we tackle a few key problems facing high-performance NoC designs. For general-purpose CMPs, we encompass a full system perspective to design high-performance NoC for multi-threaded programs. By exploring the cache coherence under the whole system scenario, we present a smart communication service called Advance Virtual Channel Reservation (AVCR) to provide a highway to target packets, which can greatly reduce their contention delay in NoC. AVCR takes advantage of the fact that we can know or predict the destination of some packets ahead of their arrival at the Network Interface (NI). Exploiting the time interval before a packet is ready, AVCR establishes an end-to-end highway from the source NI to the destination NI. This highway is built up by reserving the Virtual Channel (VC) resources ahead of the target packet transmission and offering priority service to flits in the reserved VC in the wormhole router, which can avoid the target packets’ VC allocation and switch arbitration delay. Besides, we also propose an admission control method in NoC with a centralized Artificial Neural Network (ANN) admission controller, which can improve system performance by predicting the most appropriate injection rate of each node using the network performance information. In the online control process, a data preprocessing unit is applied to simplify the ANN architecture and make the prediction results more accurate. Based on the preprocessed information, the ANN predictor determines the control strategy and broadcasts it to each node where the admission control will be applied. For application-specific MPSoCs, we focus on developing high-performance NoC and NI compatible with the common AMBA AXI4 interconnect protocol. To offer the possibility of utilizing the AXI4 based processors and peripherals in the on-chip network based system, we propose a whole system architecture solution to make the AXI4 protocol compatible with the NoC based communication interconnect in the many-core system. Due to possible out-of-order transmission in the NoC interconnect, which conflicts with the ordering requirements specified by the AXI4 protocol, in the first place, we especially focus on the design of the transaction ordering units, realizing a high-performance and low cost solution to the ordering requirements. The microarchitectures and the functionalities of the transaction ordering units are also described and explained in detail for ease of implementation. Then, we focus on the NI and the Quality of Service (QoS) support in NoC. In our design, the NI is proposed to make the NoC architecture independent from the AXI4 protocol via message format conversion between the AXI4 signal format and the packet format, offering high flexibility to the NoC design. The NoC based communication architecture is designed to support high-performance multiple QoS schemes. The NoC system contains Time Division Multiplexing (TDM) and VC subnetworks to apply multiple QoS schemes to AXI4 signals with different QoS tags and the NI is responsible for traffic distribution between two subnetworks. Besides, a QoS inheritance mechanism is applied in the slave-side NI to support QoS during packets’ round-trip transfer in NoC. / Med utvecklingen av tillverkningsteknologi av on-chip och kraven på högpresterande da-toranläggning växer kärnantalet snabbt i Chip Multi/Many-core Processors (CMPs) ochMultiprocessor Systems-on-Chip (MPSoCs) för att stödja större parallellkörning. Network-on-Chip (NoC) har blivit den de facto lösningen för CMP:er och MPSoC:er för att mötakommunikationsutmaningen. I uppsatsen tar vi upp några viktiga problem med hög-presterande NoC-konstruktioner.Allmänna CMP:er omfattas ett fullständigt systemperspektiv för att design högprester-ande NoC för flertrådad program. Genom att utforska cachekoherensen under hela system-scenariot presenterar vi en smart kommunikationstjänst, AVCR (Advance Virtual ChannelReservation) för att tillhandahålla en motorväg till målpaket, vilket i hög grad kan min-ska deras förseningar i NoC. AVCR utnyttjar det faktum att vi kan veta eller förutsägadestinationen för vissa paket före deras ankomst till nätverksgränssnittet (Network inter-face, NI). Genom att utnyttja tidsintervallet innan ett paket är klart, etablerar AVCRen ände till ände motorväg från källan NI till destinationen NI. Denna motorväg byggsupp genom att reservera virtuell kanal (Virtual Channel, VC) resurser före målpaket-söverföringen och erbjuda prioriterade tjänster till flisar i den reserverade VC i wormholerouter. Dessutom föreslår vi också en tillträdeskontrollmetod i NoC med en centraliseradartificiellt neuronät (Artificial Neural Network, ANN) tillträdeskontroll, som kan förbättrasystemets prestanda genom att förutsäga den mest lämpliga injektionshastigheten för varjenod via nätverksprestationsinformationen. I onlinekontrollprocessen används en förbehan-dlingsenhet på data för att förenkla ANN-arkitekturen och göra förutsägningsresultatenmer korrekta. Baserat på den förbehandlade informationen bestämmer ANN-prediktornkontrollstrategin och sänder den till varje nod där tillträdeskontrollen kommer att tilläm-pas.För applikationsspecifika MPSoC:er fokuserar vi på att utveckla högpresterande NoCoch NI kompatibla med det gemensamma AMBA AXI4 protokoll. För att erbjuda möj-ligheten att använda AXI4-baserade processorer och kringutrustning i det on-chip baseradenätverkssystemet föreslår vi en hel systemarkitekturlösning för att göra AXI4 protokolletkompatibelt med den NoC-baserade kommunikation i det multikärnsystemet. På grundav den out-of-order överföring i NoC, som strider mot ordningskraven som anges i AXI4-protokollet, fokuserar vi i första hand på utformningen av transaktionsordningsenheterna,för att förverkliga en hög prestanda och låg kostnad-lösning på ordningskraven. Sedanfokuserar vi på NI och Quality of Service (QoS)-stödet i NoC. I vår design föreslås NI attgöra NoC-arkitekturen oberoende av AXI4-protokollet via meddelandeformatkonverteringmellan AXI4 signalformatet och paketformatet, vilket erbjuder NoC-designen hög flexi-bilitet. Den NoC-baserade kommunikationsarkitekturen är utformad för att stödja fleraQoS-schema med hög prestanda. NoC-systemet innehåller Time-Division Multiplexing(TDM) och VC-subnät för att tillämpa flera QoS-scheman på AXI4-signaler med olikaQoS-taggar och NI ansvarar för trafikdistribution mellan två subnät. Dessutom tillämpasen QoS-arvsmekanism i slav-sidan NI för att stödja QoS under paketets tur-returöverföringiNoC / <p>QC 20201008</p>
7

Modeling, Simulation, and Injection of Camera Images/Video to Automotive Embedded ECU : Image Injection Solution for Hardware-in-the-Loop Testing

Lind, Anton January 2023 (has links)
Testing, verification and validation of sensors, components and systems is vital in the early-stage development of new cars with computer-in-the-car architecture. This can be done with the help of the existing technique, hardware-in-the-loop (HIL) testing which, in the close loop testing case, consists of four main parts: Real-Time Simulation Platform, Sensor Simulation PC, Interface Unit (IU), and unit under test which is, for instance, a Vehicle Computing Unit (VCU). The purpose of this degree project is to research and develop a proof of concept for in-house development of an image injection solution (IIS) on the IU in the HIL testing environment. A proof of concept could confirm that editing, customizing, and having full control of the IU is a possibility. This project was initiated by Volvo Cars to optimize the use of the HIL testing environment currently available, making the environment more changeable and controllable while the IIS remains a static system. The IU is an MPSoC/FPGA based design that uses primarily Xilinx hardware and software (Vivado/Vitis) to achieve the necessary requirements for image injection in the HIL testing environment. It consists of three stages in series: input, image processing, and output. The whole project was divided in three parts based on the three stages and carried out at Volvo Cars in cooperation by three students, respectively. The author of this thesis was responsible for the output stage, where the main goal was to find a solution for converting, preferably, AXI4 RAW12 image data into data on CSI2 format. This CSI2 data can then be used as input to serializers, which in turn transmit the data via fiber-optic cable on GMSL2 format to the VCU. Associated with the output stage, extensive simulations and hardware tests have been done on a preliminary solution that partially worked on the hardware, producing signals in parts of the design that could be read and analyzed. However, a final definite solution that fully functions on the hardware has not been found, because the work is at the initial phase of an advanced and very complex project. Presented in this thesis is: important theory regarding, for example, protocols CSI2, AXI4, GMSL2, etc., appropriate hardware selection for an IIS in HIL (FPGA, MPSoC, FMC, etc.), simulations of AXI4 and CSI2 signals, comparisons of those simulations with the hardware signals of an implemented design, and more. The outcome was heavily dependent on getting a certain hardware (TEF0010) to transmit the GMSL2 data. Since the wrong card was provided, this was the main problem that hindered the thesis from reaching a fully functioning implementation. However, these results provide a solid foundation for future work related to image injection in a HIL environment.

Page generated in 0.0338 seconds