Global ETD Search

1	RISC-V Compiler Performance: A Comparison between GCC and LLVM/clang Bjäreholt, Johan January 2017 (has links) RISC-V is a new open-source instruction set architecture (ISA) whose first mass-produced processors were manufactured in December 2016. It focuses on both efficiency and performance and differs from other open-source architectures by not having a copyleft license, permitting vendors to freely design, manufacture and sell RISC-V chips without any fees and without having to share their modifications to the reference implementations of the architecture. The goal of this thesis is to evaluate the performance of the GCC and LLVM/clang compilers' support for the RISC-V target and their ability to optimize for the architecture. The performance is evaluated by executing the CoreMark and Dhrystone benchmarks, both popular industry-standard programs for evaluating performance on embedded processors. They are compiled with both GCC and LLVM/clang at different optimization levels and compared in performance per clock to the ARM architecture, which is mature yet rather similar to RISC-V. The compiler support for the RISC-V target is still in development, and the focus of this thesis is the current performance differences between the GCC and LLVM compilers on this architecture. The benchmarks are executed on the Freedom E310 processor on the SiFive HiFive1 board for RISC-V and on an ARM Cortex-M4 processor by Freescale on the Teensy 3.6 board. The Freedom E310 is almost identical to the reference Berkeley Rocket RISC-V design, and the ARM Cortex-M4 processor has a similar clock speed and is aimed at a similar target audience. The results showed that the -O2 and -O3 optimization levels on GCC for RISC-V performed very well in comparison to our ARM reference.
On the lower -O1 optimization level, on -O0 (no optimizations) and on -Os (which is -O0 with optimizations for generating a smaller executable code size), GCC performs much worse than ARM, at 46% of the performance at -O1, 8.2% at -Os and 9.3% at -O0 on the CoreMark benchmark, with similar results in Dhrystone except at -O1, where it performed as well as ARM. With optimizations turned off (-O0), GCC for RISC-V reached 9.2% of the performance on ARM in CoreMark and 11% in Dhrystone, which was unexpected and needs further investigation. LLVM/clang, on the other hand, crashed when trying to compile our CoreMark benchmark, and on Dhrystone the optimization options had a very minor impact on performance, leaving it at 6.0% of the performance of GCC on -O3 and 5.6% of the performance of ARM on -O3, so even with optimizations it was still slower than GCC without optimizations. In conclusion, RISC-V with the GCC compiler performs very well on the higher optimization levels considering how young the RISC-V architecture is. There does seem to be room for improvement on the lower optimization levels, however, which in turn could possibly also increase the performance of the higher optimization levels. With the LLVM/clang compiler, on the other hand, a lot of work needs to be done to make it competitive in both performance and stability with the GCC compiler and other architectures. Why the -O0 optimization level is so considerably slower on RISC-V than on ARM was also very unexpected and needs further investigation. RISC-V Compiler Benchmarking Code Optimization Microprocessors Computer Systems
2	Generování objektových souborů pro RISC-V / Generation of Object Files for RISC-V Benna, Filip January 2017 (has links) This master’s thesis deals with the topic of program source code compilation for RISC-V processor architecture. The generated object files need to be compatible with GNU binutils open source tools which are already available for the architecture. The focus is on relocations which must be correctly detected in Codasip Studio tools and transformed into RISC-V platform specific relocation types.
3	Formální verifikace RISC-V procesoru s využitím Questa PropCheck / Formal verification of RISC-V processor with Questa PropCheck Javor, Adrián January 2020 (has links) The topic of this master's thesis is formal verification of a RISC-V processor with Questa PropCheck using SystemVerilog assertions. The theoretical part covers the RISC-V architecture, describes the selected components of the Codix Berkelium 5 processor used for formal verification, and examines the AHB-lite communication protocol as well as formal verification and its methods and tools. The experimental part consists of verification planning for the selected components, subsequent formal verification, analysis of the results, and evaluation of the benefits of formal techniques.
4	XBT: FPGA Accelerated Binary Translation Chai, Ke 01 September 2021 (has links) No description available. Computer Engineering FPGA binary translation MIPS RISC-V Zynq
5	Development of Classroom Tools for a RISC-V Embedded System Phillips, Lucas 01 May 2022 (has links) RISC-V is an open-source instruction set that has been gaining popularity in recent years, and, with support from large chip manufacturers like Intel and the benefits of its open-source nature, RISC-V devices are likely to continue gaining momentum. Many courses in a computer science program involve development on an embedded device. Usually, this device is of the ARM architecture, like a Raspberry Pi. With the increasing use of RISC-V, it may be beneficial to use a RISC-V embedded device in one of these classroom environments. This research intends to assist development on the SiFive HiFive1 RevB, which is a RISC-V embedded device. This device was chosen because of its ease of use, functionality-rich API, and affordability. In order to make developing with this board very approachable for a student, this research involved the development of a small suite of tools. These tools support common functionality like: building a source file into an executable ELF file, converting that ELF executable into an Intel HEX executable format that is required to run on the device, uploading the Intel HEX executable onto the device, and attaching a debug session to the program that is running on the device. With the help of this toolchain, developing on this RISC-V embedded device should be very approachable for most students. RISC-V ISA Embedded System Instruction Set Architecture Systems Architecture
6	Viability and Implementation of a Vector Cryptography Extension for Risc-V Skelly, Jonathan W 01 June 2022 (has links) (PDF) RISC-V is an open-source instruction-set architecture (ISA) forming the basis of thousands of commercial and experimental microprocessors. The Scalar Cryptography extension ratified in December 2021 added scalar instructions that target common hashing and encryption algorithms, including SHA2 and AES. The next step forward for the RISC-V ISA in the field of cryptography and digital security is the development of vector cryptography instructions. This thesis examines if it is viable to add vector implementations of existing RISC-V scalar cryptography instructions to the existing vector instruction format, and what improvements they can make to the execution of SHA2 and AES algorithms. Vector cryptography instructions vaeses, vaesesm, vaesds, vaesdsm, vsha256sch, and vsha256hash are proposed to optimize AES encryption and decryption, SHA256 message scheduling, and SHA256 hash rounds, with pseudocode, assembly examples, and a full 32-bit instruction format for each. Both algorithms stand to benefit greatly from vector instructions in reduction of computation time, code length, and instruction memory utilization due to large operand sizes and frequently repeated functions. As a proof of concept for the vector cryptography operations proposed, a full vector-based AES-128 encryption and SHA256 message schedule generation are performed on the 32-bit RISC-V Ibex processor and 128-bit Vicuna Vector Coprocessor in the Vivado simulation environment. Not counting stores or loads for fair comparison, the new Vector Cryptography extension completes a full encryption round in a single instruction compared to sixteen with the scalar extension, and can generate eight SHA256 message schedule double-words in a single instruction compared to the forty necessary on the scalar extension. 
These represent a 93.75% and 97.5% reduction in required instructions and memory for these functions respectively, at a hardware cost of 19.4% more LUTs and 1.44% more flip-flops on the edited Vicuna processor compared to the original. Cryptography RISC-V ISA AES SHA Verilog Computer and Systems Architecture
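The reduction percentages quoted in the abstract of entry 6 follow directly from the instruction counts it gives (one vector instruction versus sixteen or forty scalar ones). A minimal sketch of that arithmetic, where the function name `instruction_reduction` is an illustrative assumption and not something taken from the thesis:

```python
def instruction_reduction(scalar_count: int, vector_count: int) -> float:
    """Return the percentage of instructions saved when `vector_count`
    instructions replace `scalar_count` instructions."""
    if scalar_count <= 0:
        raise ValueError("scalar_count must be positive")
    return (1 - vector_count / scalar_count) * 100


# Counts as stated in the abstract (loads and stores excluded).
aes_round = instruction_reduction(scalar_count=16, vector_count=1)
sha_schedule = instruction_reduction(scalar_count=40, vector_count=1)

print(f"AES encryption round:    {aes_round:.2f}% fewer instructions")
print(f"SHA256 message schedule: {sha_schedule:.2f}% fewer instructions")
```

The same formula applies to any pair of instruction counts; it says nothing about cycle counts or latency, which the abstract reports separately as hardware cost.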
7	Adding native support for task scheduling to a Linux-capable RISC-V multicore system / Adicionando suporte nativo a paralelismo de tarefas a um sistema RISC-V multicore com suporte a Linux Morais, Lucas Henrique 22 August 2019 (has links) The Task Scheduling Paradigm is a general technique for leveraging fine- and coarse-grained parallelism from applications of several domains with minimum impact on code readability, relying on the automatic inference of data dependencies among tasks. The performance of Task Parallel applications is correlated with the speed at which the underlying Task Scheduling System is able to detect such dependencies, something that is critical for fine-granularity workloads, which cannot amortize scheduling overheads with long periods of useful computation. That being the case, several groups have recently been developing FPGA-accelerated Task Scheduling System architectures in which a software Task Scheduling Runtime is able to offload its bookkeeping computations to an FPGA-based accelerator with the goal of efficiently scheduling fine-grained tasks to CPU cores. Even though these FPGA-accelerated systems offer substantial gains over the software-only baseline, FPGA-CPU communication bottlenecks prevent such designs from handling scenarios with either a large number of cores or very fine-grained tasks. With that in mind, we proposed the implementation of a Native Task Scheduling System, that is, a processor with native support for task scheduling embedded into its architecture, with the goal of substantially reducing these overheads. More specifically, this project aimed at embedding the HW logic of Picos, a mature Task Scheduling Accelerator developed by the Barcelona Supercomputing Center (BSC), into Rocket Chip, an open-source, silicon-proven, multi-core implementation of RISC-V.
The ISA of the resulting system provides special instructions for Task Applications to interact with this Task Scheduling Logic, ruling out all FPGA-CPU communication latencies. To evaluate the prototype performance, we both (1) adapted Nanos, a mature Task Scheduling runtime, to benefit from the new task-scheduling-accelerating instructions; and (2) developed Phentos, a new HW-accelerated lightweight Task Scheduling runtime. Our experiments show that task parallel programs using Nanos-RV (the Nanos version ported to our system) are on average 2.13 times faster than those being serviced by baseline Nanos, while programs running on Phentos are 13.19 times faster, considering geometric means. Using eight cores, Nanos-RV is able to deliver speedups with respect to serial execution of up to 5.62 times, while Phentos produces speedups of up to 5.72 times. / Task Parallelism is a generic technique for extracting parallelism of arbitrary granularity, applicable to programs from several domains with minimal impact on code readability, based on the automatic inference of data dependencies among tasks. The performance of parallel applications based on this paradigm depends on the speed with which the supporting Task Parallelism runtime is able to detect such dependencies, which is even more critical for applications involving fine-grained tasks, since in that scenario the scheduling overhead is not amortized by significantly longer periods of useful computation. Recently, several groups have developed FPGA-accelerated Task Parallelism support systems, which are able to offload dependency-inference operations to an FPGA accelerator in order to improve their performance when dealing with fine-grained tasks.
On the other hand, although these FPGA-accelerated systems show substantial gains over purely software-based alternatives, their performance is hampered by communication bottlenecks between the CPU and the FPGA, which limit their ability to handle scenarios involving a large number of cores or very fine-grained tasks. Motivated by this, we implemented a Native Task Parallelism Support System, that is, a processor with native architectural support for Task Parallelism, with the goal of considerably reducing such communication overheads. More specifically, we integrated the hardware logic of Picos, a Task Parallelism accelerator developed by the Barcelona Supercomputing Center (BSC), into Rocket Chip, an open-source multi-core implementation of RISC-V developed by the University of California, Berkeley. The resulting system contains in its ISA (Instruction Set Architecture) the instructions needed for task-based applications to interact directly with this scheduling logic, minimizing the overheads associated with intermediate runtimes and eliminating all FPGA-CPU communication latency. To evaluate the performance of the resulting prototype, we both (1) adapted the Nanos task-scheduling runtime so that it could be accelerated by the new task-scheduling instructions, and (2) created a new lightweight task-scheduling runtime, which we named Phentos.
Our experiments show that task-parallel programs using the Nanos-RV runtime (the version of the Nanos runtime with support for the system we produced) run on average 2.13 times faster than versions of the same programs using the baseline version of Nanos, while programs run with Phentos are on average 13.19 times faster than their counterparts based on the same baseline version of Nanos. These average values are geometric means of the relevant data sets. Using eight cores, Nanos-RV delivers performance gains over serial execution of up to 5.62 times, while Phentos delivers gains of up to 5.72 times. Chisel Task parallelism Parallel programming RISC-V Rocket chip Task scheduling
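The abstract of entry 7 reports its average speedups as geometric means. The sketch below shows how such an average could be computed from per-benchmark speedups. The input values are hypothetical placeholders chosen to make the arithmetic easy to follow; they are not measurements from the thesis, and the function name is likewise an assumption.

```python
import math


def geometric_mean(speedups):
    """Geometric mean of positive speedup ratios, computed in log space."""
    if not speedups or any(s <= 0 for s in speedups):
        raise ValueError("speedups must be a non-empty list of positive ratios")
    return math.exp(sum(math.log(s) for s in speedups) / len(speedups))


# Hypothetical per-benchmark speedups over a baseline runtime (illustrative only).
example_speedups = [2.0, 8.0]

geo = geometric_mean(example_speedups)
arith = sum(example_speedups) / len(example_speedups)

print(f"geometric mean:  {geo:.2f}x")
print(f"arithmetic mean: {arith:.2f}x")
```

The geometric mean is commonly preferred for ratios such as speedups because a single large outlier inflates it less than it inflates the arithmetic mean.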
8	Evaluation of embedded processors for next generation asic : Evaluation of open source Risc-V processors and tools ability to perform packet processing operations compared to Arm Cortex M7 processors / Utvärdering av inbyggda processorer för nästa generation asic : Utvärdering av öppen källkod Risc-V processorer och verktyg’s förmåga att utföra databehandlingsfunktioner i jämförelse med en Arm Cortex M7 processor Musasa Mutombo, Mike January 2021 (has links) Nowadays, network processors are an integral part of information technology. With the deployment of 5G networks ramping up around the world, numerous new devices are going to take advantage of their processing power and programming flexibility. Information technology providers such as Ericsson spend a great amount of financial resources on licensing deals to use processors with proprietary instruction set architecture designs from companies like Arm Holdings. A new non-proprietary instruction set architecture known as Risc-V is being developed. There are many open-source processors based on the Risc-V architecture, but it is still unclear how well an open-source Risc-V processor performs network packet processing tasks compared to an Arm-based processor. The main purpose of this thesis is to design a test model simulating and evaluating how well an open-source Risc-V processor performs packet processing compared to an Arm Cortex M7 processor. This was done by writing C code that simulates some key packet processing functions, processing 50 randomly generated 72-byte data packets. The following functions were tested: framing, parsing, pattern matching, and classification. The code was ported to and executed on both an Arm Cortex M7 processor and an emulated open-source Risc-V processor. A working packet processing test code was built and evaluated on an Arm Cortex M7 processor. Three different open-source Risc-V processors were tested: Arianne, SweRV core, and Rocket-chip.
The execution time of both cases was analyzed and compared. The execution time of the test code on Arm was 67.5 ns. Based on the results, it can be argued that open-source Risc-V processor tools are not yet fully reliable or ready to be used for packet processing applications. Further evaluation should be performed on this topic, with a more in-depth look at the SweRV core processor and at physical open-source Risc-V hardware instead of emulators. / Network processors are an important building block of information technology today. As 5G networks are rolled out around the world, many more devices will be able to take advantage of their powerful performance and programming flexibility. Information technology companies such as Ericsson spend considerable financial resources on licenses to use processors based on proprietary instruction set architectures from Arm Holdings. It is very costly to keep buying licenses, since these architectures are a building block in the design of many processors and other components. Today there is a promising new processor instruction set architecture that is not licensed, called Risc-V. Thanks to Risc-V, many proprietary and open-source processors have been developed. However, very little is known today about how well they perform in network applications. Can an open-source Risc-V processor perform network data processing functions as well as a proprietary Arm Cortex M7 processor? The main purpose of this work is to build a test model that examines how well an open-source Risc-V based processor performs processing operations on network data packets compared to an Arm Cortex M7 processor. This was done by developing C code that simulates the reception and processing of 72-byte data packets. The following functions were tested: framing, parsing, pattern matching, and classification.
The code was compiled and tested on both an Arm Cortex M7 processor and three different emulated open-source Risc-V processors: Arianne, SweRV core, and Rocket-chip. After testing several open-source Risc-V processors and running the test code on an Arm Cortex M7 processor, it can be argued that the open-source Risc-V processor tools are not yet sufficiently reliable. This report suggests that open-source Risc-V emulators and tools need further development before they can be used in network applications. Further investigation of this topic is needed, for example a more in-depth study of the SweRV core processor or of physical open-source Risc-V hardware. Network processing Risc-V Packet processing Instruction set architecture Open-source Network processors Open-source processors Computer Sciences
9	RISC-V Thread Isolation : Using Zephyr RTOS / RISC-V Trådisolering : Med Zephyr RTOS Midéus, Gustav, Morales Chavez, Antonio January 2020 (has links) Many embedded systems lack a memory management unit (MMU) and thus often also lack protection of memory. This causes these systems to be less robust, since the operating system (OS), processes, and threads are no longer isolated from each other. This is also a potential security issue, and with the number of embedded systems rapidly increasing as a result of the rise of the Internet of things (IoT), vulnerabilities like this could become a major problem. However, a recent update to the RISC-V processor architecture introduced the possibility of isolating regions of memory without an MMU. This study aims to identify problems and possibilities of implementing such memory protection with RISC-V. Based on a study of literature and documentation on memory protection and the RISC-V architecture, a prototype was designed and implemented to determine potential problems and evaluate performance in terms of execution time and memory cost. The developed prototype showed a working implementation of memory protection for the memory regions with RISC-V. The evaluation of the prototype demonstrated an increase in context switch execution time and memory usage. The results indicate that the implemented memory protection comes with an increased cost in performance with a constant factor and a small memory overhead. Therefore, it is recommended that implementations that wish to use memory protection with RISC-V on smaller embedded systems, where time and memory may be crucial, take the overhead into consideration. Further research and testing are needed to identify optimizations that could improve the performance as well as to discover security flaws. / Many embedded systems lack a memory management unit (MMU) and therefore usually lack memory protection.
This makes these systems less robust, since the operating system, processes, and threads are no longer isolated from each other. This is also a security weakness, and with the number of embedded systems growing rapidly due to the rise of the Internet of things (IoT), vulnerabilities like this could become a major problem. A recently introduced update to the RISC-V processor architecture made it possible to isolate memory without the help of an MMU. This study aims to identify problems and possibilities of implementing such memory protection with RISC-V. Based on a study of literature and documentation on memory protection and the RISC-V architecture, a prototype was designed and implemented to help determine problems and possibilities and to evaluate performance and memory costs. The developed prototype showed a working implementation of memory protection for memory regions with RISC-V. The evaluation of the prototype showed increased context switch execution time and increased memory usage. The results indicate that the implemented memory protection comes with an increased performance cost of a constant factor and a small memory overhead. It is therefore recommended that implementations that wish to use memory protection with RISC-V on smaller embedded systems, where time and memory may be crucial, take the overhead into account. Further studies and tests are needed to identify optimizations that could improve performance and to discover security flaws. RISC-V memory protection embedded systems thread isolation real-time operating system (RTOS) IoT Computer Engineering
10	Design a Three-Stage Pipelined RISC-V Processor Using SystemVerilog He, Ziyan January 2022 (has links) RISC-V is growing in popularity as a free and open RISC Instruction Set Architecture (ISA) in academia and research. Its advantages, including openness, simplicity, extensibility, and modularity, also make it increasingly used by designers in industry. The aim of this thesis is to design an open-source RISC-V processor. The development of this RISC-V processor was based on the prototype made in the course IL2232 Embedded Systems Design Project (SoI-CMOS Design group), targeting an experimental high-temperature SoC CMOS process. SystemVerilog was used for RTL coding. ModelSim was used for RTL simulation. Genus was used for digital synthesis and Innovus was used for digital place & route. The thesis concludes that this RISC-V processor can run the compiled C code produced by the virtual platform tool Imperas OVP. The processor is based on the RV32IM instruction set. Through simulation, the CPI of this RISC-V processor can be collected while running different benchmark programs developed in two Master's theses carried out in parallel with this one. To a certain extent, the CPI reflects the performance of the processor. However, the actual execution time needs to be tested by loading the processor onto hardware. This part is not discussed in this thesis but is left for future work. The gate count is collected by digital synthesis and the corresponding area is collected after digital place & route. / RISC-V is growing in popularity as a free and open RISC ISA in academia and research. Its advantages, including openness, simplicity, extensibility, and modularity, make it increasingly used by designers in industry. The aim of this thesis is to design an open-source RISC-V processor.
The development of this RISC-V processor was based on the prototype made in the course IL2232 Embedded Systems Design Project (SoI-CMOS Design group), targeting an experimental high-temperature SoC CMOS process. SystemVerilog was used for RTL coding. ModelSim was used for RTL simulation. Genus was used for digital synthesis and Innovus was used for digital place & route. The thesis concludes that this RISC-V processor can run the compiled C code produced by the virtual platform tool Imperas OVP. The processor is based on the RV32IM instruction set. Through simulation, the CPI of this RISC-V processor can be collected while running different benchmark programs developed in two Master's theses carried out in parallel with this one. To a certain extent, this reflects the performance of the processor. However, the actual execution time needs to be tested by loading the processor onto hardware. This part is not discussed in this thesis but is left for future work. The gate count is collected by digital synthesis and the corresponding area is collected after digital place & route. RISC RISC-V ISA SystemVerilog RTL simulation RV32IM CPI Electrical and Electronic Engineering
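Entry 10 relies on CPI (cycles per instruction) gathered from RTL simulation as a performance indicator and notes that actual execution time still has to be measured on hardware. A minimal sketch of how the two quantities relate, using the standard definitions; the cycle count, instruction count and clock frequency below are hypothetical placeholders, not figures from the thesis.

```python
def cycles_per_instruction(cycles: int, instructions: int) -> float:
    """CPI = total clock cycles / retired instructions."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return cycles / instructions


def estimated_runtime_seconds(instructions: int, cpi: float, clock_hz: float) -> float:
    """Execution time = instructions * CPI / clock frequency."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return instructions * cpi / clock_hz


# Hypothetical simulation counters for one benchmark run (illustrative only).
sim_cycles = 1500
sim_instructions = 1000
assumed_clock_hz = 1_000_000  # assumed 1 MHz clock, not a measured value

cpi = cycles_per_instruction(sim_cycles, sim_instructions)
runtime = estimated_runtime_seconds(sim_instructions, cpi, assumed_clock_hz)

print(f"CPI: {cpi:.2f}")
print(f"estimated runtime at the assumed clock: {runtime * 1e3:.3f} ms")
```

Because the achievable clock frequency is only known after synthesis and place & route on a given process, a runtime computed this way is an estimate; this is consistent with the abstract leaving hardware timing to future work.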
