11

Security vs performance in a real-time separation kernel : An analysis for multicore RISC-V architecture / Säkerhet vs prestanda i en realtidsseparationskärna : En analys för multicore RISC-V arkitektur

Kultala, Henrik January 2022
In this thesis, we explored the possibility of introducing a few vulnerabilities to a separation kernel to increase its performance. We made modifications to S3K, an open-source separation kernel that is in the final stages of being designed. To test the viability of our modifications we benchmarked both the unmodified and the modified versions and compared the results. We changed the scheduler and the inter-process communication used for time sharing: we introduced side-channel vulnerabilities to allow the modified functionalities to complete their work faster. The changes to the scheduler increased performance notably when having a high scheduling overhead, but not so much with low overhead. The changes to the inter-process communication proved to have limited usefulness, as the default version was already rather quick, and the new version had the drawback of increasing the time needed for scheduling. We also tested our scheduler modifications in the inter-process communication benchmarks. This greatly improved performance in all scenarios, and it made our modifications to the inter-process communication slightly more viable. To see how our results held up in a scenario closer to a real use case we also implemented a simple cryptographic application and designed tests based on it. When we ran the tests with different combinations of including or excluding our modifications we got similar results to our previous benchmarks. Overall, our modifications to the scheduler seem like a promising change to the separation kernel, given that one is willing to introduce the side-channels that come with the changes. The modifications to the inter-process communication on the other hand are more questionable and are likely only useful in specific scenarios. / I detta arbete utforskade vi möjligheten att introducera några sårbarheter till en separationskärna för att öka dess prestanda. 
Vi modifierade S3K, en separationskärna med öppen källkod som är i slutstadiet av att designas. För att testa hur praktiskt användbara våra modifikationer var så körde vi benchmarks på både den ursprungliga versionen och den modifierade versionen och jämförde resultaten. Vi ändrade schemaläggaren och interprocesskommunikationen som används för att dela tid: sidokanalssårbarheter introducerades för att tillåta de ändrade funktionerna att göra färdigt sina arbeten snabbare. Ändringarna till schemaläggaren visade sig öka prestandan noterbart när man hade en hög schemaläggnings-overhead, men skillnaden var inte så stor med låg overhead. Ändringarna till interprocesskommunikationen visade sig ha begränsad användbarhet, då standardversionen redan var ganska snabb och den nya versionen hade nackdelen att den ökade schemaläggningstiden. Vi testade också våra schemaläggningsmodifikationer i våra benchmarks för interprocesskommunikationen. Detta ökade prestandan mycket i alla scenarion, och gjorde våra modifikationer till interprocesskommunikationen något mer praktiskt användbara. För att se hur våra resultat stod sig i ett mer verkligt scenario så implementerade vi också en simpel kryptografisk applikation, och utformade test runt den. När vi testade olika kombinationer av att inkludera eller exkludera våra modifikationer fick vi liknande resultat som vi fick i tidigare benchmarks. Överlag så verkar våra modifikationer till schemaläggaren lovande, givet att man är villig att introducera de sidokanalssårbarheter som kommer med ändringarna. Modifikationerna till interprocesskommunikationen är dock mer tveksamma, och är sannolikt bara användbara i specifika scenarion.
12

Specifikace scénářů portovatelných stimulů pro moduly procesoru RISC-V / Portable Stimulus Scenarios Specification for RISC-V Processor Modules

Bardonek, Petr January 2018
The thesis focuses on the design and implementation of portable stimulus verification scenarios for selected modules of the Berkelium processor, which is based on the RISC-V architecture and developed by Codasip. The aim of the work is to use the new Portable Stimulus standard developed by the Accellera organization to design and implement portable stimulus scenarios with the Questa InFact tool from Mentor. The proposed scenarios are linked to the existing UVM verification environments and then used to verify the Berkelium processor modules. The last part of the thesis evaluates the portability of the implemented scenarios across the individual levels of the Berkelium processor (IP blocks, subsystems, system level) by attempting to reuse the proposed scenarios at every verified level.
13

Implementing the Load Slice Core on a RISC-V based microarchitecture

Dalbom, Axel, Svensson, Tim January 2020
As cores have become better at exposing Instruction-Level Parallelism (ILP), they have become bigger and more complex, and they consume more power. These cores are quickly approaching the power wall and the memory wall. A new microarchitecture proposed by Carlson et al. claims to solve these problems. They claim that the new microarchitecture, the Load Slice Core (LSC), is able to outperform both In-Order and Out-of-Order designs in an area- and power-restricted environment. Based on the work of Carlson et al., we have implemented and evaluated a prototype version of their Load Slice Core using the In-Order Core (IOC) Ariane. We evaluated the LSC by comparing it to an IOC when running a microbenchmark designed by us, and when running a set of application benchmarks. The results from the microbenchmark are promising: the LSC outperformed the comparable IOC in each test, but problems related to the configuration of the design were found. The results from the application benchmarks are inconclusive. Due to time constraints, only a partially functioning LSC was compared to a comparable IOC. From these results we found that the LSC performed comparably to or slightly worse than its IOC counterpart. More research on the subject is required before any conclusive statement on the microarchitecture can be made, but it is the opinion of this paper’s authors that it does show promise.
14

RISC-V Based Application-Specific Instruction Set Processor for Packet Processing in Mobile Networks

Södergren, Oskar January 2021
This thesis explores the use of an ASIP for handling O-RAN control data. A model application was constructed, optimized and profiled on a simple RV32-IMC core. The compiled code was analyzed, and the instructions “byte swap”, “pack”, “bitwise extract/deposit” and “bit field place” were implemented. Synthesis of the core and profiling of the model application were done with and without each added instruction. Byte swap had the largest impact on performance (14% improvement per section, and 100% per section extension), followed by bitwise extract/deposit (10% improvement per section but no impact on section extensions). Pack and bit field place had no impact on performance. All instructions had negligible impact on core size, except for bitwise extract/deposit, which increased size by 16%. Further studies, with respect to both overall architecture and further evaluation of instructions to implement, would be necessary to design an ideal ASIP for the application.
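To make the instruction names in this abstract concrete, the following is a minimal C sketch of what "byte swap" and "bitwise extract/deposit" typically compute when written in software. The function names and the exact semantics (a gather/scatter of bits under a mask) are assumptions for illustration; the abstract does not define the instructions, and this is not the thesis's implementation.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit word, e.g. to convert a
   big-endian packet field on a little-endian core. */
uint32_t bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Assumed "extract" semantics: gather the bits of x selected by mask
   into the low-order bits of the result. */
uint32_t bit_extract(uint32_t x, uint32_t mask)
{
    uint32_t result = 0;
    uint32_t out_bit = 1;
    while (mask != 0) {
        uint32_t lowest = mask & (~mask + 1u); /* lowest set bit of mask */
        if (x & lowest)
            result |= out_bit;
        out_bit <<= 1;
        mask &= mask - 1u;                     /* clear that bit */
    }
    return result;
}

/* Assumed "deposit" semantics: scatter the low-order bits of x to the
   bit positions selected by mask. */
uint32_t bit_deposit(uint32_t x, uint32_t mask)
{
    uint32_t result = 0;
    uint32_t in_bit = 1;
    while (mask != 0) {
        uint32_t lowest = mask & (~mask + 1u);
        if (x & in_bit)
            result |= lowest;
        in_bit <<= 1;
        mask &= mask - 1u;
    }
    return result;
}
```

Each software loop iterates once per set mask bit, which hints at why a single-cycle instruction could help a header-parsing workload, under the assumption that such operations are frequent in the profiled code.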
15

Compiler Testing of C11 Atomics for Arm and RISC-V

Adolfsson, Hampus January 2022
The C11 standard introduced atomic types and operations, with an accompanying memory model, to enable the use of shared variables in concurrent programs. In this thesis, I demonstrate how compilers can be tested, in a way that is deterministic and covers the entire set of atomic operations, to ensure they correctly implement C11 atomics and the C11 memory model. I use a large set of short concurrent programs (“litmus tests”), generated from a model written in a specification language and based on a formalized C11 memory model. Each test program is compiled and run with a model checker, to determine the possible outcomes; any program with an outcome that is possible after compilation but not allowed by C11 is a failed test case. As an alternative to model checking, I also test a nondeterministic, hardware-based method for running tests, but I find that this method is too inaccurate to be useful. I test IAR and gcc compilers for Arm and RISC-V; all of these compilers pass all tests. Out of three compilers with purposefully inserted bugs, all are correctly identified as faulty. This testing process thus shows some promise, but further evaluation is needed.
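As an illustration of what a litmus test looks like, here is a minimal sketch of the classic message-passing shape using C11 atomics. It is a generic textbook example, not one of the generated tests from the thesis; the variable and function names are hypothetical, and the concurrent harness (a model checker or real threads) is deliberately left out.

```c
#include <stdatomic.h>

/* Shared locations of the litmus test, both initially 0. */
static atomic_int data = 0;
static atomic_int flag = 0;

/* Thread 0: write the payload, then publish it with a release store. */
void thread0_writer(void)
{
    atomic_store_explicit(&data, 42, memory_order_relaxed);
    atomic_store_explicit(&flag, 1, memory_order_release);
}

/* Thread 1: read the flag with an acquire load, then read the payload.
   The observed values are returned through r_flag and r_data. */
void thread1_reader(int *r_flag, int *r_data)
{
    *r_flag = atomic_load_explicit(&flag, memory_order_acquire);
    *r_data = atomic_load_explicit(&data, memory_order_relaxed);
}

/* C11 forbids seeing the flag set without also seeing the payload,
   because the acquire load synchronizes with the release store. */
int outcome_forbidden(int r_flag, int r_data)
{
    return r_flag == 1 && r_data != 42;
}
```

A harness would run the two thread bodies concurrently many times, or explore their interleavings exhaustively, and report a failure if `outcome_forbidden` ever holds for the compiled program.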
16

Accelerating Graphics Rendering on RISC-V GPUs

Simpson, Joshua 01 June 2022
Graphics Processing Units (GPUs) are commonly used to accelerate massively parallel workloads across a wide range of applications from machine learning to cryptocurrency mining. The original application for GPUs, however, was to accelerate graphics rendering which remains popular today through video gaming and video rendering. While GPUs began as fixed function hardware with minimal programmability, modern GPUs have adopted a design with many programmable cores and supporting fixed function hardware for rasterization, texture sampling, and render output tasks. This balance enables GPUs to be used for general purpose computing and still remain adept at graphics rendering. Previous work at the Georgia Institute of Technology has been done to implement a general purpose GPU (GPGPU) in the open source RISC-V ISA. The implementation features many programmable cores and texture sampling support. However, creating a truly modern GPU based on the RISC-V ISA requires the addition of fixed function hardware units for rasterization and render output tasks in order to meet the demands of current graphics APIs such as OpenGL or Vulkan. This thesis discusses the work done by students at the Georgia Institute of Technology and California Polytechnic State University SLO to accelerate graphics rendering on RISC-V GPUs including the specific contributions made to implement and connect fixed function graphics hardware for the render output unit (ROP) to the programmable cores in a RISC-V GPU. This thesis also explores the performance and area cost of different hardware configurations within the implemented GPU.
17

Hybrid Debugger Software on RISC-V MCU : A no cost debugging solution for educational use / Hybriddebugger för RISC-V MCU : En kostnadsfri debuglösning för utbildningssyfte

Remahl, Linus January 2022
This work details the implementation of a debugger for a small embedded RISC-V system. KTH uses an in-house designed microcontroller development board for computer and electronics design courses. The boards did not incorporate hardware debugging capabilities and no prior software implementation fulfilled the requirements for the specific target system. The debugger used a hybrid software and hardware approach to achieve basic debugging features such as breakpoints, stepping and break signals. The hybrid approach repurposed the microcontroller's debug module to enable debugging with no external hardware. The debugger implementation met all of the requirements for use in the intended educational setting, and had a limited footprint with regard to resource usage, but with room for further optimization. / Detta arbete beskriver implementationen av en debugger för ett mindre RISC-V system. KTH använder ett internt framtaget utvecklingskort med en mikrokontroller för kurser inom programmering för inbyggda system och elektronikdesign. Korten inkluderade inte stöd för hårdvarubaserad debugging och inga befintliga mjukvarulösningar mötte kraven för det specifika systemet. Debuggern använde en blandad hårdvaru- och mjukvarulösning för att uppnå debug-funktionalitet som brytpunkter, stegning och brytsignaler. Implementationen nyttjade den i mikrokontrollern inbyggda debugmodulen (debug module) för att tillgängliggöra debugging utan någon extern hårdvara. Implementationen mötte alla krav för att kunna användas i den tilltänkta studiemiljön, och hade en begränsad resursanvändning, men med rum för ytterligare optimeringar.
18

Leveraging Posits for the Conjugate Gradient Linear Solver on an Application-Level RISC-V Core

Mallasén Quintana, David January 2022
Emerging floating-point arithmetics provide a way to optimize the execution of computationally-intensive algorithms. This is the case with scientific computational kernels such as the Conjugate Gradient (CG) linear solver. Exploring new arithmetics is of paramount importance to maximize the accuracy and timing performance of these algorithms. In this thesis, I have studied the use of the novel posit arithmetic in hardware to improve the accuracy of the CG method. In particular, on PERCIVAL, an application-level RISC-V core with support for posits and quire. The open RISC-V architecture supplies a flexible platform for the exploration of new computer architecture studies. Previous works have tackled the use of posits in the high-performance computing and machine learning fields, amongst others. However, until recently, the lack of hardware support has been a significant barrier to their scalability. The key results from this thesis show that posits are a promising alternative when solving 1D and 2D Poisson equations using the CG linear solver. Notably, this novel arithmetic can execute as fast as IEEE 754 floating-point numbers on specialized hardware, and provide up to 2 orders of magnitude higher accuracy. This accuracy improvement spans both the error of the output values of the algorithms and the value of the final residual in the CG iterative method. Furthermore, the use of the quire accumulator register in the computation of dot-products in posit arithmetic significantly boosts the accuracy of the outputs. Since 32-bit posits perform practically as fast as 32-bit floats, and thus faster than 64-bit floats, they present an intermediate solution between single- and double-precision arithmetic. This paves the way for the deployment of high-efficiency solutions that make intensive use of floating-point operations. / Ny kommande flyttalsaritmetik ger ett sätt att optimera exekveringen av beräkningsintensiva algoritmer. 
Detta är fallet med vetenskapliga beräkningskärnor som den Conjugate Gradient (CG) metoden kräver. Att utforska ny aritmetik är av största vikt för att minska energikostnaderna för dessa algoritmer. I detta examensarbete har jag studerat användningen av den nya positaritmetiken i hårdvara för att förbättra noggrannheten i CG-metoden. I synnerhet på PERCIVAL, en RISC-V-kärna på applikationsnivå med stöd för posits och quire. Den öppna RISC-V-arkitekturen tillhandahåller en flexibel plattform för utforskning av nya dator arkitekturstudier. Tidigare arbeten har tagit itu med användningen av positurer inom områdena högpresterande datorer och maskininlärning, bland annat. Men fram till nyligen har bristen på hårdvarustöd varit ett betydande hinder för deras skalbarhet. Nyckelresultaten från denna avhandling visar att posits är ett lovande alternativ när man löser 1D och 2D Poisson-ekvationer med den linjära CG-lösaren. Noterbart kan denna nya aritmetik köra så snabbt som IEEE 754 flyttal på specialiserad hårdvara och ge upp till två storleksordningar högre noggrannhet. Denna noggrannhetsförbättring sträcker sig över både felet i algoritmernas utvärden och värdet på den slutliga residualen i den iterativa CG-metoden. Dessutom ökar användningen av quire-ackumulatorregistret vid beräkning av punktprodukter i positaritmetik avsevärt noggrannheten hos utsignalerna. Eftersom 32-bitars posits presterar praktiskt taget lika snabbt som 32-bitars flöten, och därmed snabbare än 64-bitars flöten, presenterar de en mellanlösning mellan enkel-och dubbelprecisionsaritmetik. Detta banar väg för utbyggnaden av högeffektiva lösningar som intensivt utnyttjar flyttalsoperationer.
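For readers unfamiliar with the method named in this abstract, the following is a minimal sketch of the Conjugate Gradient iteration for a 1D Poisson system in C. It uses ordinary IEEE 754 `float` values, not posits, and the dot product accumulates in `double` only as a loose software stand-in for a wide accumulator such as the quire. The grid size, function names, and stopping rule are all assumptions for illustration and are not taken from the thesis.

```c
#include <stddef.h>

#define N 8 /* number of interior grid points (illustrative choice) */

/* y = A x for the 1D Poisson matrix tridiag(-1, 2, -1). */
void poisson_apply(const float *x, float *y)
{
    for (size_t i = 0; i < N; i++) {
        float left  = (i > 0)     ? x[i - 1] : 0.0f;
        float right = (i + 1 < N) ? x[i + 1] : 0.0f;
        y[i] = 2.0f * x[i] - left - right;
    }
}

/* Dot product with a wider accumulator than the operands. */
double dot(const float *a, const float *b)
{
    double acc = 0.0;
    for (size_t i = 0; i < N; i++)
        acc += (double)a[i] * (double)b[i];
    return acc;
}

/* Solve A x = b starting from x = 0. Stops when the squared residual
   norm drops to tol_sq or below, or after max_iter iterations.
   Returns the number of iterations performed. */
int cg_solve(const float *b, float *x, int max_iter, double tol_sq)
{
    float r[N], p[N], Ap[N];
    for (size_t i = 0; i < N; i++) {
        x[i] = 0.0f;
        r[i] = b[i]; /* residual b - A*0 */
        p[i] = b[i]; /* first search direction */
    }
    double rs_old = dot(r, r);
    int k;
    for (k = 0; k < max_iter && rs_old > tol_sq; k++) {
        poisson_apply(p, Ap);
        float alpha = (float)(rs_old / dot(p, Ap)); /* step length */
        for (size_t i = 0; i < N; i++) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        double rs_new = dot(r, r);
        float beta = (float)(rs_new / rs_old);
        for (size_t i = 0; i < N; i++)
            p[i] = r[i] + beta * p[i]; /* next conjugate direction */
        rs_old = rs_new;
    }
    return k;
}
```

The two dot products per iteration are where accumulated rounding error enters, which is why the abstract singles out the accumulator used for them.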
19

Implementace mikroprocesoru RISC-V s rozšířením pro bitové manipulace / RISC-V microprocessor implementation with bit manipulations instruction set extension

Chovančíková, Lucie January 2020
This master's thesis deals with the design of a RISC-V processor with the bit manipulation instruction set extension. The work describes the RISC-V instruction set and the CodAL language, which is used to describe instruction sets and processor architectures. The main goal of the work is to implement a model with a 32-bit address space, the RISC-V base instruction set and the bit manipulation instruction set. The processor design consists of two models: an instruction model and an RTL model. The resulting parameters of the designed processor are measured using the Genus Synthesis Solution tool. The measurement also covers the usability of the bit manipulation instructions, based on decoder coverage.
20

Compiler-Assisted Software Fault Tolerance for Bare Metal and RTOS Applications on Embedded Platforms

James, Benjamin 13 April 2021
In the presence of ionizing particles and other high-energy atomic sources, many electronic and computer systems fail. Single event upsets (SEUs) can be mitigated through hardware and/or software methods. Previous research at BYU has introduced COAST, a compiler-based tool that can automatically add software protection schemes to improve fault coverage of programs. This thesis will expand on the work already done with the COAST project by proving its effectiveness across multiple platforms and benchmarks. The ability to automatically add fault protection to arbitrary user programs will be very valuable for many application designers. The results presented herein show that the mean work to failure (MWTF) of an application can increase by 1.2x to 36x when protected by COAST. In addition to the results based on bare metal applications, in this thesis we will show that it is both possible and profitable to protect a real-time operating system with COAST. We present experimental results which show that our protection scheme gives a 2x to 100x improvement in MWTF. We also present a fault injection framework that allows for rapid and reliable testing of multiple protection schemes across different benchmarks. The code setup used in this paper is publicly available. We make it public in the hope that it will be useful for others doing similar research to have a concrete starting point.
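One common software protection scheme of the kind this abstract refers to is triple modular redundancy (TMR): compute a value three times and take a majority vote. The sketch below shows the idea by hand in C. It is an assumption-laden illustration with hypothetical function names; it does not show COAST's interface or its generated code, which the abstract does not describe.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise majority vote: each result bit takes the value held by at
   least two of the three copies, masking an upset in any one copy. */
uint32_t tmr_vote(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (a & c) | (b & c);
}

/* Sum an array with three redundant accumulators and vote on the
   result. The accumulators are volatile so that an optimizing compiler
   does not merge the three copies back into one. */
uint32_t sum_tmr(const uint32_t *values, size_t n)
{
    volatile uint32_t s0 = 0, s1 = 0, s2 = 0;
    for (size_t i = 0; i < n; i++) {
        s0 += values[i];
        s1 += values[i];
        s2 += values[i];
    }
    return tmr_vote(s0, s1, s2);
}
```

Writing redundancy by hand like this is tedious and easy for an optimizer to undo, which is the motivation for inserting it automatically at the compiler level.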
