Global ETD Search

1	An Interconnection Network for a Cache Coherent System on FPGAs Mirian, Vincent 12 January 2011 (has links) Field-Programmable Gate Arrays (FPGAs) systems now comprise many processing elements that are processors running software and hardware engines used to accelerate specific functions. To make the programming of such a system simpler, it is easiest to think of a shared-memory environment, much like in current multi-core processor systems. This thesis introduces a novel, shared-memory, cache-coherent infrastructure for heterogeneous systems implemented on FPGAs that can then form the basis of a shared-memory programming model for heterogeneous systems. With simulation results, it is shown that the cache-coherent infrastructure outperforms the infrastructure of Woods [1] with a speedup of 1.10. The thesis explores the various configurations of the cache interconnection network and the benefit of the cache-to-cache cache line data transfer with its impact on main memory access. Finally, the thesis shows the cache-coherent infrastructure has very little overhead when using its cache coherence implementation. FPGA Cache Coherence Interconnection Network Protocols Multiprocessor muti-thread Pthread (Posix Thread) programming model parrallel programming shared-memory model 0544
2	An Interconnection Network for a Cache Coherent System on FPGAs Mirian, Vincent 12 January 2011 (has links) Field-Programmable Gate Arrays (FPGAs) systems now comprise many processing elements that are processors running software and hardware engines used to accelerate specific functions. To make the programming of such a system simpler, it is easiest to think of a shared-memory environment, much like in current multi-core processor systems. This thesis introduces a novel, shared-memory, cache-coherent infrastructure for heterogeneous systems implemented on FPGAs that can then form the basis of a shared-memory programming model for heterogeneous systems. With simulation results, it is shown that the cache-coherent infrastructure outperforms the infrastructure of Woods [1] with a speedup of 1.10. The thesis explores the various configurations of the cache interconnection network and the benefit of the cache-to-cache cache line data transfer with its impact on main memory access. Finally, the thesis shows the cache-coherent infrastructure has very little overhead when using its cache coherence implementation. FPGA Cache Coherence Interconnection Network Protocols Multiprocessor muti-thread Pthread (Posix Thread) programming model parrallel programming shared-memory model 0544
3	Parallellisering av Sliding Extensive Cancellation Algorithm (ECA-S) för passiv radar med OpenMP / Parallelization of Sliding Extensive Cancellation Algorithm (ECA-S) for Passive Radar with OpenMP Johansson Hultberg, Andreas January 2021 (has links) Software parallelization has gained increasing interest since the transistor manufacturing of smaller chips within an integrated circuit has begun to stagnate. This has led to the development of new processing units with an increasing number of cores. Parallelization is an optimization technique that allows the user to utilize parallel processes in order to streamline algorithm flows. This study examines the performance benefits that a passive bistatic radar system can obtain by parallelization and code refactorization. The study focuses mainly on investigating the use of parallel instructions within a shared memory model on a Central Processing Unit (CPU) with the use of an application programming interface, namely OpenMP. Quantitative data is collected to compare the runtime of the most central algorithm in the passive radar system, namely the Extensive Cancellation Algorithm (ECA). ECA can be used to suppress unwanted clutter in the surveillance signal, which purpose is to create clear target detections of airborne objects. The algorithm on the other hand is computationally demanding, which has led to the development of faster versions such as the Sliding ECA (ECA-S). Despite the ongoing development, the algorithm is still relatively computationally demanding which can lead to long execution times within the radar system. In this study, a MATLAB implementation of ECA-S is transformed to C in order to take advantage of the fast execution time of the procedural programming language. Parallelism is introduced within the converted algorithm by the use of Intel's thread methodology and then applied within two different operating systems. The study shows that a speedup can be obtained, in the programming language C, by a factor of 24 while still ensuring the correctness of the results. The results also showed that code refactorization of a MATLAB algorithm could result in 73% faster code and that C-MEX implementations are twice as slow as a C-implementation. Finally, the study pointed out that real-time can be achieved for a passive bistatic radar system with the use of the programming language C and by using parallel instructions within a shared memory model on a CPU. / Parallellisering av mjukvara har fått ett ökat intresse sedan transistortillverkningen av mindre chip inom en integrerade krets har börjat att stagnera. Detta har lett till utveckling av moderna processorer med ett ökande antal av kärnor. Parallellisering är en optimeringsteknik vilken tillåter användaren att utnyttja parallella processer till att effektivisera algoritmflöden. Denna studie undersöker de tidsmässiga fördelar ett passivt bistatiskt radarsystem kan erhålla genom att, bland annat tillämpa parallellisering och omformning. Studien fokuserar främst på att undersöka användandet av parallella trådar inom det delade minnesutrymmet på en centralprocessor (CPU), detta med hjälp av applikationsprogrammeringsgränssnittet OpenMP. Kvantitativa jämförelser tas fram med hjälp av en av de mest centrala algoritmerna inom det passiva radarsystemet, nämligen Extensive Cancellation Algorithm (ECA). ECA kan används till att undertrycka oönskat klotter i övervakningssignalen, vilket har till syfte att skapa klara måldetektioner av luftföremål. Algoritmen är däremot beräkningstung, vilket har medfört utveckling av snabbare versioner som exempelvis Sliding ECA (ECA-S). Trots utvecklingen är algoritmen fortfarande relativt beräkningstung och kan medföra en lång exekeveringstid inom hela radarsystemet. I denna studie transformeras en MATLAB-implementation av ECA-S till C för att kunna dra nytta av den snabba exekeveringstiden i det procedurella programmeringsspråket. Parallellism införs inom den transformerade algoritmen med hjälp av Intels trådmetodik och appliceras sedan inom två olika operativsystem. Studien visar på en tidsmässig optimering i C med upp till 24 gånger snabbare exekeveringstid och bibehållen noggrannhet. Resultaten visade även på att en enklare omformning av en MATLAB-algoritm kunde resultera till 73% snabbare kod och att en C-MEX-implementation är dubbelt så långsam i jämförelse med en C-implementering. Slutligen pekade studien på att realtid kan uppnås för ett passivt bistatiskt radarsystem vid användandet av programmeringsspråket C och med utnyttjandet av parallella instruktioner inom det delade minnet på en CPU. Extensive Cancellation Algorithm (ECA) ECA-S Passive Radar Optimization OpenMP Parallel Processing Shared Memory Model Central Processing Unit (CPU) Signal Processing Signalbehandling Computer Engineering Datorteknik Software Engineering Programvaruteknik

1

Page generated in 0.0549 seconds