Global ETD Search

51	Filtrace paketů ve 100 Gb sítích / Packet Filtration in 100 Gb Networks Kučera, Jan January 2016 This master's thesis deals with the design and implementation of an algorithm for high-speed network packet filtering. The main goal was to provide a hardware architecture that supports large rule sets and can be used in 100 Gbps networks. The system was designed with respect to implementation on an FPGA card and the time-space complexity trade-off. Properties of the system were evaluated using various available rule sets. Thanks to the highly optimized and deeply pipelined architecture, it was possible to reach a high working frequency (above 220 MHz) together with a considerable memory reduction (on average about 72% for the compared algorithms). It is also possible to efficiently store up to five thousand filtering rules on an FPGA with only 8% on-chip memory utilization. The architecture allows network packet filtering at a wire speed of 100 Gbps.
52	FPGA-based Speed Limit Sign Detection Tallawi, Reham 19 July 2017 This thesis presents a new hardware-accelerated approach that uses image processing and detection algorithms to implement a fast and robust traffic sign detection system, with a focus on speed limit sign detection. The proposed system targets reconfigurable integrated circuits, particularly Field Programmable Gate Array (FPGA) devices. This work proposes a fully parallelized and pipelined system architecture to exploit the high performance and flexibility of FPGA devices. The thesis is divided into two phases. The first phase is a software prototype implementation of the proposed system. The software system was designed and developed using C++ and the OpenCV library on a general-purpose CPU. The prototype is used to explore and investigate potential segmentation and detection algorithms that might be feasible to design and implement in hardware-accelerated environments. These algorithms include RGB colour conversion, colour segmentation through thresholding, noise reduction through a median filter, morphological operations through erosion and dilation, and sign detection through template matching. In the second phase, a hardware-based design of the system was developed using the same algorithms as the software design. The hardware design is composed of 20 image processing components each designed to xxx fully parallelized and pipelined xxx. The hardware implementation was developed using VHDL as the hardware description language, targeting a Xilinx Virtex-6 FPGA XC6VLX240T device. The development environment is Xilinx ISE® Design Suite version 14.3. A set of 20 640x480 test images was used as the test data for the verification and testing of this work. The images were captured by a smartphone camera in various weather and lighting conditions. The software implementation delivered speed limit detection results with a success rate of 75%.
The hardware implementation was only simulated, using Xilinx ISE Simulator (ISim), with an overall system latency of 12964 clock cycles. According to the Place and Route report, the maximum operating frequency of the proposed hardware design is 71.2 MHz. The design utilized only 2% of the slice registers, 4% of the slice look-up tables (LUTs), and 11% of the block memory. The thesis draws its conclusions from the provided software and hardware implementations and the performance analysis results. The conclusions chapter also provides recommendations and future work for possible extension of the project. VHDL, FPGA, traffic sign recognition
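As an illustration of two of the stages named in the abstract above (colour segmentation through thresholding, then erosion and dilation), here is a minimal pure-Python sketch. It is not taken from the thesis: the choice of red as the target colour, the threshold values, the 3x3 neighbourhood, and the function names `threshold_red`, `erode` and `dilate` are all assumptions made for the example.

```python
def threshold_red(image, min_red=150, max_other=100):
    """Binary mask: 1 where a pixel is predominantly red (assumed thresholds)."""
    return [[1 if (r >= min_red and g <= max_other and b <= max_other) else 0
             for (r, g, b) in row] for row in image]


def _morph(mask, combine):
    """Apply a 3x3 morphological operation; pixels outside the image count as 0."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [mask[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
                      for yy in range(y - 1, y + 2)
                      for xx in range(x - 1, x + 2)]
            out[y][x] = combine(window)
    return out


def erode(mask):
    """A pixel survives only if its whole 3x3 neighbourhood is set."""
    return _morph(mask, min)


def dilate(mask):
    """A pixel is set if any pixel in its 3x3 neighbourhood is set."""
    return _morph(mask, max)
```

Eroding and then dilating the mask (a morphological opening) removes isolated noise pixels while restoring the shape of larger regions, which is why the two operations are usually applied as a pair.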
53	Zobrazení bodů na přímky a jiné parametrizace přímek nejen pro Houghovu transformaci / Point to Line Mappings and Other Line Parameterizations not only for Hough Transform Havel, Jiří January 2012 This thesis deals with the Hough transform (HT). The HT is most commonly used to detect lines or curves, but it has also been generalized to detect arbitrary shapes. The main topic of the thesis is line parameterizations, in particular PTLM (point-to-line mappings). These parameterizations have the property that points in the image correspond to lines in the parameter space. The thesis provides proofs of some properties of PTLMs. Worth mentioning are the existence of a pair of PTLMs suitable for detection and the effect of a convolution in the image on the content of the parameter space. Two implementations of the HT are presented. Both use graphics hardware for acceleration. One uses the GPGPU API CUDA and the other the rendering API OpenGL. As an application of line detection, part of the detection of checkerboard markers usable for augmented reality is presented.
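To make the point-to-line property concrete, the sketch below uses the classic slope-intercept parameterization, which is one simple PTLM and not necessarily one of those studied in the thesis. An image point (x, y) maps to the parameter-space line b = -x*m + y, and the lines of collinear image points all meet at the (m, b) of the image line through them. The function names are hypothetical.

```python
from fractions import Fraction


def point_to_line(x, y):
    """Map image point (x, y) to the parameter-space line b = -x*m + y.

    The line is returned as (slope, intercept) in the (m, b) plane.
    """
    return (Fraction(-x), Fraction(y))


def intersect(line1, line2):
    """Intersection (m, b) of two parameter-space lines, or None if parallel."""
    s1, i1 = line1
    s2, i2 = line2
    if s1 == s2:
        return None  # same image x: a vertical image line, not representable
    m = (i2 - i1) / (s1 - s2)
    b = s1 * m + i1
    return (m, b)


# Three image points on y = 2x + 1 map to lines that meet at (m, b) = (2, 1).
p, q, r = point_to_line(0, 1), point_to_line(1, 3), point_to_line(2, 5)
print(intersect(p, q), intersect(q, r))
```

The `None` case shows the well-known limitation of this particular parameterization: vertical image lines have no finite slope, so a single slope-intercept mapping cannot cover all lines.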
54	Softwarově řízené monitorování síťového provozu / Software-Controlled Network Traffic Monitoring Kekely, Lukáš January 2017 This dissertation deals with the design of a new method of software-controlled (software-defined) hardware acceleration for modern high-speed computer networks. The main goal of the work is to formulate a general, flexible and easy-to-use acceleration concept that is applicable to various security and monitoring applications and enables their real deployment in networks of 100 Gb/s and faster. The dissertation begins with an analysis of the current state of the art in network monitoring, security, and methods for accelerating the processing of high-speed network data. Based on this analysis, an entirely new concept called Software Defined Monitoring (SDM) is formulated and designed. The key functionality of the concept is built on hardware-accelerated, application-specific (application-controlled), flow-based, informed reduction and distribution of captured network data. This is achieved by combining high-speed hardware processing with flexible software control, which together enable the easy creation of various complex and high-performance network applications. Advanced optimizations and improvements of the basic SDM concept and its selected components are also examined, leading to the design of a unique and generally applicable FPGA architecture of a modular packet header parser and of a high-performance packet classifier based on cuckoo hashing. Finally, a high-speed SDM prototype built on an FPGA acceleration network card is created and thoroughly verified under conditions of deployment in real networks. The achievable performance improvements are measured and discussed for several selected monitoring and security use cases. The SDM prototype is also deployed in production monitoring of the real backbone network of the Cesnet association and has been commercialized by Netcope Technologies.
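The abstract above mentions a packet classifier based on cuckoo hashing. The following is a minimal software sketch of the general technique only, not of the FPGA design described in the dissertation. Keys are assumed to be non-negative integer flow identifiers, and the two hash functions, the table size, and the class name `CuckooTable` are toy assumptions chosen so the behaviour is easy to follow.

```python
class CuckooTable:
    """Two-table cuckoo hash: each key may live in exactly one of two slots."""

    def __init__(self, size=8, max_kicks=16):
        self.size = size
        self.max_kicks = max_kicks
        self.tables = [[None] * size, [None] * size]

    def _slot(self, which, key):
        # Toy hash functions (assumption): low digits for table 0,
        # next digits for table 1. Real designs use stronger hashes.
        if which == 0:
            return key % self.size
        return (key // self.size) % self.size

    def lookup(self, key):
        """At most two probes, one per table."""
        for which in (0, 1):
            entry = self.tables[which][self._slot(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def insert(self, key, value):
        """Insert or update; returns False if the kick limit is reached."""
        for which in (0, 1):
            idx = self._slot(which, key)
            entry = self.tables[which][idx]
            if entry is not None and entry[0] == key:
                self.tables[which][idx] = (key, value)
                return True
        item, which = (key, value), 0
        for _ in range(self.max_kicks):
            idx = self._slot(which, item[0])
            # Place the item and pick up whatever occupied the slot.
            item, self.tables[which][idx] = self.tables[which][idx], item
            if item is None:
                return True
            which = 1 - which  # the evicted item goes to its other table
        # A full implementation would rehash here; in this sketch the item
        # still in hand is not stored.
        return False
```

The appeal for hardware is the bounded lookup cost: a key is found with at most two memory reads, regardless of how full the tables are, while the more expensive insertions can be handled off the fast path.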
55	Building High-performing Web Rendering of Large Data Sets Burwall, William January 2023 Interactive visualization is an essential tool for data analysis. Cloud-based data analysis software must handle growing data sets without relying on powerful end-user hardware. This thesis explores and tests various methods to speed up rendering, primarily of time series plots of large data sets, on the web for the biotechnology research company Sartorius. To increase rendering speed, I focused on two main approaches: downsampling and hardware acceleration. To find which sampling algorithms suit Sartorius's needs, I implemented multiple alternatives and compared them quantitatively and qualitatively. The results show that downsampling increases or eliminates data set size limits and that test users favored algorithms that maintain local outliers. Together with hardware acceleration, which substantially increased the number of simultaneously rendered points for more detailed representations, these methods pave the way for efficient visualization of large data sets on the web. Plot Trajectory plot Downsampling Data set Interactive visualization Hardware acceleration Web rendering WebGL D3FC Apache Arrow Other Computer and Information Science Computer Sciences
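One simple downsampling scheme that maintains local outliers is min-max bucketing, sketched below. The abstract does not say which algorithms the thesis compared, so this is an assumed example of the general idea, and the function name `minmax_downsample` and the (time, value) tuple layout are hypothetical.

```python
def minmax_downsample(points, bucket_size):
    """Keep the minimum and maximum sample of each bucket, in time order.

    points: list of (time, value) tuples sorted by time.
    """
    if bucket_size < 1:
        raise ValueError("bucket_size must be at least 1")
    sampled = []
    for start in range(0, len(points), bucket_size):
        bucket = points[start:start + bucket_size]
        lo = min(bucket, key=lambda p: p[1])
        hi = max(bucket, key=lambda p: p[1])
        # Emit the extremes in their original order, without duplicates.
        sampled.extend(sorted({lo, hi}))
    return sampled


# A flat series with one spike at t = 5: the spike survives downsampling.
series = [(t, 9.0 if t == 5 else 0.0) for t in range(8)]
print(minmax_downsample(series, bucket_size=4))
# prints [(0, 0.0), (4, 0.0), (5, 9.0)]
```

Taking every n-th point instead would drop the spike unless it happened to fall on a sampled index, which is the kind of difference a qualitative comparison with test users would expose.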
56	Low-power high-resolution image detection Merchant, Caleb 09 August 2019 Many image processing algorithms exist that can accurately detect humans and other objects such as vehicles and animals. Many of these algorithms require large amounts of processing, often requiring hardware acceleration with powerful central processing units (CPUs), graphics processing units (GPUs), field programmable gate arrays (FPGAs), etc. Implementing an algorithm that can detect objects such as humans at longer ranges makes these hardware requirements even more strenuous, as the number of pixels necessary to detect objects at both close and long ranges is greatly increased. Comparing the performance of different low-power implementations can be used to determine a trade-off between performance and power. An image differencing algorithm is proposed, along with selected low-power hardware, that is capable of detecting humans at ranges of 500 m. Multiple versions of the detection algorithm are implemented on the selected hardware and compared for run-time performance on a low-power system. object detection image detection low-power long-range high-resolution image differencing frame differencing morphology multi-threading hardware acceleration ARM Vivante NXP GPU CPU
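The core of image (frame) differencing can be shown in a few lines. This is a generic sketch of the technique, not the algorithm proposed in the thesis: frames are assumed to be equally sized grayscale images stored as lists of rows, and the threshold value and function names are assumptions.

```python
def frame_difference(previous, current, threshold=25):
    """Binary motion mask: 1 where brightness changed by more than threshold."""
    return [[1 if abs(c - p) > threshold else 0
             for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]


def bounding_box(mask):
    """Smallest (top, left, bottom, right) box around set pixels, or None."""
    coords = [(y, x) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return (min(ys), min(xs), max(ys), max(xs))
```

Each output pixel depends only on the same pixel in two frames, so the work splits cleanly across CPU threads or GPU work items, which fits the multi-threading and GPU keywords listed for this entry.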
57	Design and Implementation of the Heterogeneous Computing Device Management Architecture Schultek, Brian Robert January 2014 No description available. Electrical Engineering Computer Engineering Heterogeneous Computing Hardware Acceleration Algorithm Acceleration PCIe Device Management High Throughput Applications
58	Binary Recurrent Unit: Using FPGA Hardware to Accelerate Inference in Long Short-Term Memory Neural Networks Mealey, Thomas C. 31 May 2018 No description available. Engineering Computer Engineering Electrical Engineering Computer Science FPGA inference hardware acceleration digital hardware neural networks LSTM long short-term memory binarization model compression binary weights
59	Acceleration of the Weather Research & Forecasting (WRF) Model using OpenACC and Case Study of the August 2012 Great Arctic Cyclone Haines, Wesley Adam 04 September 2013 No description available. Atmospheric Sciences Computer Science Climate Change
60	Hardware Acceleration in the Context of Motion Control for Autonomous Systems / Hårdvaruacceleration i samband med rörelsekontroll för autonoma system Leslin, Jelin January 2020 State estimation filters are computationally intensive blocks used in autonomous systems to calculate uncertain or unknown state values from noisy or unavailable sensor inputs. The inputs to the actuators depend on these filters' outputs, so the filters have to be scheduled at very small time intervals. The aim of this thesis is to investigate the possibility of using hardware accelerators to perform this computation. For a comparative study, three filters that predict 4, 8 and 16 states were developed and implemented on Arm real-time and application-purpose CPUs, NVIDIA Quadro and Turing GPUs, and Xilinx FPGA programmable logic. The execution time, the memory transfer time, and the total development time needed to realise the logic on the CPU, GPU and FPGA are discussed. The CUDA development environment was used for the GPU implementation, and Vivado HLS with the SDSoC environment was used for the FPGA implementation. The thesis concludes that a hardware accelerator is needed if the filter estimates 16 or more states, even if the processor is entirely dedicated to computing the filter logic. For 4- and 8-state filters, the processor shows performance similar to that of an accelerator. However, in a real-time environment the processor is the brain of the system, so it has to issue instructions to many other functions in parallel. In such an environment, the instruction and data caches of the processor will be disturbed and the execution time of the filter will fluctuate from iteration to iteration. For this reason, the best- and worst-case processor timings are calculated and discussed. Hardware acceleration Computation offloading State estimation filter Autonomous systems FPGA GPU. Electrical Engineering and Electronics
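The abstract does not name the type of state estimation filter used. Purely as an assumed illustration of what such a filter computes on every iteration, the sketch below is a one-state Kalman filter tracking a constant value from noisy measurements. The noise variances, initial values, and the function name `kalman_1d` are hypothetical.

```python
def kalman_1d(measurements, process_var=1e-4, meas_var=0.25, x0=0.0, p0=1.0):
    """One-state Kalman filter for a (nearly) constant quantity.

    Returns the state estimate after each measurement.
    """
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in measurements:
        p = p + process_var      # predict: state unchanged, uncertainty grows
        k = p / (p + meas_var)   # Kalman gain: how much to trust the measurement
        x = x + k * (z - x)      # update the estimate with the measurement
        p = (1.0 - k) * p        # update the uncertainty
        estimates.append(x)
    return estimates
```

With n states the scalars above become n-by-n matrices, and the gain computation involves matrix products and an inversion, so the per-iteration cost grows quickly with the number of states. That growth is consistent with the abstract's finding that acceleration starts to pay off at the largest filter size.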
