1 |
Um estudo do uso eficiente de programas em placas gráficas / A case study on the efficient use of programs on GPUs
Ikeda, Patricia Akemi 20 September 2011
Initially designed for graphics processing, graphics cards (GPUs) have evolved into high-performance, general-purpose parallel coprocessors. Given the enormous potential they offer to many research and commercial fields, NVIDIA stands out as a pioneer for launching the CUDA architecture (compatible with several of its cards), an environment that harnesses this computational power while making programming easier. To exploit the full capacity of the GPU, certain practices must be followed. One of them is to keep the hardware as busy as possible. This work proposes a practical and extensible tool that helps the programmer choose the best configuration to achieve this goal.
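The practice named in this abstract, keeping the hardware as busy as possible, is commonly quantified as occupancy: the fraction of a multiprocessor's warp slots that stay resident for a given launch configuration. The sketch below is not the tool proposed in the thesis. It is a minimal illustration in which the per-multiprocessor limits in `SMLimits`, the function names, and the candidate block sizes are all assumptions chosen for the example, and register and shared-memory allocation granularity is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SMLimits:
    """Per-multiprocessor limits. The defaults are illustrative assumptions,
    not values taken from the thesis; check the target GPU's documentation."""
    max_threads: int = 1536
    max_blocks: int = 8
    max_registers: int = 32768
    max_shared_bytes: int = 49152
    warp_size: int = 32


def occupancy(threads_per_block, regs_per_thread, shared_per_block, sm=SMLimits()):
    """Return resident warps divided by the maximum warps on one multiprocessor."""
    warps_per_block = -(-threads_per_block // sm.warp_size)  # ceiling division
    max_warps = sm.max_threads // sm.warp_size

    # How many blocks fit, taking each resource on its own.
    by_warps = max_warps // warps_per_block
    regs_per_block = regs_per_thread * warps_per_block * sm.warp_size
    by_regs = sm.max_registers // regs_per_block if regs_per_block else sm.max_blocks
    by_shared = sm.max_shared_bytes // shared_per_block if shared_per_block else sm.max_blocks

    # The scarcest resource decides how many blocks are resident at once.
    blocks = min(sm.max_blocks, by_warps, by_regs, by_shared)
    return blocks * warps_per_block / max_warps


def best_block_size(regs_per_thread, shared_per_block,
                    candidates=(64, 128, 192, 256, 384, 512, 768, 1024),
                    sm=SMLimits()):
    """Pick the first candidate block size with the highest estimated occupancy."""
    return max(candidates,
               key=lambda t: occupancy(t, regs_per_thread, shared_per_block, sm))


if __name__ == "__main__":
    for threads in (64, 128, 256):
        print(threads, round(occupancy(threads, 20, 0), 3))
    print("best:", best_block_size(20, 0))
```

High occupancy is a heuristic rather than a guarantee of speed, so an estimate like this is best treated as a way to narrow down the configurations worth timing on real hardware.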
|
2 |
Um estudo do uso eficiente de programas em placas gráficas / A case study on the efficient use of programs on GPUs
Patricia Akemi Ikeda 20 September 2011
Initially designed for graphics processing, graphics cards (GPUs) have evolved into high-performance, general-purpose parallel coprocessors. Given the enormous potential they offer to many research and commercial fields, NVIDIA stands out as a pioneer for launching the CUDA architecture (compatible with several of its cards), an environment that harnesses this computational power while making programming easier. To exploit the full capacity of the GPU, certain practices must be followed. One of them is to keep the hardware as busy as possible. This work proposes a practical and extensible tool that helps the programmer choose the best configuration to achieve this goal.
|
3 |
CUDA-Accelerated ORB-SLAM for UAVs
Bourque, Donald 01 June 2017
The use of cameras and computer vision algorithms to provide state estimation for robotic systems has become increasingly popular, particularly for small mobile robots and unmanned aerial vehicles (UAVs). These algorithms extract information from the camera images and perform simultaneous localization and mapping (SLAM) to provide state estimation for path planning, obstacle avoidance, or 3D reconstruction of the environment. High-resolution cameras have become inexpensive and are a lighter and smaller alternative to laser scanners. UAVs often have monocular or stereo camera setups, since payload and size impose the greatest restrictions on their flight time and maneuverability. This thesis explores ORB-SLAM, a popular Visual SLAM method that is appropriate for UAVs. Visual SLAM is computationally expensive and is normally offloaded to computers in research environments. However, large UAVs with greater payload capacity may carry the hardware needed to run the algorithms onboard. The inclusion of general-purpose GPUs on many of the newer single-board computers opens the possibility of GPU-accelerated computation within a small board profile. For this reason, an NVIDIA Jetson board containing an NVIDIA Pascal GPU was used. CUDA, NVIDIA's parallel computing platform, was used to accelerate monocular ORB-SLAM, achieving onboard Visual SLAM on a small UAV.
|
4 |
Architectural Analysis and Performance Characterization of NVIDIA GPUs using Microbenchmarking
Subramoniapillai Ajeetha, Saktheesh 29 August 2012
No description available.
|
5 |
Hluboké neuronové sítě pro prostředí superpočítače / Deep neural network for supercomputer environments
Bronda, Samuel January 2019
The main contribution of this work is the optimization of hardware configurations for neural network computation. The theoretical part describes neural networks, deep learning frameworks, and hardware options. The next part of the thesis deals with the implementation of performance tests, which apply the Inception V3 and ResNet models. The network models are run on various graphics cards and other computing hardware. The output of the thesis is an implemented Inception V3 model, used to examine graphics cards in terms of performance, computation time, and efficiency. The ResNet model is used in a section that examines other influences on neural network computation, such as the disk used and the main memory. Each practical part contains a discussion in which the findings of that part are explained. In the consumption measurements, a mismatch between the values declared by the manufacturer and the measured values was identified.
|
6 |
Context-aware automated refactoring for unified memory allocation in NVIDIA CUDA programs
Nejadfard, Kian 25 June 2021
No description available.
|
7 |
Hardware Implementation of Learning-Based Camera ISP for Low-Light Applications
Preston Rashad Rahim (17676693) 20 December 2023
A camera's image signal processor (ISP) is responsible for taking the mosaiced and noisy image signal from the image sensor and processing it in such a way that the resulting image is informative and accurately captures the scene. Real-time video capture in photon-limited environments remains a challenge for many ISPs today. In these conditions, the image signal is dominated by photon shot noise. Deep learning methods show promise in extracting the underlying image signal from the noise, but modern AI-based ISPs are too computationally complex to be realized as a fast and efficient hardware ISP. An ISP algorithm, BLADE2, has been designed that leverages AI in a computationally conservative manner to demosaic and denoise low-light images. The original implementation of this algorithm is in Python/PyTorch. This thesis explores implementing BLADE2 on a general-purpose GPU via a suite of NVIDIA optimization toolkits, as well as a low-level implementation in C/C++, bringing the algorithm closer to FPGA realization. The GPU implementation demonstrated significant throughput gains, and the C/C++ implementation demonstrated the feasibility of further hardware development.
|
8 |
Development and Systems Integration of Small Hydrofoiling Robot for Mapping and Sensing / Utveckling och systemintegration av liten bärplansrobot för kartläggning och avkänning
Lopperi, Tommy, Söderberg, Henrik January 2022
Unmanned surface vehicles (USVs) are vehicles with various levels of autonomy that can be built for a wide variety of purposes, such as ferrying and surveying. USVs have technically been around for about 80 years; however, only in fairly recent years have developments in the miniaturization of components and computers allowed for the construction of small USVs. The primary benefit of USVs is that they can perform otherwise costly and tedious tasks originally done by manned vehicles. They can also run on electric batteries, thus limiting their effect on the environment compared to the fossil fuels used in traditional vehicles. In this project, performed at the Swedish Maritime Robotics Center at KTH Stockholm, a small USV meant to perform depth measurements of waterways was developed. It can be steered via remote control and has the hardware required to navigate autonomously. This report goes through the steps the project group took to develop the USV. The project included studying previous work, selecting and ordering components, creating a schematic, developing the software, and testing. Eleven components were installed, while several planned ones were not included due to time constraints. Testing of the remote control and GNSS logging was successful.
|
9 |
COMPARISON OF THE PERFORMANCE OF NVIDIA ACCELERATORS WITH SIMD AND ASSOCIATIVE PROCESSORS ON REAL-TIME APPLICATIONS
Shaker, Alfred M. 27 July 2017
No description available.
|
10 |
Implementing method of moments on a GPGPU using Nvidia CUDA
Virk, Bikram 12 April 2010
This thesis concentrates on the algorithmic aspects of the Method of Moments (MoM) and Locally Corrected Nyström (LCN) numerical methods in electromagnetics. The data dependency in each step of the algorithm is analyzed in order to implement a parallel version that can harness the processing power of a General Purpose Graphics Processing Unit (GPGPU). The GPGPU programming model provided by NVIDIA's Compute Unified Device Architecture (CUDA) is described, together with the software tools that make it possible to implement C code on the GPGPU. Various optimizations, such as the partial update at every iteration, inter-block synchronization, and the use of shared memory, yield an overall speedup of approximately 10x. The study also brings out the strengths and weaknesses of implementing different methods, such as Crout's LU decomposition and triangular matrix inversion, on a GPGPU architecture. The results suggest future directions of study into different algorithms and their effectiveness in a parallel processor environment. The performance data collected show how different features of the GPGPU architecture can be enhanced to yield higher speedup.
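Crout's LU decomposition, one of the methods this abstract names, factors a square matrix as A = LU with a unit diagonal on U. The sketch below is a plain sequential version written for illustration, not the thesis's GPGPU implementation; the function name `crout_lu` is an assumption, and no pivoting is performed. It makes the data dependency visible: column j of L needs only the earlier columns, while the entries within one column (or one row of U) are independent of each other, which is the kind of work a parallel version could distribute.

```python
def crout_lu(a):
    """Factor a square matrix as A = L U, with ones on the diagonal of U.

    Sequential sketch without pivoting; raises if a zero pivot appears.
    """
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    upper = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    for j in range(n):
        # Column j of L: every entry depends only on columns 0..j-1,
        # so the entries of this column are independent of one another.
        for i in range(j, n):
            lower[i][j] = a[i][j] - sum(lower[i][k] * upper[k][j] for k in range(j))

        if lower[j][j] == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch does no pivoting")

        # Row j of U: needs the pivot lower[j][j] computed just above.
        for i in range(j + 1, n):
            partial = sum(lower[j][k] * upper[k][i] for k in range(j))
            upper[j][i] = (a[j][i] - partial) / lower[j][j]

    return lower, upper


if __name__ == "__main__":
    l, u = crout_lu([[4.0, 3.0], [6.0, 3.0]])
    print(l)
    print(u)
```

The outer loop over j is inherently sequential, so any speedup has to come from the inner loops, which shrink as j grows.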
|