1 |
Volume rendering with Marching cubes and async compute
Tlatlik, Max Lukas January 2019
With the addition of the compute shader stage for GPGPU hardware, it has become possible to run CPU-like programs on modern GPU hardware. The greatest benefit is seen for highly parallel algorithms, and in volume rendering the Marching cubes algorithm is a strong candidate because of its simplicity and parallel nature. For this thesis, the Marching cubes algorithm was implemented in a compute shader and used in a DirectX 12 framework to determine whether GPU frame time can be improved by executing the compute command queue in parallel with the graphics command queue. Performance benchmarks show a gain for every benchmarked configuration, with the largest gains, up to 52%, seen for smaller workloads. These results could be useful to game developers who want to improve frame rates or decrease development time, and also in other fields such as volume rendering of medical images.
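The per-cell independence that makes Marching cubes a good fit for a compute shader can be illustrated with its first step, classifying each cell against an iso value. The following is a minimal CPU-side sketch in Python, not the thesis's shader code. The corner ordering in `CORNER_OFFSETS`, the "below the iso value means inside" convention, and the function names are assumptions made for illustration.

```python
# Hypothetical corner ordering for one cell; real implementations must use
# the ordering that matches their edge and triangle lookup tables.
CORNER_OFFSETS = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
    (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1),
]


def cube_index(corner_values, iso_level):
    """Build the 8-bit case index: bit i is set when corner i is below iso_level."""
    if len(corner_values) != 8:
        raise ValueError("a cell has exactly eight corners")
    index = 0
    for bit, value in enumerate(corner_values):
        if value < iso_level:  # assumed convention: below the iso value is inside
            index |= 1 << bit
    return index


def classify_cells(volume, iso_level):
    """Return {(x, y, z): case index} for cells the iso surface passes through.

    volume is a nested list indexed as volume[z][y][x]. Each cell is handled
    independently of all others, which is why one GPU thread per cell works.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    crossed = {}
    for z in range(nz - 1):
        for y in range(ny - 1):
            for x in range(nx - 1):
                corners = [
                    volume[z + dz][y + dy][x + dx]
                    for dx, dy, dz in CORNER_OFFSETS
                ]
                index = cube_index(corners, iso_level)
                if index not in (0, 255):  # 0 and 255 mean no surface in the cell
                    crossed[(x, y, z)] = index
    return crossed
```

In a full implementation the case index would then select the edges to interpolate and the triangles to emit from lookup tables, which are omitted here.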
|
2 |
Particle Simulation using Asynchronous Compute : A Study of The Hardware
Enarsson, Kim January 2020
Background. With the introduction of the compute shader, followed by the application programming interface (API) DirectX 12, the modern GPU is going through a transformation. Previously, the GPU was used as a massive computational tool for running a single task at unparalleled speed. The compute shader made it possible to run CPU-like programs on the GPU, and DirectX 12 takes this further by introducing a multi-engine architecture. The multi-engine architecture makes it possible to run the compute shader alongside the regular graphics stages; this concept is called asynchronous compute. Objectives. This thesis investigates whether asynchronous compute can be used to increase the performance of particle simulations. The key metrics studied are total frame time, rendered frames per second, and overlap time. The first two are used to determine whether asynchronous compute improves performance, while the last is used to determine whether the particle simulation is actually running with asynchronous compute. Methods. The particle simulation used in this thesis is the N-body particle simulation. It is implemented using a compute shader and is part of a larger DirectX 12 framework. A single application is implemented that runs two different execution models: the standard sequential execution model and the asynchronous compute model. The main difference between them is that the sequential execution model uses only one command queue, a 3D command queue, whereas the asynchronous compute model runs a separate compute command queue alongside the 3D command queue. All performance metrics are collected using a custom-built GPU profiler. Results. The results indicate that it is possible to increase the performance of particle simulations using asynchronous compute.
The registered performance gain reaches as high as 34% on hardware that supports asynchronous compute, while hardware that, according to NVIDIA, does not support asynchronous compute registered performance gains of up to 11%. In terms of overlap time between the compute workload and the graphics workload, the AMD GPU showed an overlap time that matched the frame time, whereas the NVIDIA GPUs did not show the expected overlap time. Conclusions. Asynchronous compute provides benefits compared with the sequential execution model and can be used to increase the performance of particle simulations. However, since this thesis used only a single particle simulation, more work is needed, for example to test whether the performance gain can be improved further with methods such as workload pairing or the use of multiple GPUs. That kind of work requires a larger-scale application consisting of multiple different tasks rather than a single particle simulation.
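The overlap-time metric in the abstract above can be made concrete with a small Python sketch. It assumes that a profiler reports one (start, end) timestamp pair per queue and per frame and that overlap is the intersection of the two intervals. The thesis's profiler may define the metric differently, and the function names and sample timings below are hypothetical.

```python
def overlap_time(compute_span, graphics_span):
    """Length of time during which both queues were busy.

    Each span is a (start, end) pair in milliseconds. This interval
    intersection is an assumed definition of the overlap metric.
    """
    c_start, c_end = compute_span
    g_start, g_end = graphics_span
    return max(0.0, min(c_end, g_end) - max(c_start, g_start))


def frame_span(compute_span, graphics_span):
    """Time from the first queue starting to the last queue finishing."""
    c_start, c_end = compute_span
    g_start, g_end = graphics_span
    return max(c_end, g_end) - min(c_start, g_start)


# Hypothetical timings, not measurements from the thesis.
# Sequential model: compute runs only after graphics has finished.
sequential = {"graphics": (0.0, 6.0), "compute": (6.0, 10.0)}
# Asynchronous model: compute runs while graphics is still busy.
asynchronous = {"graphics": (0.0, 6.0), "compute": (1.0, 5.0)}

for name, spans in (("sequential", sequential), ("asynchronous", asynchronous)):
    overlap = overlap_time(spans["compute"], spans["graphics"])
    total = frame_span(spans["compute"], spans["graphics"])
    print(f"{name}: overlap {overlap} ms, frame span {total} ms")
```

An overlap of zero suggests the workloads were serialized even though two queues were used, which is the kind of check the abstract describes using overlap time for.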
|
3 |
Text-image Restoration And Text Alignment For Multi-engine Optical Character Recognition Systems
Kozlovski, Nikolai 01 January 2006
Previous research showed that combining the results of three different optical character recognition (OCR) engines (ExperVision® OCR, Scansoft OCR, and Abbyy® OCR) using voting algorithms yields a higher accuracy rate than any of the engines achieves individually. While a voting algorithm has been realized, several aspects of automating the process and improving the accuracy rate needed further research. This thesis focuses on morphological image preprocessing and morphological restoration of the text images that are fed to the OCR engines. The method is similar to the one used in the restoration of partial fingerprints. Series of morphological dilation and erosion filters with various mask shapes and sizes were applied to text of different font sizes and types with various kinds of noise added. These images were then processed by the OCR engines, and based on the results, successful combinations of text, noise, and filters were chosen. The thesis also deals with the problem of text alignment. Each OCR engine has its own way of dealing with noise and corrupted characters; as a result, the output texts of the OCR engines differ in length and number of words. This in turn makes it impossible to use spaces as delimiters to separate the words for processing by the voting part of the system. Text alignment determines, using various techniques, which words are extra, which single words should be two or more words, which words are missing in one document compared with the other, and so on. The alignment algorithm consists of a series of shifts of the two texts to determine which parts are similar and which are not. Since errors made by OCR engines are due to visual misrecognition, a technique was developed that, in addition to simple character comparison (equal or not), allows characters to be compared based on how they look.
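The idea of comparing characters by appearance rather than by strict equality can be sketched in Python with a weighted edit distance. This is a standard weighted Levenshtein distance swapped in for illustration, not the shift-based alignment algorithm of the thesis. The look-alike table, the 0.25 substitution cost, and the function names are hypothetical.

```python
# Hypothetical table of visually confusable character pairs.
LOOKALIKE_PAIRS = {
    frozenset("l1"), frozenset("O0"), frozenset("S5"), frozenset("B8"),
}


def substitution_cost(a, b):
    """Cost of replacing a with b: look-alike characters are cheap to confuse."""
    if a == b:
        return 0.0
    if frozenset((a, b)) in LOOKALIKE_PAIRS:
        return 0.25  # assumed cost for a visual near-match
    return 1.0


def visual_distance(s, t):
    """Weighted Levenshtein distance between two OCR output strings."""
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        cur = [float(i)]
        for j, b in enumerate(t, 1):
            cur.append(min(
                prev[j] + 1.0,                          # delete a from s
                cur[j - 1] + 1.0,                       # insert b into s
                prev[j - 1] + substitution_cost(a, b),  # substitute or match
            ))
        prev = cur
    return prev[-1]
```

Under these assumed costs, `visual_distance("c1ear", "clear")` is smaller than `visual_distance("cxear", "clear")`, so a voting stage could treat the first pair as more likely to be the same word read differently by two engines.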
|