491

Towards High Speed Aerial Tracking of Agile Targets

Rizwan, Yassir January 2011 (has links)
In order to provide a novel perspective for videography of high-speed sporting events, a highly capable trajectory tracking control methodology is developed for a custom-designed Kadet Senior Unmanned Aerial Vehicle (UAV). The accompanying high-fidelity system identification ensures that accurate flight models are used to design the control laws. A parallel vision-based target tracking technique is also demonstrated and implemented on a Graphics Processing Unit (GPU) to assist in real-time tracking of the target. Nonlinear control techniques such as feedback linearization require a detailed and accurate system model. This thesis discusses techniques for estimating these models using data collected during planned test flights. A class of methods known as Output Error Methods is discussed, with extensions for dealing with wind turbulence. The implementation of these methods on the Kadet Senior, including data acquisition details, is also discussed, and results for this UAV are provided. For comparison, additional results using data from a BAC-221 simulation are provided, along with typical results from the work done at the Dryden Flight Research Center. The proposed controller combines feedback linearization with linear tracking control using the internal model approach, and relies on a trajectory-generating exosystem. Three different aircraft models are presented, each with an increasing level of complexity, in an effort to identify the simplest controller that yields acceptable performance. The dynamic inversion and linear tracking control laws are derived for each model, and simulation results are presented for tracking of elliptical and periodic trajectories on the Kadet Senior.
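As a hedged illustration of the output-error idea (choose the model parameters that minimise the mismatch between measured and simulated outputs), the sketch below fits a hypothetical first-order model with SciPy. The model, input, parameters and noise level are invented for illustration and are unrelated to the Kadet Senior flight models identified in the thesis.

```python
# Minimal output-error fit on a hypothetical first-order model; the aircraft
# models in the thesis are far richer and include turbulence handling.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
dt, n = 0.02, 500
u = np.sin(0.5 * np.arange(n) * dt)            # recorded control input (hypothetical)
true_a, true_b = -1.5, 2.0                     # "unknown" parameters to recover

def simulate(params, u):
    """Forward-simulate y' = a*y + b*u with forward Euler."""
    a, b = params
    y = np.zeros_like(u)
    for k in range(1, len(u)):
        y[k] = y[k - 1] + dt * (a * y[k - 1] + b * u[k - 1])
    return y

# Synthetic "flight test" measurements: the true response plus sensor noise.
y_meas = simulate((true_a, true_b), u) + 0.01 * rng.standard_normal(n)

# Output-error method: pick the parameters that minimise the difference
# between measured and simulated outputs.
fit = least_squares(lambda p: simulate(p, u) - y_meas, x0=(-1.0, 1.0))
print(fit.x)                                   # estimates close to (-1.5, 2.0)
```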
492

Smoke Simulation On Programmable Graphics Hardware

Yildirim, Gokce 01 September 2005 (has links) (PDF)
Fluids such as smoke, water and fire are simulated both for Computer Graphics applications and for engineering fields such as Mechanical Engineering. Generally, fluid dynamics is used to achieve realistic-looking fluid simulations. However, the complexity of these calculations makes it difficult to achieve high performance. With advances in graphics hardware, it has become possible to provide programmability at both the vertex and the fragment level, which allows for faster simulations of complex fluids and other phenomena. In this thesis, one gaseous fluid, smoke, is simulated in three dimensions by solving the Navier-Stokes Equations (NSEs) using a semi-Lagrangian unconditionally stable method. The simulation is performed both on the Central Processing Unit (CPU) and on the Graphics Processing Unit (GPU). For programmability at the vertex and fragment level, C for Graphics (Cg), a platform-independent and architecture-neutral shading language, is used. Owing to the programmability and parallelism of the GPU, the smoke simulation on graphics hardware runs significantly faster than the corresponding CPU implementation. The test results demonstrate the higher performance of the GPU over the CPU for running three-dimensional fluid simulations.
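The semi-Lagrangian step at the heart of such an unconditionally stable solver can be sketched in a few lines. The following is a minimal CPU-side illustration with assumed grid sizes and velocities; a full smoke solver also needs diffusion, pressure projection and buoyancy, and the thesis implements these stages in Cg on the GPU.

```python
# One semi-Lagrangian advection step on a 2D grid: trace each grid point
# backwards along the velocity field and sample the old field there.
import numpy as np
from scipy.ndimage import map_coordinates

def advect(field, vx, vy, dt):
    """Advect a scalar field through velocity (vx, vy); grid spacing = 1."""
    ny, nx = field.shape
    ys, xs = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    back_y = ys - dt * vy                     # backtraced sample positions
    back_x = xs - dt * vx
    # Bilinear interpolation of the old field at the backtraced positions.
    return map_coordinates(field, [back_y, back_x], order=1, mode="nearest")

# Toy usage: a smoke blob carried by a uniform rightward wind.
smoke = np.zeros((64, 64)); smoke[28:36, 8:16] = 1.0
vx = np.full_like(smoke, 2.0); vy = np.zeros_like(smoke)
for _ in range(10):
    smoke = advect(smoke, vx, vy, dt=1.0)
```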
493

Molecular Dynamics on a Grand Scale: Towards large-scale atomistic simulations of self-assembling biomolecular systems

Matthew Breeze Unknown Date (has links)
To explore progressively larger biomolecular systems, methods to model explicit solvent cheaply are required. In this work, the use of Graphics Processing Units (GPUs), found in commodity video cards, for solving the constraints, calculating the non-bonded forces and generating the pair list for the fully constrained three-site SPC water model is investigated. It was shown that the GPU implementation of the SPC constraint-solving algorithm SETTLE was overall 26% faster than a conventional implementation running on a Central Processing Unit (CPU) core. The non-bonded forces were calculated up to 17 times faster than on a CPU core. Using these two approaches, an overall speed-up of around 4 times was found. The most successful implementation of the pair-list generation ran at 38% of the speed of a conventional grid-based implementation on a CPU core. In each investigation the accuracy was shown to be sufficient using a variety of numerical and distributional tests. Thus, the use of GPUs as parallel processors for molecular dynamics (MD) calculations is highly promising. Lastly, a method for calculating a constraint force analytically is presented.
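As a rough, CPU-side illustration of what a cutoff-limited non-bonded force evaluation involves, the sketch below computes Lennard-Jones pair forces with NumPy. The parameters are placeholders and this is not the SPC water kernel the thesis offloads to the GPU; it only shows the per-pair loop that GPU threads parallelise.

```python
# Cutoff-limited non-bonded (Lennard-Jones) forces, CPU-side sketch only.
import numpy as np

def lj_forces(pos, epsilon=1.0, sigma=1.0, cutoff=2.5):
    """Return per-particle Lennard-Jones forces for an (N, 3) position array."""
    n = len(pos)
    forces = np.zeros_like(pos)
    for i in range(n - 1):
        rij = pos[i] - pos[i + 1:]                 # vectors to all later particles
        r2 = np.einsum("ij,ij->i", rij, rij)
        mask = r2 < cutoff ** 2                    # pair list: only nearby pairs
        inv_r2 = sigma ** 2 / r2[mask]
        inv_r6 = inv_r2 ** 3
        # magnitude/r factor: multiplying by r_ij gives the force on particle i
        fmag = 24.0 * epsilon * (2.0 * inv_r6 ** 2 - inv_r6) / r2[mask]
        fij = fmag[:, None] * rij[mask]
        forces[i] += fij.sum(axis=0)
        forces[i + 1:][mask] -= fij                # Newton's third law
    return forces

forces = lj_forces(np.random.default_rng(1).random((100, 3)) * 10.0)
```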
494

Architectural simulation on highly parallel processing units

Στρίκος, Νικόλαος 11 January 2011 (has links)
The recent adoption of the parallel processing model in general-purpose microprocessors, with the inclusion of more than one core on the integrated circuit, has raised new demands on the simulation methods traditionally used to explore new architectures. In this work, a framework and a programming model are proposed that use the highly parallel CUDA architecture to accelerate the architectural simulation of cache coherence protocols.
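To make concrete what a cache coherence protocol simulator has to track, here is a toy, sequential MSI-protocol sketch in Python. It is purely illustrative: the class and method names are hypothetical, MSI is only the simplest textbook protocol, and the thesis parallelises this kind of per-line state bookkeeping with CUDA.

```python
# Toy MSI cache-coherence bookkeeping: the state a protocol simulator
# maintains for every cache line in every core's cache.
INVALID, SHARED, MODIFIED = "I", "S", "M"

class MSISimulator:
    def __init__(self, num_cores):
        self.state = [{} for _ in range(num_cores)]   # per-core: line -> state

    def read(self, core, line):
        if self.state[core].get(line, INVALID) == INVALID:
            # A remote Modified copy must be downgraded before sharing.
            for other in self.state:
                if other.get(line) == MODIFIED:
                    other[line] = SHARED
            self.state[core][line] = SHARED

    def write(self, core, line):
        # Invalidate every other copy, then take the line in Modified.
        for i, other in enumerate(self.state):
            if i != core and line in other:
                other[line] = INVALID
        self.state[core][line] = MODIFIED

sim = MSISimulator(num_cores=2)
sim.read(0, 0x40); sim.write(1, 0x40)
print(sim.state)   # core 0 holds line 0x40 Invalid, core 1 holds it Modified
```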
495

Delaunay triangulation: a GPU-based implementation and its use in real-time problems of computer vision and graphics

Βασιλείου, Πέτρος 01 February 2013 (has links)
A fast solver for the Delaunay Triangulation (DT) problem is one of the basic ingredients in many practical and scientific applications. Existing Graphics Processing Unit (GPU) based implementations of DT algorithms suffer from two serious drawbacks. The first is the dependence of the GPU algorithm on guidance from the CPU: although modern GPUs have high computational throughput, if feedback from the CPU is required as the algorithm evolves, the overhead caused by CPU-GPU communication can seriously degrade performance. The second, more serious, drawback is the dependence on the distribution of the input point set: most GPU-based implementations run optimally only on uniformly distributed point sets, which in many practical applications is not the case. We propose a new algorithm that does not suffer from these problems.
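For reference, the inputs and outputs of the Delaunay triangulation problem can be illustrated with the CPU-based implementation in SciPy. This is not the GPU algorithm proposed in the thesis, only a sketch of the interface such a solver exposes.

```python
# CPU-side reference for Delaunay triangulation using SciPy, shown only to
# illustrate the problem the thesis accelerates on the GPU.
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(42)
points = rng.random((1000, 2))          # a non-uniform point set would work as well
tri = Delaunay(points)

print(tri.simplices.shape)              # (num_triangles, 3) vertex indices
print(tri.find_simplex([[0.5, 0.5]]))   # which triangle contains a query point
```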
496

Accelerating digital forensic searching through GPGPU parallel processing techniques

Bayne, Ethan January 2017 (has links)
Background: String searching within a large corpus of data is a critical component of digital forensic (DF) analysis techniques such as file carving. The continuing increase in the capacity of consumer storage devices requires similar improvements in the performance of the string searching techniques employed by the DF tools used to analyse forensic data. As string searching is a trivially parallelisable problem, general-purpose graphics processing unit (GPGPU) approaches are a natural fit. Currently, only some of the research on GPGPU programming has been transferred to the field of DF, and that work has relied on a closed-source GPGPU framework, the Compute Unified Device Architecture (CUDA). These earlier studies found that the local storage devices from which forensic data are read present an insurmountable performance bottleneck. Aim: This research hypothesises that modern storage devices no longer present a performance bottleneck to the processing techniques currently used in the field, and proposes that an open-standards GPGPU framework, the Open Computing Language (OpenCL), would be better suited to accelerate file carving, with wider compatibility across an array of modern GPGPU hardware. This research further hypothesises that a modern multi-string searching algorithm may be better adapted to fulfil the requirements of DF investigation. Methods: This research presents a review of existing research and tools used to perform file carving and acknowledges related work within the field. To test the hypothesis, parallel file carving software was created using C# and OpenCL, employing both a traditional string searching algorithm and a modern multi-string searching algorithm to analyse forensic data. A set of case studies is given that demonstrates and evaluates the potential benefits of adopting various methods of string searching on forensic data. The research concludes with a final case study that evaluates the performance of file carving with the best-proposed string searching solution and compares the result with an existing file carving tool, Foremost. Results: The results establish that the parallelised OpenCL and Parallel Failureless Aho-Corasick (PFAC) algorithm solution delivers significant processing improvements when using single and multiple GPUs on modern hardware. In comparison with CPU approaches, GPGPU processing models were observed to minimise the time required to search for larger numbers of patterns. The results also show that employing PFAC delivers significant performance increases over the Boyer-Moore (BM) algorithm. The method used to read data from storage devices was also seen to have a significant effect on the time required to perform string searching and file carving. Conclusions: Empirical testing suggests that the proposed string searching method is more efficient than the widely adopted Boyer-Moore algorithms when applied to string searching and file carving. The developed OpenCL GPGPU processing framework was found to be more efficient than its CPU counterparts when searching for larger numbers of patterns within data. This research also refutes claims that file carving is solely limited by the performance of the storage device, and presents compelling evidence that performance is bound by the combination of the performance of the storage device and the processing technique employed.
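The failureless-trie idea behind PFAC can be sketched sequentially: build a trie of the patterns and start an independent, fallback-free traversal at every byte offset, which on the GPU maps naturally to one thread per offset. The sketch below is a plain Python illustration of that idea, not the OpenCL implementation developed in the thesis; the example patterns are arbitrary.

```python
# Sequential sketch of Parallel Failureless Aho-Corasick (PFAC): a trie of
# the search patterns plus one failureless traversal per start offset.
def build_trie(patterns):
    trie = [{}]                      # node 0 is the root; each node maps byte -> node
    output = {}                      # node index -> matched pattern
    for pat in patterns:
        node = 0
        for byte in pat:
            node = trie[node].setdefault(byte, len(trie))
            if node == len(trie):    # a fresh node was allocated
                trie.append({})
        output[node] = pat
    return trie, output

def pfac_search(data, trie, output):
    matches = []
    for start in range(len(data)):   # on the GPU: one thread per start offset
        node = 0
        for byte in data[start:]:
            node = trie[node].get(byte)
            if node is None:
                break                # failureless: give up, no fallback links
            if node in output:
                matches.append((start, output[node]))
    return matches

trie, out = build_trie([b"JFIF", b"PNG", b"PDF"])
print(pfac_search(b"....PNG....JFIF..", trie, out))   # [(4, b'PNG'), (11, b'JFIF')]
```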
497

Accelerating interpreted programming languages on GPUs with just-in-time compilation and runtime optimisations

Fumero Alfonso, Juan José January 2017 (has links)
Nowadays, most computer systems are equipped with powerful parallel devices such as Graphics Processing Units (GPUs). They are present in almost every computer system, including mobile devices, tablets, desktop computers and servers. These parallel systems have unlocked the possibility for many scientists and companies to process significant amounts of data in less time. However, using these parallel systems is very challenging due to their programming complexity. The most common programming languages for GPUs, such as OpenCL and CUDA, are created for expert programmers and require developers to know hardware details in order to use GPUs. However, many users of heterogeneous and parallel hardware, such as economists, biologists, physicists or psychologists, are not necessarily expert GPU programmers. They need to speed up their applications, which are often written in high-level and dynamic programming languages such as Java, R or Python. Little work has been done on generating GPU code automatically from these high-level interpreted and dynamic programming languages. This thesis presents a combination of a programming interface and a set of compiler techniques which enable the automatic translation of a subset of Java and R programs into OpenCL for execution on a GPU. The goal is to reduce the programmability and usability gaps between interpreted programming languages and GPUs. The first contribution is an Application Programming Interface (API) for programming heterogeneous and multi-core systems. This API combines ideas from functional programming and algorithmic skeletons to compose and reuse parallel operations. The second contribution is a new OpenCL Just-In-Time (JIT) compiler that automatically translates a subset of Java bytecode to GPU code. This is combined with a new runtime system that optimises data management and avoids data transformations between Java and OpenCL. This OpenCL framework and runtime system achieve speed-ups of up to 645x compared to Java, while remaining within a 23% slowdown of handwritten native OpenCL code. The third contribution is a new OpenCL JIT compiler for dynamic and interpreted programming languages. While the R language is used in this thesis, the developed techniques are generic for dynamic languages. This JIT compiler uniquely combines a set of existing compiler techniques, such as specialisation and partial evaluation, for OpenCL compilation, together with an optimising runtime that compiles and executes R code on GPUs. This JIT compiler for the R language achieves speed-ups of up to 1300x compared to GNU R, with a 1.8x slowdown compared to native OpenCL.
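A flavour of the skeleton-style programming interface described above can be sketched as composable map/reduce stages. The class and method names below are hypothetical; where the thesis JIT-compiles such pipelines from Java or R to OpenCL, this sketch simply executes them on the CPU.

```python
# Hypothetical algorithmic-skeleton style API: parallel operations are
# composed and reused as first-class values; a JIT compiler could fuse and
# translate the recorded stages to OpenCL instead of running them here.
from functools import reduce

class ArrayFunction:
    def __init__(self, stages=None):
        self.stages = stages or []          # list of ("map"/"reduce", op)

    def map(self, fn):
        return ArrayFunction(self.stages + [("map", fn)])

    def reduce(self, fn, init):
        return ArrayFunction(self.stages + [("reduce", (fn, init))])

    def apply(self, data):
        for kind, op in self.stages:
            if kind == "map":
                data = [op(x) for x in data]
            else:
                fn, init = op
                data = reduce(fn, data, init)
        return data

# Compose once, reuse on many inputs (and, in principle, many devices).
dot_with_itself = ArrayFunction().map(lambda x: x * x).reduce(lambda a, b: a + b, 0.0)
print(dot_with_itself.apply([1.0, 2.0, 3.0]))   # 14.0
```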
498

Limitation of fault currents and protection against neutral-conductor interruptions for aircraft at ground level

Wikman, Alexander January 2018 (has links)
A large share of today's aircraft receive their power supply at ground level through a four-wire system with a separate protective earth connection to the aircraft chassis. The complex loads that the aircraft present can give rise to fault currents in the protective earth conductor during normal operation. In this report, a system is developed to limit and avoid these fault currents by introducing a resistance between the neutral point and the protective earth conductor in the power supply units. One of the greatest risks with the aircraft power supply is an interruption of the neutral conductor, since sensitive equipment in the aircraft can then be destroyed very quickly. A method for detecting and warning of neutral-conductor interruptions has therefore been developed and is presented. A literature study was also carried out to ensure that the proposed solutions comply with the standards given for the Swedish, international and military aviation industries. The report presents recommended solutions in the form of system sketches. The solutions are based on installing an impedance between the neutral conductor and the protective bonding conductor in order to limit the fault currents in the protective bonding conductor. With current transformers installed on the neutral conductor and the protective bonding conductor, a Programmable Logic Controller (PLC) connected to a tripping contactor can protect against neutral-conductor interruptions and warn, via an indicator panel, of fault currents, the state of the protective earth connection and interruptions of the neutral conductor. During the project, a survey was carried out to determine whether any solutions already existed on the market. The survey shows that most manufacturers of ground power units offer protection against neutral-conductor interruptions as an option, by sending a low 50 Hz current through a 1 mm² auxiliary wire that is joined with the neutral conductor in the connection plug towards the aircraft. The method works, but the manufacturers advise against using it because of poor redundancy.
499

Correspondence-based pairwise depth estimation with parallel acceleration

Bartosch, Nadine January 2018 (has links)
This report covers the implementation and evaluation of a stereo-vision, correspondence-based depth estimation algorithm on a GPU. The results and feedback are used for a multi-view camera system, in combination with Jetson TK1 devices for parallelized image processing, whose aim is to estimate the depth of the scenery in front of it. The performance of the algorithm plays the key role. Alongside the implementation, the objective of this study is to investigate the advantages of parallel acceleration, inter alia the differences from execution on a CPU, which are significant for all the functions, the overheads imposed on a GPU application such as memory transfers from the CPU to the GPU and vice versa, as well as the challenges of real-time and concurrent execution. The study has been conducted with the aid of CUDA on three NVIDIA GPUs with different characteristics, and with the aid of knowledge gained through an extensive literature study of different depth estimation algorithms, stereo vision and correspondence, as well as CUDA in general. Using the full set of components of the algorithm while expecting (near) real-time execution is utopian in this setup and implementation; the slowing factors are, inter alia, the semi-global matching. Investigating alternatives shows that disparity maps of a certain accuracy are also achieved by local methods such as the Hamming distance alone, combined with a filter that refines the results. Furthermore, it is demonstrated that the kernel launch configuration and the usage of GPU memory types such as shared memory are crucial for GPU implementations and have an impact on the performance of the algorithm. Concurrency alone proves to be a more complicated task, especially in the desired way of realization. For future work and refinement of the algorithm it is therefore recommended to invest more time into further optimization possibilities in regard to shared memory and into integrating the algorithm into the actual pipeline.
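The local matching alternative mentioned above (a census transform followed by a Hamming-distance disparity search) can be sketched with NumPy as follows. The window size, disparity range and synthetic test image are assumptions, and the thesis implements this kind of kernel, plus refinement filtering, in CUDA rather than on the CPU.

```python
# CPU-side sketch of local stereo matching: census transform + Hamming cost.
import numpy as np

def census_transform(img, radius=2):
    """5x5 census: one bit per neighbour, set if the neighbour < centre pixel."""
    h, w = img.shape
    census = np.zeros((h, w), dtype=np.uint32)
    padded = np.pad(img, radius, mode="edge")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue
            neighbour = padded[radius + dy:radius + dy + h,
                               radius + dx:radius + dx + w]
            census = (census << 1) | (neighbour < img)
    return census

def disparity_map(left, right, max_disp=32):
    """Pick, per pixel, the disparity with the lowest Hamming cost."""
    cl, cr = census_transform(left), census_transform(right)
    h, w = left.shape
    costs = np.full((max_disp, h, w), 64, dtype=np.uint8)   # 64 = "no match"
    for d in range(max_disp):
        diff = cl[:, d:] ^ cr[:, :w - d]                    # XOR, then popcount
        costs[d, :, d:] = np.unpackbits(
            diff.view(np.uint8).reshape(h, w - d, 4), axis=-1).sum(axis=-1)
    return costs.argmin(axis=0)

left = np.random.default_rng(0).integers(0, 255, (120, 160)).astype(np.uint8)
right = np.roll(left, -8, axis=1)                           # synthetic 8-pixel shift
print(np.median(disparity_map(left, right)))                # close to 8
```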
500

Interactive out-of-core rendering and filtering of one billion stars measured by the ESA Gaia mission

Alsegård, Adam January 2018 (has links)
The purpose of this thesis was to visualize the 1.7 billion stars released by the European Space Agency as the second data release (DR2) of their Gaia mission in the open source software OpenSpace, at interactive frame rates, and to be able to filter the data in real time. An additional implementation goal was to streamline the data pipeline so that astronomers could use OpenSpace as a visualization tool in their research. An out-of-core rendering technique has been implemented where the data is streamed from disk during runtime. To be able to stream the data it first has to be read, sorted into an octree structure and then stored as binary files in a preprocessing step. The results of this report show that the entire DR2 dataset can be read from multiple files in a folder and stored as binary values in about seven hours. This step determines which values the user will be able to filter by and only has to be done once for a specific dataset. An octree can then be created in about 5 to 60 minutes, where the user can define whether the stars should be filtered by any of the previously stored values. Only values used in the rendering are stored in the octree. If the created octree fits in the computer's working memory, the entire octree is loaded asynchronously on start-up; otherwise only a binary file with the structure of the octree is read during start-up, while the actual star data is streamed from disk during runtime. When the data has been loaded it is streamed to the GPU. Only stars that are visible are uploaded, and the application also keeps track of which nodes have already been uploaded, to eliminate redundant updates. The inner nodes of the octree store the brightest stars of all their descendants as a level-of-detail cache that can be used when the nodes are small enough in screen space. The previous star rendering in OpenSpace has been improved by dividing the rendering phase into two passes: the first pass renders into a framebuffer object, while the second pass performs a tonemapping of the values. The rendering can be done either with billboard instancing or with point splatting; the latter is generally the faster alternative. The user can also switch between VBOs and SSBOs when updating the buffers; the latter is faster but requires OpenGL 4.3, which Apple products do not currently support. The rendering runs at interactive frame rates for both flat and curved screens, such as domes/planetariums. The user can also switch datasets during rendering, as well as the render technique, buffer objects, color settings and many other properties. It is also possible to turn time on and see the stars move with their calculated space velocity, or transverse velocity if the star lacks radial velocity measurements; the calculations omit the gravitational rotation. The purpose of the thesis has been fulfilled, as it is possible to fly through the entire DR2 dataset on a moderate desktop computer and filter the data in real time. However, the main contribution of the project may be that the groundwork has been laid in OpenSpace for astronomers to actually use it as a tool for visualizing their own datasets and for continuing to explore the coming Gaia releases.
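The level-of-detail scheme described above, where inner octree nodes cache the brightest stars of their subtree, can be sketched as follows. The node capacity and the star tuple layout are assumed values, and the real OpenSpace implementation additionally streams node files from disk and uploads them to the GPU.

```python
# Octree whose inner nodes keep the brightest stars of their whole subtree,
# so a coarse node can be drawn on its own when it covers little screen space.
import random

MAX_STARS_PER_NODE = 64                 # leaf capacity and LOD cache size (assumed)

class OctreeNode:
    def __init__(self, center, half_size):
        self.center, self.half_size = center, half_size
        self.stars = []                 # leaf: all stars; inner node: brightest stars
        self.children = None            # list of 8 children once the node is split

    def insert(self, star):             # star = (x, y, z, magnitude); lower = brighter
        if self.children is None:
            self.stars.append(star)
            if len(self.stars) > MAX_STARS_PER_NODE:
                self._split()
        else:
            self._child_for(star).insert(star)
            # Maintain the LOD cache with the brightest descendants only.
            self.stars = sorted(self.stars + [star], key=lambda s: s[3])[:MAX_STARS_PER_NODE]

    def _split(self):
        overflow, self.stars = self.stars, []
        h = self.half_size / 2.0
        self.children = [OctreeNode((self.center[0] + (h if i & 1 else -h),
                                     self.center[1] + (h if i & 2 else -h),
                                     self.center[2] + (h if i & 4 else -h)), h)
                         for i in range(8)]
        for star in overflow:
            self.insert(star)

    def _child_for(self, star):
        return self.children[(star[0] > self.center[0]) * 1 +
                             (star[1] > self.center[1]) * 2 +
                             (star[2] > self.center[2]) * 4]

# Usage: after inserting many stars, the root's cache holds the brightest ones.
root = OctreeNode(center=(0.0, 0.0, 0.0), half_size=1000.0)
for _ in range(10_000):
    root.insert((random.uniform(-1000, 1000), random.uniform(-1000, 1000),
                 random.uniform(-1000, 1000), random.uniform(-5, 20)))
print(len(root.stars), max(s[3] for s in root.stars))   # 64 brightest (smallest magnitudes)
```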
