Global ETD Search

271	GPUHElib and DistributedHElib: Distributed Computing Variants of HElib, a Homomorphic Encryption Library Frame, Ethan Andrew 01 June 2015 (has links) (PDF) Homomorphic Encryption, an encryption scheme only developed in the last five years, allows for arbitrary operations to be performed on encrypted data. Using this scheme, a user can encrypt data and send it to an online service. The online service can then perform an operation on the data and generate an encrypted result. This encrypted result is then sent back to the user, who decrypts it. This decryption produces the same data as if the operation performed by the online service had been performed on the unencrypted data. This is revolutionary because it allows users to rely on online services, even untrusted online services, to perform operations on their data without the online service gaining any knowledge from their data. A prominent implementation of homomorphic encryption is HElib. While one is able to perform homomorphic encryption with this library, there are problems with it. It, like all other homomorphic encryption libraries, is slow relative to other encryption systems. Thus there is a need to speed it up. Because homomorphic encryption will be deployed on online services, many of them distributed systems, it is natural to modify HElib to utilize some of the tools that are available on them in an attempt to speed up run times. Thus two modified libraries were designed: GPUHElib, which utilizes a GPU, and DistributedHElib, which utilizes a distributed computing design. These designs were then tested against the original library to see if they provided any speedup. GPU Distributed computing Homomorphic encryption HElib Computer Sciences Information Security
272	Optimizing Lempel-Ziv Factorization for the GPU Architecture Ching, Bryan 01 June 2014 (has links) (PDF) Lossless data compression is used to reduce storage requirements, allowing for the relief of I/O channels and better utilization of bandwidth. The Lempel-Ziv lossless compression algorithms form the basis for many of the most commonly used compression schemes. General-purpose computing on graphics processing units (GPGPU) allows us to take advantage of the massively parallel nature of GPUs for computations other than their original purpose of rendering graphics. Our work targets the use of GPUs for general lossless data compression. Specifically, we developed and ported an algorithm that constructs the Lempel-Ziv factorization directly on the GPU. Our implementation bypasses the sequential nature of the LZ factorization and attempts to compute the factorization in parallel. By breaking down the LZ factorization into what we call the PLZ, we are able to outperform the fastest serial CPU implementations by up to 24x and perform comparably to a parallel multicore CPU implementation. To achieve these speeds, our implementation produced LZ factorizations that were on average only 0.01 percent greater than the optimal solution that could be computed sequentially. We are also able to reevaluate the fastest GPU suffix array construction algorithm, which is needed to compute the LZ factorization. We are able to find speedups of up to 5x over the fastest CPU implementations. GPU CUDA Parallelism Compression LZ77 Lempel-Ziv Computer and Systems Architecture
273	Physics Engine on the GPU with OpenGL Compute Shaders Bui, Quan Huy Minh 01 March 2021 (has links) (PDF) Any kind of graphics simulation can be thought of as a fancy flipbook. This notion is, of course, nothing new. For instance, in a game, the central processing unit (CPU) needs to process frame by frame, figuring out what is happening, and then finally issue draw calls to the graphics processing unit (GPU) to render the frame and display it on the monitor. Traditionally, the CPU has to process a lot of things: creating the window environment in which the processed frames are displayed, handling game logic, processing artificial intelligence (AI) for non-player characters (NPCs), computing the physics, and issuing draw calls; and all of these have to be done within roughly 0.0167 seconds to maintain real-time performance of 60 frames per second (fps). The main goal of this thesis is to move the physics pipeline of any kind of simulation to the GPU instead of the CPU. The main tool to make this possible is OpenGL Compute Shaders. OpenGL is a high-performance graphics application programming interface (API), used as an abstraction layer for the CPU to communicate with the GPU. OpenGL was created by the Khronos Group primarily for graphics, or drawing frames only. In the later versions of OpenGL, the Khronos Group has introduced the Compute Shader, which can be used for general-purpose computing on the GPU (GPGPU). This means the GPU can be used to process arbitrary math computations, and is not limited to processing the vertices and fragments of polygons. This thesis features Broad Phase and Narrow Phase collision detection stages, and a collision Resolution Phase with Sequential Impulses, entirely on the GPU with real-time performance. Physics Simulation Engine 3D OpenGL GPU Graphics and Human Computer Interfaces
274	Millipyde: A Cross-Platform Python Framework for Transparent GPU Acceleration Asbury, James B 01 December 2021 (has links) (PDF) The prevalence of general-purpose GPU computing continues to grow and tackle a wider variety of problems that benefit from GPU acceleration. This acceleration often suffers from a high barrier to entry, however, due to the complexity of software tools that closely map to the underlying GPU hardware, the fast-changing landscape of GPU environments, and the fragmentation of tools and languages that only support specific platforms. Because of this, new solutions will continue to be needed to make GPGPU acceleration more accessible to the developers that can benefit from it. AMD’s new cross-platform development ecosystem ROCm provides promise for developing applications and solutions that work across systems running both AMD and non-AMD GPU computing hardware. This thesis presents Millipyde, a framework for GPU acceleration in Python using AMD’s ROCm. Millipyde includes two new types, the gpuarray and gpuimage, as well as three new constructs for building GPU-accelerated applications: the Operation, Pipeline, and Generator. Using these tools, Millipyde hopes to make it easier for engineers and researchers to write GPU-accelerated code in Python. Millipyde also has the potential to schedule work across many GPUs in complex multi-device environments. These capabilities will be demonstrated in a sample application of augmenting images on-device for machine learning applications. Our results showed that Millipyde is capable of making individual image-related transformations up to around 200 times faster than their CPU-only equivalents. Constructs such as Millipyde’s Pipeline were also able to further improve performance in certain situations, and the Pipeline performed best when it was allowed to transparently schedule work across multiple devices. Parallel Programming GPGPU GPU Systems Programming Systems Architecture
275	Parallelising High Order Transform of Point Spread Function and Template Subtraction for Astronomic Image Subtraction: The implementation of BACH Wång, Annie, Lells, Victor January 2023 (has links) This thesis explores possible improvements, using parallel computing, to the PSF-alignment and image subtraction algorithm found in HOTPANTS. In time-domain astronomy the PSF-alignment and image subtraction algorithm OIS is used to identify transient events. HOTPANTS is a software package based on OIS, the software package ISIS, and other subsequent research done to improve OIS. A parallel GPU implementation of the algorithm from HOTPANTS, henceforth known as BACH, was created for this thesis. The goal of this thesis is to answer the questions: “what parts of HOTPANTS are most suited for parallelisation?” and “how does BACH perform compared to HOTPANTS and SFFT?”, another PSF-alignment and image subtraction tool. The authors found that the parts most susceptible to parallelisation were the convolution and subtraction steps. However, the subtraction did not display a significant improvement over its sequential counterpart. The other parts of HOTPANTS were deemed too complex to implement in parallel on the GPU. However, some parts could probably either be partly parallelised on the GPU or parallelised using the CPU. BACH was always as fast as or faster than HOTPANTS; it was generally 2 times faster, but was up to 4.5 times faster in some test cases. It was also faster than SFFT, but this result was not equivalent to the result presented in [15], which is why the authors of this thesis believe something was wrong with either the installation of SFFT or the hardware used to test it. Image Subtraction HOTPANTS SFFT GPU Parallelisation Computer Engineering Datorteknik
276	Autonomous Path-Following by Approximate Inverse Dynamics and Vector Field Prediction Gerlach, Adam R. 23 October 2014 (has links) No description available. Aerospace Materials UAV Navigation Control Prediction GPU Vector Field
277	A Fast Poisson Solver with Periodic Boundary Conditions for GPU Clusters in Various Configurations Rattermann, Dale N. 27 October 2014 (has links) No description available. Aerospace Materials GPU Poisson Incompressible CFD FFT CUDA
278	Sparse Matrix-Vector Multiplication on GPU Ashari, Arash January 2014 (has links) No description available. Computer Engineering Computer Science GPU CUDA Sparse SpMV BRC ACSR
279	Architectural Solutions For Mitigating Voltage Noise in GPUs Thomas, Renji George George January 2015 (has links) No description available. Computer Engineering Computer Science
280	Graphic-Processing-Units Based Adaptive Parameter Estimation of a Visual Psychophysical Model Gu, Hairong 17 December 2012 (has links) No description available. Psychology Psychophysics Adaptive Design Optimization GPU computing parameter estimation
