Global ETD Search

11	Exploring the dynamic radio sky with many-core high-performance computing Malenta, Mateusz January 2018 As new radio telescopes and processing facilities are built, the amount of data that has to be processed grows continuously. This poses significant challenges, especially when real-time processing is required, which is important for surveys looking for poorly understood objects such as Fast Radio Bursts (FRBs), where quick detection and localisation can enable rapid follow-up observations at different frequencies. With data rates increasing all the time, new processing techniques using the newest hardware, such as GPUs, have to be developed. A new pipeline, called PAFINDER, has been developed to process data taken with a phased array feed, which can generate up to 36 beams on the sky, with data rates of 25 GBps per beam. With the majority of the work done on GPUs, the pipeline reaches real-time performance when generating filterbank files used for offline processing. Full real-time processing, including single-pulse searches, has also been implemented and has been shown to perform well under favourable conditions. The pipeline was successfully used to record and process data containing observations of RRAT J1819-1458 and positions on the sky where 3 FRBs have been observed previously, including the repeating FRB121102. Detailed examination of J1819-1458 single-pulse detections revealed a complex emission environment with pulses coming from three different rotation phase bands and a number of multi-component emissions. No new FRBs and no repeated bursts from FRB121102 have been detected. The GMRT High Resolution Southern Sky survey observes the sky at high galactic latitudes, searching for new pulsars and FRBs. 127 hours of data have been searched for the presence of any new bursts, with the help of a new pipeline developed for this survey.
No new FRBs have been found, which may be the result of severe RFI pollution that was not fully removed, despite new techniques being developed and combined with existing solutions to mitigate these negative effects. Using the best estimates of the total amount of data that has been processed correctly, obtained using new single-pulse simulation software, the lack of detections was found to be consistent with the expected rates for standard candle FRBs with a flat or positive spectrum. 500
12	Analysis Of Single Phase Fluid Flow And Heat Transfer In Slip Flow Regime By Parallel Implementation Of Lattice Boltzmann Method On Gpus Celik, Sitki Berat 01 September 2012 In this thesis, fluid flow and heat transfer in two-dimensional microchannels are studied numerically. A computer code based on the Lattice Boltzmann Method (LBM) is developed for this purpose. The code is written using MATLAB and Jacket software and has the important feature of being able to run in parallel on Graphics Processing Units (GPUs). The code is used to simulate flow and heat transfer inside micro and macro channels. The velocity profiles and Nusselt numbers obtained are compared with Navier-Stokes based analytical and numerical results available in the literature, and good matches are observed. Slip velocity and temperature jump boundary conditions are used for the microchannel simulations, with Knudsen number values covering the slip flow regime. The speed of the parallel version of the code running on GPUs is compared with that of the serial version running on a CPU, and for large enough meshes a speedup of more than 14 times is observed. QC Acoustics, Sound 221-246
13	Efficient and Private Processing of Analytical Queries in Scientific Datasets Kumar, Anand 01 January 2013 Large amounts of data are generated by applications used in basic-science research and development. The size of the data introduces great challenges in storage, analysis and preserving privacy. This dissertation proposes novel techniques to analyze the data efficiently and to reduce storage space requirements through a data compression technique, while preserving privacy and providing data security. We present an efficient technique to compute an analytical query called the spatial distance histogram (SDH) using spatiotemporal properties of the data. Special spatiotemporal properties present in the data are exploited to process SDH efficiently on the fly. General purpose graphics processing units (GPGPU or just GPU) are employed to further boost the performance of the algorithm. The size of the data generated in scientific applications poses problems of disk space requirements, input/output (I/O) delays and data transfer bandwidth requirements. These problems are addressed by applying the proposed compression technique. We also address the issue of preserving privacy and security in scientific data by proposing a security model. The security model monitors user queries submitted to the database that stores and manages scientific data. The outputs of user queries are also inspected to detect privacy breaches. Privacy policies are enforced by the monitor to allow only those queries and results that satisfy the policies specified by the data owner. Big Data Compression Edit Automata GPU Computing Molecular Simulations Parallel Processing Computer Engineering Computer Sciences Engineering
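The spatial distance histogram mentioned in entry 13 counts how many point pairs fall into each distance bucket. The following is a minimal brute-force sketch of that query, not the dissertation's algorithm: it ignores the spatiotemporal optimizations and the GPU, and the function name, the fixed bucket width, and the clamping of large distances into the last bucket are all assumptions made for illustration.

```python
import math
from itertools import combinations


def spatial_distance_histogram(points, bucket_width, num_buckets):
    """Brute-force SDH: count point pairs per distance bucket (O(n^2) pairs)."""
    hist = [0] * num_buckets
    for p, q in combinations(points, 2):
        d = math.dist(p, q)  # Euclidean distance, Python 3.8+
        # Assumption: distances beyond the last bucket edge are clamped into it.
        idx = min(int(d // bucket_width), num_buckets - 1)
        hist[idx] += 1
    return hist


# Hypothetical sample data: four 2-D points, buckets of width 2.0.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
hist = spatial_distance_histogram(points, bucket_width=2.0, num_buckets=3)
```

Every pair is handled independently of the others, which is one reason a query of this shape is a plausible candidate for GPU acceleration.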
14	Nonnegative matrix factorization for clustering Kuang, Da 27 August 2014 This dissertation shows that nonnegative matrix factorization (NMF) can be extended to a general and efficient clustering method. Clustering is one of the fundamental tasks in machine learning. It is useful for unsupervised knowledge discovery in a variety of applications such as text mining and genomic analysis. NMF is a dimension reduction method that approximates a nonnegative matrix by the product of two lower rank nonnegative matrices, and has shown great promise as a clustering method when a data set is represented as a nonnegative data matrix. However, challenges in the widespread use of NMF as a clustering method lie in its correctness and efficiency: first, we need to know why and when NMF can detect the true clusters and is guaranteed to deliver good clustering quality; second, existing algorithms for computing NMF are expensive and often take longer than other clustering methods. We show that the original NMF can be improved in both respects in the context of clustering. Our new NMF-based clustering methods can achieve better clustering quality and run orders of magnitude faster than the original NMF and other clustering methods. Like other clustering methods, NMF places an implicit assumption on the cluster structure. Thus, the success of NMF as a clustering method depends on whether the representation of data in a vector space satisfies that assumption. Our approach to extending the original NMF to a general clustering method is to switch from the vector space representation of data points to a graph representation. The new formulation, called Symmetric NMF, takes a pairwise similarity matrix as an input and can be viewed as a graph clustering method. We evaluate this method on document clustering and image segmentation problems and find that it achieves better clustering accuracy.
In addition, for the original NMF, it is difficult but important to choose the right number of clusters. We show that consensus NMF, which is widely used in genomic analysis for choosing the number of clusters, has critical flaws and can produce misleading results. We propose a variation of the prediction strength measure arising from statistical inference to evaluate the stability of clusters and select the right number of clusters. Our measure shows promising performance in artificial simulation experiments. Large-scale applications bring substantial efficiency challenges to existing algorithms for computing NMF. An important example is topic modeling, where users want to uncover the major themes in a large text collection. Our strategy for accelerating NMF-based clustering is to design algorithms that better suit the computer architecture and exploit the computing power of parallel platforms such as graphics processing units (GPUs). A key observation is that applying rank-2 NMF, which partitions a data set into two clusters, in a recursive manner is much faster than applying the original NMF to obtain a flat clustering. We take advantage of a special property of rank-2 NMF and design an algorithm that runs faster than existing algorithms due to continuous memory access. Combined with a criterion to stop the recursion, our hierarchical clustering algorithm runs significantly faster and achieves even better clustering quality than existing methods. Another bottleneck of NMF algorithms, which is also a common bottleneck in many other machine learning applications, is multiplying a large sparse data matrix by a tall-and-skinny dense matrix. We use GPUs to accelerate this routine for sparse matrices with an irregular sparsity structure. Overall, our algorithm shows significant improvement over popular topic modeling methods such as latent Dirichlet allocation, and runs more than 100 times faster on data sets with millions of documents.
Nonnegative matrix factorization Cluster analysis Hierarchical clustering Cancer subtype discovery GPU computing Sparse matrix multiplication
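Entry 14 describes NMF as approximating a nonnegative matrix by the product of two lower-rank nonnegative matrices and then using the factors for clustering. The sketch below illustrates that basic idea only. It assumes the classic multiplicative-update rule of Lee and Seung, not the faster rank-2 or Symmetric NMF algorithms developed in the dissertation, and it assumes that columns of the data matrix are the data points. All function names and the sample matrix are hypothetical.

```python
import random


def matmul(a, b):
    """Dense matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor nonnegative v (n x m) as w (n x rank) times h (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # Update h: h <- h * (w^T v) / (w^T w h), element-wise.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # Update w: w <- w * (v h^T) / (w h h^T), element-wise.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def frobenius_error(v, w, h):
    """Squared Frobenius norm of v - w h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(v, approx) for a, b in zip(ra, rb))


def cluster_labels(h):
    """Assign each column (data point) to the factor with the largest coefficient."""
    return [max(range(len(h)), key=lambda k: h[k][j]) for j in range(len(h[0]))]


# Hypothetical data: two groups of columns that use disjoint sets of features.
v = [[5.0, 4.0, 0.0, 0.0],
     [4.0, 5.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 4.0],
     [0.0, 0.0, 4.0, 3.0]]
w, h = nmf(v, rank=2)
labels = cluster_labels(h)
```

The pure-Python matrix products keep the example self-contained; a practical implementation would rely on an optimized linear algebra library, and the sparse-times-dense product the abstract identifies as a bottleneck is exactly the kind of step that would be offloaded.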
15	MR-CUDASW - GPU accelerated Smith-Waterman algorithm for medium-length (meta)genomic data 2014 November 1900 The idea of using a graphics processing unit (GPU) for more than simply graphic output purposes has been around for quite some time in scientific communities. However, it is only recently that its benefits for a range of compute-intensive bioinformatics and life sciences tasks have been recognized. This thesis investigates the possibility of improving the performance of the overlap determination stage of an Overlap Layout Consensus (OLC)-based assembler by using a GPU-based implementation of the Smith-Waterman algorithm. In this thesis an existing GPU-accelerated sequence alignment algorithm is adapted and expanded to reduce its completion time. A number of improvements and changes are made to the original software. Workload distribution, query profile construction, and thread scheduling techniques implemented by the original program are replaced by custom methods specifically designed to handle medium-length reads. Accordingly, this algorithm is the first highly parallel solution that has been specifically optimized to process medium-length nucleotide reads (DNA/RNA) from modern sequencing machines (i.e. Ion Torrent). Results show that the software reaches up to 82 GCUPS (Giga Cell Updates Per Second) on a single-GPU graphics card running in commodity desktop hardware. As a result, it is the fastest GPU-based implementation of the Smith-Waterman algorithm tailored for processing medium-length nucleotide reads. Despite being designed for performing the Smith-Waterman algorithm on medium-length nucleotide sequences, this program also presents great potential for improving heterogeneous computing with CUDA-enabled GPUs in general and is expected to make contributions to other research problems that require sensitive pairwise alignment to be applied to a large number of reads.
Our results show that it is possible to improve the performance of bioinformatics algorithms by taking full advantage of the compute resources of the underlying commodity hardware. These results are especially encouraging since GPU performance grows faster than that of multi-core CPUs. Bioinformatics Sequence Alignment Smith-Waterman Algorithm GPU Computing CUDA Sequence Assembly Metagenomics Next-Generation-Sequencing
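Entry 15 builds on the Smith-Waterman local alignment algorithm and reports throughput in GCUPS. As a point of reference, here is a minimal sequential sketch of the scoring recurrence together with the GCUPS arithmetic. It is not MR-CUDASW: the linear gap penalty, the scoring values, and both function names are assumptions chosen for illustration.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local alignment score with a linear gap penalty (assumed scoring)."""
    prev = [0] * (len(target) + 1)  # previous row of the DP matrix
    best = 0
    for qc in query:
        curr = [0]  # column 0 is always zero for local alignment
        for j, tc in enumerate(target, start=1):
            diag = prev[j - 1] + (match if qc == tc else mismatch)
            # Each cell is the best of: restart at 0, align, or open a gap.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def gcups(query_len, target_len, seconds):
    """Giga cell updates per second for one query/target DP matrix."""
    return query_len * target_len / seconds / 1e9


# Hypothetical reads: the query occurs exactly inside the target.
score = smith_waterman_score("ACGT", "TTACGTTT")
```

Filling the matrix takes one cell update per (query, target) character pair, which is why GCUPS is computed from the product of the two lengths. Cells on the same anti-diagonal do not depend on each other, which is the usual source of parallelism in GPU implementations.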
16	Performance analysis of GPGPU and CPU on AES Encryption Neelap, Akash Kiran January 2014 Advancements in computing have led to a tremendous increase in the amount of data generated every minute, which needs to be stored or transferred while maintaining a high level of security. The military and armed forces today rely heavily on computers to store huge amounts of important and secret data, which is of great importance to national security. The standard AES encryption algorithm, which is at the heart of almost every application today, provides a high level of security but is time-consuming with the traditional sequential approach. Implementation of AES on GPUs has been an area of ongoing research for several years, but existing implementations are still either inefficient or incomplete and require optimization for better performance. Treating the limitations of previous work as a research gap, this paper aims to exploit efficient parallelism on the GPU and on a multi-core CPU in order to make a fair and reliable comparison. It also aims to deduce implementation techniques for multi-core CPUs and GPUs that can be used in future implementations. This paper experimentally examines the performance of a CPU and a GPGPU at different levels of optimization using Pthreads, CUDA and CUDA STREAMS. It critically explores the behaviour of a GPU for different granularity levels and different grid dimensions to examine the effect on performance. The results show considerable acceleration on an NVIDIA GPU (Quadro K4000) over single-threaded and multi-threaded implementations on a CPU (Intel® Xeon® E5-1650). AES algorithm CUDA GPU computing Pthreads Computer Sciences Datavetenskap (datalogi) Telecommunications Telekommunikation Software Engineering Programvaruteknik
17	Collective behaviour of model microswimmers Putz, Victor B. January 2010 At small length scales, low velocities, and high viscosity, the effects of inertia on motion through fluid become insignificant and viscous forces dominate. Microswimmer propulsion is therefore, of necessity, achieved through different means than those used by macroscopic organisms. We describe in detail the hydrodynamics of microswimmers consisting of colloidal particles and their interactions. In particular we focus on two-bead swimmers and the effects of asymmetry on collective motion, calculating analytical formulae for time-averaged pair interactions and verifying them with microscopic time-resolved numerical simulation, finding good agreement. We then examine the long-term effects of a swimmer's passing on a passive tracer particle, finding that the force-free nature of these microswimmers leads to loop-shaped tracer trajectories. Even in the presence of Brownian motion, the loop-shaped structures of these trajectories can be recovered by averaging over a large enough sample size. Finally, we explore the phenomenon of synchronisation between microswimmers through hydrodynamic interactions, using the method of constraint forces on a force-based swimmer. We find that the hydrodynamic interactions between swimmers can alter the relative phase between them such that phase-locking can occur over the long term, altering their collective motion. 530.4
18	Optimization of American option pricing through GPU computing / Optimering av prissättning av amerikanska optioner genom GPU-beräkningar Greinsmark, Hadar, Lindström, Erik January 2017 Over the last decades the market for financial derivatives has grown dramatically to values of global importance. With the digital automation of the markets, programs able to efficiently value financial derivatives have become key to market competitiveness and have thus garnered considerable interest. This report explores the potential efficiency gains of employing modern GPU computing technology to price financial options using the binomial option pricing model. The model is implemented on both CPU and GPU hardware and the results are compared in terms of computational efficiency. According to this thesis, GPU computing can considerably improve option pricing runtimes. finance options GPU GPGPU GPU computing binomial method BOPM CUDA Computer Sciences Datavetenskap (datalogi)
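Entry 18 prices American options with the binomial option pricing model. The sketch below shows one common form of that model on the CPU, assuming the Cox-Ross-Rubinstein parameterization and an American put. The abstract does not state which parameterization or option type the report uses, so the function name, its parameters, and the sample inputs are illustrative assumptions.

```python
import math


def american_put_binomial(spot, strike, rate, sigma, maturity, steps):
    """Price an American put on a Cox-Ross-Rubinstein binomial tree."""
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))   # up-move factor
    down = 1.0 / up                        # down-move factor
    disc = math.exp(-rate * dt)            # one-step discount factor
    p = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up probability

    # Payoff at expiry; node j has seen j up moves and (steps - j) down moves.
    values = [max(strike - spot * up ** j * down ** (steps - j), 0.0)
              for j in range(steps + 1)]

    # Backward induction: compare continuation value with early exercise.
    for step in range(steps - 1, -1, -1):
        for j in range(step + 1):
            continuation = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            exercise = strike - spot * up ** j * down ** (step - j)
            values[j] = max(continuation, exercise)
    return values[0]


# Hypothetical inputs: at-the-money put, 5% rate, 20% volatility, one year.
price = american_put_binomial(100.0, 100.0, 0.05, 0.2, 1.0, 200)
```

Within a single time step every node depends only on values from the following step, so the inner loop over `j` is the natural unit of parallel work, while the outer loop over time steps remains sequential.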
19	Reducing the Cost of Chemistry in Reactive-Flow Simulations: Novel Mechanism Reduction Strategies and Acceleration via Graphics Processing Units Niemeyer, Kyle Evan 21 February 2014 No description available. Mechanical Engineering Energy Combustion Detailed chemistry Transportation fuels Mechanism reduction GPU computing
20	Gravitational Microlensing: GPU-based Simulation Algorithms and the Information Content of Light Curves / Der Mikrogravitationslinseneffekt: GPU-basierte Simulationsalgorithmen und der Informationsgehalt von Lichtkurven Hundertmark, Markus Peter Gerhard 20 June 2011 No description available. 520 Astronomie Physics Mikrogravitationslinseneffekt Doppellinsenmodell Photometrie GPU Computing gravitational microlensing binary-lens model photometry GPU computing 39.22 TFA 000: Relativistische Astrophysik Gravitation THG 000: Sternmassen Sterndichten {Astronomie}
