  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Brain perfusion imaging : performance and accuracy

Zhu, Fan January 2013 (has links)
Brain perfusion-weighted images acquired using dynamic contrast studies have an important clinical role in acute stroke diagnosis and treatment decisions. The purpose of my PhD research is to develop novel methodologies for improving the efficiency and quality of brain perfusion-imaging analysis so that clinical decisions can be made more accurately and in a shorter time. This thesis consists of three parts.

First, my research investigates the possibilities that parallel computing offers to make perfusion-imaging analysis faster, so that results used in stroke diagnosis can be delivered earlier. Brain perfusion analysis using local Arterial Input Function (AIF) techniques takes a long time to execute due to its heavy computational load. As time is vitally important in the case of acute stroke, reducing analysis time, and therefore diagnosis time, can reduce the number of brain cells damaged and improve the chances of patient recovery. We present the implementation of a deconvolution algorithm for brain perfusion quantification on GPGPU (General-Purpose computing on Graphics Processing Units) using the CUDA programming model. Our method aims to accelerate the process without any quality loss.

Second, specific features of perfusion source images are used to reduce the impact of noise, which consequently improves the accuracy of hemodynamic maps. The majority of existing approaches for denoising CT images are optimized for 3D (spatial) information, including spatial decimation (spatially weighted mean filters) and techniques based on wavelet and curvelet transforms. However, perfusion imaging data is 4D, as it also contains temporal information. Our approach, based on Gaussian process regression (GPR), makes use of the temporal information in the perfusion source images to reduce the noise level.
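The thesis's full GPR model is not reproduced here, but its core idea, smoothing each voxel's time-concentration curve along the temporal axis rather than mixing neighbouring voxels spatially, can be sketched with a plain Gaussian-kernel smoother. The function name and length-scale below are illustrative stand-ins, not the thesis's calibrated model:

```python
import math

def smooth_curve(curve, length_scale=2.0):
    """Smooth one voxel's time-concentration curve along the temporal axis.

    A Gaussian-weighted average over neighbouring time points; a simple
    stand-in for the posterior mean of a GP with a squared-exponential
    kernel. No spatial neighbours are touched, so edges are preserved.
    """
    n = len(curve)
    smoothed = []
    for t in range(n):
        weights = [math.exp(-0.5 * ((t - s) / length_scale) ** 2) for s in range(n)]
        total = sum(weights)
        smoothed.append(sum(w * c for w, c in zip(weights, curve)) / total)
    return smoothed
```

Because each output value is a convex combination of the raw samples, oscillations in a noisy curve are damped while a constant baseline passes through unchanged.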
Over the entire image, our noise reduction method based on Gaussian process regression gains a 99% contrast-to-noise ratio improvement over the raw image and also improves the quality of hemodynamic maps, allowing a better identification of edges and detailed information. At the level of individual voxels, GPR provides a stable baseline, helps identify key parameters from tissue time-concentration curves and reduces the oscillations in the curves. Furthermore, the results show that GPR is superior to the alternative techniques compared in this study.

Third, my research explores automatic segmentation of perfusion images into potentially healthy areas and lesion areas, which can be used as additional information that assists in clinical diagnosis. Since perfusion source images contain more information than hemodynamic maps, good utilisation of source images leads to better understanding than the hemodynamic maps alone. Correlation coefficient tests are used to measure the similarities between the expected tissue time-concentration curves (from reference tissue) and the measured time-concentration curves (from target tissue). This information is then used to distinguish tissues at risk and dead tissues from healthy tissues. A correlation coefficient based signal analysis method that directly spots suspected lesion areas from perfusion source images is presented. Our method delivers a clear automatic segmentation of healthy tissue, tissue at risk and dead tissue. From our segmentation maps, it is easier to identify lesion boundaries than using traditional hemodynamic maps.
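The correlation-coefficient test at the heart of this segmentation can be illustrated as follows; the thresholds and class labels here are illustrative placeholders, not the thesis's calibrated values:

```python
import math

def pearson(x, y):
    """Pearson correlation between two time-concentration curves."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def classify_voxel(measured, reference, healthy_r=0.8, risk_r=0.4):
    """Compare a target voxel's curve against a reference-tissue curve.

    High correlation: contrast passage resembles healthy tissue.
    Intermediate: tissue at risk. Low or negative: dead tissue.
    """
    r = pearson(measured, reference)
    if r >= healthy_r:
        return "healthy"
    if r >= risk_r:
        return "at risk"
    return "dead"
```

A voxel whose curve is a scaled copy of the reference curve correlates perfectly, while a curve with no coherent contrast passage falls below the lower threshold.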
22

Parallelization of virtual machines for efficient multi-processor emulation

Chakravarthy, Ramapriyan Sreenath 09 November 2010 (has links)
Simulation is an essential tool for computer systems research. The speed of the simulator has a first-order effect on what experiments can be run. The recent shift in the industry to multi-core processors has put even more pressure on simulation performance, especially since most simulators are single-threaded. In this thesis, we present QEMU-MP, a parallelized version of a fast functional simulator called QEMU.
23

Parallelization of a software based intrusion detection system - Snort

Zhang, Huan January 2011 (has links)
Computer networks are already ubiquitous in people's lives and work, and network security is becoming a critical concern. A simple firewall, which can only inspect the bottom four OSI layers, cannot satisfy all security requirements. An intrusion detection system (IDS) with deep packet inspection, which can filter all seven OSI layers, is becoming necessary for more and more networks. However, the processing throughput of IDSs is far behind current network speeds. People have begun to improve the performance of IDSs by implementing them on different hardware platforms, such as Field-Programmable Gate Arrays (FPGAs) or special-purpose network processors. Nevertheless, all of these options are either less flexible or more expensive to deploy. This research focuses on the possibilities of implementing a parallelized IDS in a general computing environment based on Snort, currently the most popular open-source IDS. In this thesis, possible methods for parallelizing the pattern-matching engine on a multicore computer are analyzed. However, owing to the small granularity of network packets, the pattern-matching engine of Snort turns out to be unsuitable for parallelization. In addition, a pipelined structure for Snort has been implemented and analyzed. The universal packet-capture API, LibPCAP, has been modified with a new feature that captures a packet directly into an external buffer. With this change, the pipelined Snort achieves a performance improvement of up to 60% for jumbo frames on an Intel i7 multicore computer. A primary limitation is memory bandwidth: with a higher bandwidth, the performance of the parallelization could be improved further.
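The pipelined decomposition, one stage capturing packets into a shared buffer while another runs detection, can be sketched as a producer-consumer pair. This is an illustrative model of the structure, not Snort's or LibPCAP's actual code, and the naive substring match stands in for Snort's pattern-matching engine:

```python
import queue
import threading

def run_pipeline(packets, rules):
    """Two-stage pipeline: a capture stage feeds a detection stage.

    Mirrors the idea of capturing packets directly into an external
    buffer so that capture and pattern matching can overlap in time.
    """
    buf = queue.Queue(maxsize=64)   # the shared external buffer
    alerts = []

    def capture():
        for pkt in packets:
            buf.put(pkt)
        buf.put(None)               # sentinel: no more packets

    def detect():
        while True:
            pkt = buf.get()
            if pkt is None:
                break
            for rule in rules:      # naive substring match stands in
                if rule in pkt:     # for the real matching engine
                    alerts.append((rule, pkt))

    threads = [threading.Thread(target=capture), threading.Thread(target=detect)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return alerts
```

With a single detector thread, alerts are produced in packet order; the bounded queue is what couples the two stages' throughput, which is where a memory-bandwidth ceiling would show up.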
24

The Majo programming language : Creation and analysis of a programming language for parallelization

Nilsson, Joel January 2018 (has links)
It is well known that parallelization of software is a difficult problem. This project aimed to research a possible solution by creating a programming language for parallelization and subsequently analyzing its syntax and semantics. The analysis consisted of readability and writability tests followed by a subjective discussion from the author's point of view. The project resulted in the Majo programming language. Majo uses a graph-based concurrency model with implicit synchronization of shared data. The model is integrated into the language's design, making it easier to use. The analysis of the language showed that integrating the threading model simplifies the writing of parallel software. However, several syntactic elements could be improved upon, especially regarding communication between concurrently executing threads. In conclusion, the author believes that the way forward for parallel programming is to make programming languages more human-centric and to design syntax that intuitively expresses the underlying semantics.
25

Simulação geoestatística utilizando múltiplos passeios aleatórios

Caixeta, Rafael Moniz January 2015 (has links)
Geostatistical simulation comprises a variety of techniques that generate multiple scenarios reproducing the spatial continuity and the histogram of the phenomenon of interest (grades, for instance). These methods can be used in decision making, giving access to the uncertainty of response functions (which depend on the simulated inputs), commonly through a non-linear relationship (net present value, interest tax return, ore geometallurgical recovery…). However, one of their limitations is that running simulations can take considerable processing time on large deposits or large grids. The motivation of this dissertation therefore focuses on this fact, leading to its main goal: investigating an alternative to accelerate the simulation process. The option chosen is based on the development and adaptation of Multiple Random Walk Simulation, an algorithm for building geostatistical simulations. It combines kriging with the simulation of independent random walks in order to generate simulated scenarios faster than traditional simulation algorithms. This dissertation presents details of the method and important new contributions developed to improve its performance and statistical reproduction. Dedicated software was also developed to allow simple, practical and fast use of the method in any condition (2D or 3D). Case studies were carried out to check the validity of the simulations; they showed good reproduction of histograms and variograms, in addition to a considerable speed gain, achieving an acceleration of up to 5.65x (in comparison with Turning Bands Simulation) in the simulation of a 3D iron deposit, a performance that can be enhanced further as more conditioning samples become available.
26

A Case Study of Semi-Automatic Parallelization of Divide and Conquer Algorithms Using Invasive Interactive Parallelization

Hansson, Erik January 2009 (has links)
Since computers supporting parallel execution have become more and more common in recent years, especially on the consumer market, the need for methods and tools for parallelizing existing sequential programs has greatly increased. Today there exist different methods of achieving this, in more or less user-friendly ways. We have looked at one method, Invasive Interactive Parallelization (IIP), applied to a particular problem area, divide and conquer algorithms, and performed a case study. The case study shows that by using IIP, sequential programs can be parallelized for both shared- and distributed-memory machines. We have focused on parallelizing Quick Sort for OpenMP and MPI environments using a tool, Reuseware, which is based on the concepts of Invasive Software Composition.
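The IIP tool itself cannot be reproduced here, but the parallelization pattern it applies to Quick Sort, running the two recursive halves as concurrent tasks down to a cutoff depth, can be sketched as follows. The function names and the depth cutoff are illustrative, not taken from the thesis:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_quicksort(xs, pool, depth=2):
    """Divide-and-conquer quicksort: recursive halves run as parallel
    tasks until `depth` is exhausted, then recursion proceeds serially."""
    if len(xs) <= 1:
        return list(xs)
    pivot = xs[len(xs) // 2]
    lo = [x for x in xs if x < pivot]
    eq = [x for x in xs if x == pivot]
    hi = [x for x in xs if x > pivot]
    if depth > 0:
        f_lo = pool.submit(parallel_quicksort, lo, pool, depth - 1)
        f_hi = pool.submit(parallel_quicksort, hi, pool, depth - 1)
        return f_lo.result() + eq + f_hi.result()
    return parallel_quicksort(lo, pool, 0) + eq + parallel_quicksort(hi, pool, 0)

# The pool needs enough workers for the whole task tree (2**(depth+1) - 2
# tasks), since parent tasks block while waiting on their children.
with ThreadPoolExecutor(max_workers=8) as pool:
    result = parallel_quicksort([5, 3, 8, 1, 9, 2, 7], pool)
```

The same shape maps to OpenMP tasks on a shared-memory machine or to MPI ranks on a distributed one; only the task-spawning mechanism changes.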
27

An Adaptive Mesh MPI Framework for Iterative C++ Programs

Silva, Karunamuni Charuka 23 March 2009 (has links)
Computational Science and Engineering (CSE) applications often exhibit the pattern of adaptive mesh applications. An adaptive mesh algorithm starts with a coarse base-level grid structure covering the entire computational domain. As the computation intensifies, individual grid points are tagged for refinement, and tagged grid points are dynamically overlaid with finer grids. Similarly, if the level of refinement in a cell is greater than required, such regions are replaced with coarser grids. These refinements proceed recursively. We have developed an object-oriented framework enabling developers of time-stepped adaptive mesh applications to convert their sequential applications to MPI applications in a few easy steps. We present in this thesis our positive experience converting such an application using our framework. In addition to the MPI support, the framework performs grid expansion/contraction and load balancing, making the application developer's life easier.
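The tag-and-refine cycle described above can be sketched in one dimension; `needs_refinement` stands in for the application's tagging criterion, and all names are illustrative rather than the framework's actual API:

```python
def adapt(cells, needs_refinement, max_level):
    """One adaptation pass over a 1-D grid.

    Each cell is (left, right, level). A tagged cell is replaced by two
    finer children, recursively, up to `max_level`; untagged cells are
    kept as-is. A full framework would also re-coarsen over-refined
    regions and rebalance the cells across MPI ranks.
    """
    out = []
    for left, right, level in cells:
        if level < max_level and needs_refinement(left, right):
            mid = 0.5 * (left + right)
            out.extend(adapt([(left, mid, level + 1), (mid, right, level + 1)],
                             needs_refinement, max_level))
        else:
            out.append((left, right, level))
    return out

# Refine wherever the cell touches a "feature" at x = 0.3.
grid = adapt([(0.0, 1.0, 0)], lambda l, r: l <= 0.3 <= r, max_level=3)
```

The resulting grid is finest around the feature and stays coarse elsewhere, which is exactly the load imbalance that the framework's balancing step would then address.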
28

Faster Optimal Design Calculations for Practical Applications

Strömberg, Eric January 2011 (has links)
PopED is a software package developed by the Pharmacometrics Research Group at the Department of Pharmaceutical Biosciences, Uppsala University, written mainly in MATLAB. It uses pharmacometric population models to describe the pharmacokinetics and pharmacodynamics of a drug and then estimates an optimal design for a trial of that drug. With optimization calculations on average taking a very long time, it was desirable to increase the calculation speed of the software by parallelizing the serial calculation script. The goal of this project was to investigate different methods of parallelization and to implement the method that seemed best for the circumstances. The parallelization was implemented in C/C++ using Open MPI and tested on the UPPMAX Kalkyl high-performance computing cluster. Some alterations were made to the original MATLAB script to adapt PopED to the new parallel code. The methods that were parallelized included the Random Search and Line Search algorithms. Testing showed a significant performance increase, with effectiveness per active core ranging from 55% to 89% depending on the model and the number of evaluated designs.
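Parallelizing a Random Search step follows a simple pattern: score many candidate designs concurrently and keep the best. A sketch of that pattern, with illustrative names rather than PopED's actual interface, and a toy criterion in place of a real design objective:

```python
import random
from concurrent.futures import ThreadPoolExecutor

def parallel_random_search(objective, sample_design, n_candidates, workers=4):
    """Draw candidate designs, score them in parallel, return the best.

    `objective` plays the role of a design criterion (e.g. a function of
    the Fisher information matrix); here it is any function to maximise.
    Scoring is embarrassingly parallel, which is why this step responds
    so well to MPI-style parallelization.
    """
    candidates = [sample_design() for _ in range(n_candidates)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(objective, candidates))
    best = max(range(n_candidates), key=scores.__getitem__)
    return candidates[best], scores[best]

# Toy use: find a sampling time near t = 2.0 under a dummy criterion.
rng = random.Random(1)
design, score = parallel_random_search(
    objective=lambda t: -(t - 2.0) ** 2,
    sample_design=lambda: rng.uniform(0.0, 4.0),
    n_candidates=32,
)
```

Since each candidate evaluation is independent, per-core effectiveness is limited mainly by how evenly the evaluations are distributed and by their varying cost across designs.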
29

A Run-Time Loop Parallelization Technique on Shared-Memory Multiprocessor Systems

Wu, Chi-Fan 06 July 2000 (has links)
High-performance computing power is important for the advanced calculations of current scientific applications. A multiprocessor system obtains its high performance from the fact that some computations can proceed in parallel. A parallelizing compiler can take a sequential program as input and automatically translate it into a parallel form for the target multiprocessor system. But for loops over arrays with irregular, nonlinear or dynamic access patterns, no current parallelizing compiler can determine at compile time whether data dependences exist. Thus a run-time parallel algorithm must be utilized to determine the dependences and extract the potential parallelism of such loops. In this thesis, we propose an efficient run-time parallelization technique to compute a proper parallel execution schedule for those loops. The new method first detects the immediate predecessor iterations of each loop iteration and constructs an immediate predecessor table, then efficiently schedules the loop iterations into wavefronts for parallel execution. Both theoretical analysis and experimental results show that the new run-time parallelization technique achieves high speedup and low processing overhead. Furthermore, the technique is well suited to implementation on multiprocessor systems owing to its high scalability.
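The two phases described, building an immediate-predecessor table and then grouping iterations into wavefronts, can be sketched as follows. The dependence pattern in the example is illustrative, not from the thesis:

```python
def build_wavefronts(n_iterations, predecessors):
    """Assign each loop iteration to a wavefront.

    `predecessors(i)` returns the iterations whose results iteration i
    needs; since loop-carried dependences point backwards, one forward
    pass suffices. Iterations in the same wavefront have no mutual
    dependences and may execute in parallel; wavefronts run in order.
    """
    level = [0] * n_iterations
    for i in range(n_iterations):
        level[i] = 1 + max((level[p] for p in predecessors(i)), default=-1)
    wavefronts = {}
    for i, lv in enumerate(level):
        wavefronts.setdefault(lv, []).append(i)
    return [wavefronts[lv] for lv in sorted(wavefronts)]

# Example: iteration i reads what iteration i - 3 wrote,
# as in the loop body `a[i] = a[i - 3] + 1`.
schedule = build_wavefronts(9, lambda i: [i - 3] if i >= 3 else [])
```

At run time, the predecessor table would be built by inspecting the actual index arrays; the scheduling pass itself is the same.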
30

Parallel algorithms for inductance extraction

Mahawar, Hemant 17 September 2007 (has links)
In VLSI circuits, signal delays play an important role in design, timing verification and signal integrity checks. These delays are attributed to the presence of parasitic resistance, capacitance and inductance. With increasing clock speeds and shrinking feature sizes, these delays will be dominated by parasitic inductance. In next-generation VLSI circuits, with millions of components and interconnect segments, fast and accurate inductance estimation becomes a crucial step. A generalized approach to inductance extraction requires the solution of a large, dense, complex linear system that models mutual inductive effects among circuit elements. Iterative methods are used to solve the system without explicit computation of the system matrix itself. Fast hierarchical techniques are used to compute approximate matrix-vector products with the dense system matrix in a matrix-free way. Because the system matrix is never available explicitly, constructing a preconditioner to accelerate the convergence of the iterative method becomes a challenging task. This work presents a class of parallel algorithms for fast and accurate inductance extraction of VLSI circuits. We use the solenoidal basis approach, which converts the linear system into a reduced system. The reduced system of equations is solved by a preconditioned iterative solver that uses fast hierarchical methods to compute products with the dense coefficient matrix. A Green's function based preconditioner is proposed that achieves near-optimal convergence rates in several cases. By formulating the preconditioner as a dense matrix similar to the coefficient matrix, we are able to use fast hierarchical methods for the preconditioning step as well. Experiments on a number of benchmark problems highlight the efficient preconditioning scheme and its advantages over FastHenry. To further reduce the solution time of the software, we have developed a parallel implementation.
The parallel software package is capable of analyzing interconnect configurations involving several conductors within reasonable time. A two-tier parallelization scheme enables mixed-mode parallelism using both OpenMP directives and MPI. The parallel performance of the software is demonstrated through experiments on IBM p690 and AMD Linux clusters. These experiments highlight the portability and efficiency of the software on multiprocessors with shared, distributed, and distributed-shared memory architectures.
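The solver structure described, a Krylov iteration that touches the system only through matrix-vector products plus an approximate-inverse preconditioner, can be sketched with a matrix-free preconditioned conjugate-gradient loop. This is a generic PCG sketch, not the package's solenoidal-basis code, and the Jacobi preconditioner in the toy example merely stands in for the Green's-function preconditioner of the thesis:

```python
import math

def pcg(matvec, b, apply_precond, tol=1e-10, max_iter=200):
    """Preconditioned CG: `matvec` and `apply_precond` are black boxes,
    so the (dense) system matrix is never formed explicitly."""
    dot = lambda u, v: sum(a * c for a, c in zip(u, v))
    x = [0.0] * len(b)
    r = list(b)                       # residual for the zero initial guess
    z = apply_precond(r)
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        if math.sqrt(dot(r, r)) < tol:
            break
        z = apply_precond(r)
        rz, rz_old = dot(r, z), rz
        p = [zi + (rz / rz_old) * pi for zi, pi in zip(z, p)]
    return x

# Toy SPD system with a Jacobi (diagonal) preconditioner.
A = [[4.0, 1.0], [1.0, 3.0]]
x = pcg(matvec=lambda v: [sum(a * vi for a, vi in zip(row, v)) for row in A],
        b=[1.0, 2.0],
        apply_precond=lambda r: [r[0] / 4.0, r[1] / 3.0])
```

In the thesis's setting, `matvec` would be a fast hierarchical approximation of the dense product, and `apply_precond` another hierarchical product with the Green's-function matrix, which is what makes each iteration cheap despite the dense operators.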
