381 |
Processamento paralelo na simulação de campos eletromagnéticos pelo método das diferenças finitas no domínio do tempo - FDTD. / Parallel processing in the electromagnetic fields simulation with the finite-difference time-domain method - FDTD. Trevizan, Marcelo Porto, 08 January 2007 (has links)
Research and projects involving electromagnetism are continuously increasing. For both research and design work, computer simulations of the problems involved are a standard resource for investigating how electromagnetic phenomena behave in a given situation. There are cases, however, in which the problem becomes computationally large, demanding more memory and longer processing times because of the geometries involved or the accuracy desired. Parallel computing has been developed to work around these limitations. One possible implementation of a parallel system is over a network of computers and, by employing free software, it can be realized at practically no cost. The present work, using the FDTD method, implements such a parallel system. During development, special attention was given to good programming practices, in order to guarantee the program's flexibility, modularity and extensibility. In addition, a mathematical tool was developed to estimate the total processing time of a parallelized simulation and to indicate parameter adjustments that make this time as short as possible. The code, the parallel system and the mathematical tool are validated with several examples. Finally, a study of a practical application of interest is carried out with the developed tool.
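To make the parallelization scheme concrete, the sketch below shows the kind of domain decomposition such a system performs: a 1-D Yee grid is split across the processes of a networked cluster, and each time step exchanges one boundary field value with each neighbour. It is a minimal illustration in Python with mpi4py, not the thesis's actual code; the grid size, source placement and normalized update coefficients are assumptions.

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
left = rank - 1 if rank > 0 else MPI.PROC_NULL
right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

n = 100                  # cells owned by this process (assumed)
ez = np.zeros(n)         # local slice of the electric field
hy = np.zeros(n)         # local slice of the magnetic field

for step in range(500):
    # receive the right neighbour's first Ez cell to close my last Hy update
    ez_halo = comm.sendrecv(ez[0], dest=left, source=right)
    hy[:-1] += 0.5 * (ez[1:] - ez[:-1])
    if ez_halo is not None:
        hy[-1] += 0.5 * (ez_halo - ez[-1])
    # receive the left neighbour's last Hy cell to close my first Ez update
    hy_halo = comm.sendrecv(hy[-1], dest=right, source=left)
    ez[1:] += 0.5 * (hy[1:] - hy[:-1])
    if hy_halo is not None:
        ez[0] += 0.5 * (hy[0] - hy_halo)
    if rank == 0:
        ez[n // 2] += np.exp(-((step - 30.0) / 10.0) ** 2)  # soft Gaussian source

Run with, for example, mpirun -n 4 python fdtd1d.py. The per-step halo traffic versus per-process computation in this loop is exactly the balance that the thesis's time-estimation tool models when suggesting parameter adjustments.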
|
383 |
DATA MINING: TRACKING SUSPICIOUS LOGGING ACTIVITY USING HADOOP. Sodhi, Bir Apaar Singh, 01 March 2016 (has links)
In this modern, interconnected era, an organization's top priority is to protect itself from the major security breaches that occur frequently within a communication environment. Often, however, they fail to do so. Every week there are new headlines about information being forged, funds being stolen, credit cards being misused, and so on. Hackers turn personal computers into "zombie machines" to steal confidential and financial information without disclosing their true identity. These identity thieves rob private data and defeat the very purpose of privacy. The purpose of this project is to identify suspicious user activity by analyzing a log file, which can later help an investigative agency such as the FBI to track and monitor anonymous users who probe for weaknesses in order to attack vulnerable parts of a system and gain access to it. The project also emphasizes the potential damage that such malicious activity could do to the system. The project uses the Hadoop framework to store and search log files of logging activity, and then runs a MapReduce program to compute and analyze the results.
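As a sketch of the idea, the mapper/reducer pair below (written in the Hadoop Streaming style of reading standard input, in Python rather than the project's actual code) flags failed-login records and counts them per source IP; the sshd-style log pattern and the threshold of five failures are illustrative assumptions.

import re
import sys
from collections import Counter

# assumed sshd-style record; the project's real log format may differ
FAILED = re.compile(r"Failed password .* from (\d{1,3}(?:\.\d{1,3}){3})")

def map_line(line):
    """Mapper: emit an (ip, 1) pair for every failed-login record."""
    m = FAILED.search(line)
    if m:
        yield m.group(1), 1

def reduce_counts(pairs, threshold=5):
    """Reducer: sum failures per IP, keep sources at or above the threshold."""
    totals = Counter()
    for ip, n in pairs:
        totals[ip] += n
    return {ip: n for ip, n in totals.items() if n >= threshold}

if __name__ == "__main__":
    pairs = (kv for line in sys.stdin for kv in map_line(line))
    for ip, n in sorted(reduce_counts(pairs).items()):
        print(f"{ip}\t{n} failed logins")

Under Hadoop Streaming the two functions would run as separate mapper and reducer scripts, with the framework shuffling and sorting the emitted pairs between the phases; here they are chained in-process so the sketch runs directly on a plain log file: python suspicious.py < auth.log.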
|
384 |
High-Level Parallel Programming of Computation-Intensive Algorithms on Fine-Grained Architecture. Cheema, Fahad Islam, January 2009 (has links)
Computation-intensive algorithms require a high level of parallelism and programmability, which makes them good candidates for hardware acceleration using fine-grained processor arrays. With a Hardware Description Language (HDL) it is very difficult to design and manage fine-grained processing units, so a High-Level Language (HLL) is the preferred alternative. This thesis analyzes HLL programming of fine-grained architectures in terms of achieved performance and resource consumption. In a case study, highly computation-intensive algorithms (interpolation kernels) are implemented on a fine-grained architecture (FPGA) using a high-level language (Mitrion-C). The Mitrion Virtual Processor (MVP) is extracted as an application-specific fine-grained processor array, and the Mitrion development environment translates the high-level design to a hardware description (HDL). Performance requirements, parallelism possibilities and limitations, and the resources required for parallelism vary from algorithm to algorithm as well as by hardware platform. By considering parallelism at different levels, the parallelism can be adjusted to the available hardware resources, allowing a better balance of tradeoffs such as gates versus performance and memory versus performance. This thesis proposes design approaches for adjusting parallelism at different design levels. For the interpolation kernels, different parallelism levels and design variants are proposed, which can be mixed to obtain a well-tuned, application- and resource-specific design.
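The Mitrion-C sources themselves are not reproduced here, but the adjustable-parallelism idea can be sketched in ordinary Python: the unroll parameter below plays the role of the number of parallel processing elements instantiated for an interpolation kernel, making the resource-versus-throughput tradeoff an explicit design knob. Everything in the sketch (the linear kernel, the worker-pool realization) is an illustrative stand-in for the FPGA design.

import numpy as np
from concurrent.futures import ProcessPoolExecutor

def interp_block(args):
    """Linearly interpolate the query points of one block (one 'PE')."""
    x, y, queries = args
    return np.interp(queries, x, y)

def parallel_interp(x, y, queries, unroll=4):
    """Split the queries across `unroll` workers and concatenate the results."""
    blocks = np.array_split(queries, unroll)
    with ProcessPoolExecutor(max_workers=unroll) as pool:
        parts = pool.map(interp_block, [(x, y, b) for b in blocks])
    return np.concatenate(list(parts))

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 64)            # sample grid
    y = np.sin(2 * np.pi * x)                # samples of the signal
    q = np.random.rand(1_000_000)            # interpolation points
    print(parallel_interp(x, y, q, unroll=8)[:4])

On an FPGA the same knob decides how many interpolation pipelines are laid out in silicon: a larger unroll consumes more gates but raises throughput, which is the gates-versus-performance tradeoff discussed above.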
|
385 |
A TIME-AND-SPACE PARALLELIZED ALGORITHM FOR THE CABLE EQUATION. Li, Chuan, 01 August 2011 (has links)
Electrical propagation in excitable tissue, such as nerve fibers and heart muscle, is described by a nonlinear diffusion-reaction parabolic partial differential equation for the transmembrane voltage $V(x,t)$, known as the cable equation. This equation involves a highly nonlinear source term, representing the total ionic current across the membrane, governed by a Hodgkin-Huxley type ionic model, which requires the solution of a system of ordinary differential equations. Thus, the model consists of a PDE (in one, two or three dimensions) coupled to a system of ODEs, and it is very expensive to solve, especially in two and three dimensions.
To solve this equation numerically, we develop an algorithm that extends the Parareal algorithm to incorporate space-parallelized solvers efficiently into its framework, achieving time-and-space parallelization. Numerical results and performance comparisons of several serial, space-parallelized and time-and-space-parallelized time-stepping schemes in one and two dimensions are also presented.
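A minimal serial sketch of the Parareal iteration at the heart of this approach is given below, applied to a 1-D cable-type equation V_t = D V_xx + f(V). A cubic reaction term stands in for the Hodgkin-Huxley ionic model (an assumption; the thesis couples the PDE to the full ODE system), the coarse propagator is simply the fine propagator with fewer steps, and the per-window fine solves, which are independent within each iteration, are the part that runs in parallel in time, on top of any space parallelism.

import numpy as np

D, L, nx = 1.0, 10.0, 101
dx = L / (nx - 1)

def f(v):
    # cubic stand-in for the ionic current (not Hodgkin-Huxley)
    return v * (1.0 - v) * (v - 0.1)

def propagate(v, t0, t1, nsteps):
    """Explicit-Euler integration of V_t = D*V_xx + f(V), no-flux ends."""
    dt = (t1 - t0) / nsteps
    for _ in range(nsteps):
        lap = np.zeros_like(v)
        lap[1:-1] = (v[2:] - 2.0 * v[1:-1] + v[:-2]) / dx**2
        lap[0], lap[-1] = lap[1], lap[-2]         # crude Neumann closure
        v = v + dt * (D * lap + f(v))
    return v

def parareal(v0, T, windows, iters, coarse_steps=30, fine_steps=200):
    ts = np.linspace(0.0, T, windows + 1)
    U = [v0]                                      # coarse initial guess
    for n in range(windows):
        U.append(propagate(U[-1], ts[n], ts[n + 1], coarse_steps))
    for _ in range(iters):
        # the fine sweeps are independent per window: this loop is what
        # gets distributed across processors (parallel in time)
        F = [propagate(U[n], ts[n], ts[n + 1], fine_steps) for n in range(windows)]
        G_old = [propagate(U[n], ts[n], ts[n + 1], coarse_steps) for n in range(windows)]
        new = [v0]
        for n in range(windows):
            G_new = propagate(new[-1], ts[n], ts[n + 1], coarse_steps)
            new.append(G_new + F[n] - G_old[n])   # Parareal correction
        U = new
    return U

v0 = np.where(np.linspace(0.0, L, nx) < 1.0, 1.0, 0.0)   # stimulated left end
print(parareal(v0, T=1.0, windows=8, iters=3)[-1].max())

With explicit Euler the coarse step count is kept just inside the diffusion stability limit dt <= dx^2/(2D); an implicit coarse propagator would allow a far cheaper coarse sweep at the cost of a linear solve per step.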
|
386 |
Melt convection in welding and crystal growth. Do-Quang, Minh, January 2004 (has links)
A parallel finite element code with adaptive meshing was developed and used to study three-dimensional, time-dependent fluid flows caused by thermocapillary convection, as well as temperature and dopant distribution, in fusion welding and floating-zone crystal growth. A comprehensive numerical model of the three-dimensional, time-dependent fluid flow in a weld pool was developed, considering most of the physical mechanisms involved in gas tungsten arc welding. The model made it possible to capture the actual chaotic, time-dependent melt flow; the fluid flow in the weld pool was found to be highly complex and to influence the weld pool's depth and width. A physicochemical model was also studied and applied numerically to simulate the effect of surfactant adsorption onto the surface on the surface tension of the liquid metal in the weld pool. Another three-dimensional, time-dependent model, with adaptive mesh refinement and coarsening, was applied to simulate the effect of weak flow on radial segregation in floating-zone crystal growth; the phase-change equation was included in this model in order to capture the real interface shape of the floating zone. In the new parallel code, a scheme that keeps the refinement level of each node and face, instead of the complete history of refinements, was used to facilitate derefinement. The information is thereby local, and the exchange of information between processors during the derefinement process is minimized, improving the efficiency of the parallel adaptive solver.
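The derefinement bookkeeping can be pictured with a small sketch: each cell (and likewise each face or node) carries only its refinement level and parent, and a group of siblings collapses when purely local data permits it, so no refinement history needs to be exchanged between processors. The class layout and the error criterion below are illustrative assumptions, not the thesis's data structures.

from dataclasses import dataclass

@dataclass
class Cell:
    level: int      # refinement level; 0 = initial mesh
    parent: int     # index of the parent cell on the coarser level
    error: float    # local error indicator from the solver

def can_derefine(siblings, tol):
    """Collapse siblings into their parent only if they all sit on the same
    level and every local error indicator is below the tolerance; the test
    uses locally owned data only, so the decision needs no communication."""
    lv = siblings[0].level
    return (all(c.level == lv for c in siblings)
            and all(c.error < tol for c in siblings))

children = [Cell(level=2, parent=7, error=e) for e in (1e-4, 2e-4, 5e-5, 8e-5)]
print(can_derefine(children, tol=1e-3))   # True: the four children may coarsen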
|
387 |
PARALLEL COMPUTING ALGORITHMS FOR TANDEM MASS SPECTROMETRY. April 2013 (has links)
Tandem mass spectrometry, also known as MS/MS, is an analytical technique for measuring the mass-to-charge ratio of charged ions and is widely used in genomics, proteomics and metabolomics. There are two automated approaches to interpreting tandem mass spectra: de novo methods and database-searching methods. Both need massive computational resources and complicated comparison algorithms. The real-time peptide-spectrum matching (RT-PSM) algorithm is a database-searching method that interprets tandem mass spectra under strict time constraints. Restricted by the hardware and architecture of an individual workstation, the RT-PSM algorithm has to sacrifice accuracy in order to deliver the required processing speed. The peptide-spectrum similarity-scoring module is the most time-consuming of the four modules in the RT-PSM algorithm and is also its core.
In this study, a multi-core computing algorithm is developed for individual workstations, and a distributed computing algorithm is designed for a cluster. The improved algorithms achieve the speed requirement of RT-PSM without sacrificing accuracy, and with some extension the distributed computing algorithm can also support other PSM algorithms. Simulation results show that, compared with the original RT-PSM, the parallelized version achieves a 25- to 34-fold speed-up on different individual workstations. A cluster with 240 CPU cores accelerates the similarity-scoring module 210-fold and the whole peptide-identification process 85-fold relative to their single-threaded counterparts.
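The multi-core version of the scoring module can be pictured with the sketch below: candidate spectra are scored against the query in a process pool and the best match is kept. The cosine score and the data layout are placeholders for whichever PSM scoring function is plugged in; none of this is the RT-PSM code itself.

import numpy as np
from functools import partial
from multiprocessing import Pool

def score(spectrum, candidate):
    """Placeholder similarity score: cosine of two binned spectra."""
    den = float(np.linalg.norm(spectrum) * np.linalg.norm(candidate)) or 1.0
    return float(np.dot(spectrum, candidate)) / den

def best_match(spectrum, candidates, workers=8):
    """Score all candidates in parallel; return (index, score) of the best."""
    with Pool(workers) as pool:
        scores = pool.map(partial(score, spectrum), candidates)
    i = int(np.argmax(scores))
    return i, scores[i]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = rng.random(2000)                        # binned query spectrum
    db = [rng.random(2000) for _ in range(10_000)]  # theoretical spectra
    print(best_match(query, db))

The distributed variant replaces the local pool with workers spread over cluster nodes, which is where the reported 210-fold speed-up of the scoring module on 240 cores comes from.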
|
388 |
A model of dynamic compilation for heterogeneous compute platforms. Kerr, Andrew, 10 December 2012 (has links)
Trends in computer engineering place renewed emphasis on increasing parallelism and heterogeneity. The rise of parallelism adds an additional dimension to the challenge of portability, as different processors support different notions of parallelism, whether vector parallelism executing in a few threads on multicore CPUs or large-scale thread hierarchies on GPUs. Thus, software experiences obstacles to portability and efficient execution beyond differences in instruction sets; rather, the underlying execution models of radically different architectures may not be compatible. Dynamic compilation applied to data-parallel heterogeneous architectures presents an abstraction layer decoupling program representations from optimized binaries, thus enabling portability without encumbering performance. This dissertation proposes several techniques that extend dynamic compilation to data-parallel execution models. These contributions include:
- characterization of data-parallel workloads
- machine-independent application metrics
- framework for performance modeling and prediction
- execution model translation for vector processors
- region-based compilation and scheduling
We evaluate these claims via the development of a novel dynamic compilation framework, GPU Ocelot, with which we execute real-world GPU computing workloads. This enables GPU computing workloads to run efficiently on multicore CPUs, GPUs, and a functional simulator. We show that data-parallel workloads exhibit performance scaling, take advantage of vector instruction set extensions, and effectively exploit data locality via scheduling that attempts to maximize control locality.
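As a toy picture of the execution-model translation involved, the sketch below runs a CUDA-style kernel, written as a function of block and thread indices, on a multicore CPU by distributing thread blocks across a process pool and serializing the threads inside each block. Ocelot itself operates on PTX with real compiler transformations; the Python here only illustrates the mapping.

from concurrent.futures import ProcessPoolExecutor

def saxpy_kernel(block, threads_per_block, a, x, y):
    """One thread block of saxpy: each 'thread' computes one output element."""
    out = {}
    for t in range(threads_per_block):        # CUDA threads, serialized
        i = block * threads_per_block + t     # global thread id
        if i < len(x):                        # bounds guard, as in the CUDA idiom
            out[i] = a * x[i] + y[i]
    return out

def launch(kernel, grid, threads_per_block, *args, workers=4):
    """'Kernel launch': thread blocks map onto CPU workers instead of SMs."""
    results = {}
    with ProcessPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(kernel, b, threads_per_block, *args)
                   for b in range(grid)]
        for fut in futures:
            results.update(fut.result())
    return [results[i] for i in sorted(results)]

if __name__ == "__main__":
    x, y = list(range(10)), [1.0] * 10
    print(launch(saxpy_kernel, 3, 4, 2.0, x, y))  # grid of 3 blocks x 4 threads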
|
389 |
Soporte arquitectónico a la sincronización imparcial de lectores y escritores en computadores paralelos / Architectural support for fair reader/writer synchronization in parallel computers. Vallejo Gutiérrez, Enrique, 10 June 2010 (has links)
Technological evolution in microprocessor design has led to parallel systems with multiple execution threads. These systems are more difficult to program and present higher overheads than traditional uniprocessor systems, which may limit their performance and scalability: synchronization, coherence, consistency and the other mechanisms required to guarantee a correct execution. Traditional parallel programming is based on synchronization primitives such as barriers, critical sections and reader/writer locks, which are highly prone to programming errors. Transactional Memory (TM) hides these synchronization problems from the programmer; however, many TM systems still rely on reader/writer locks and would benefit from an efficient implementation of them. This thesis presents new hardware techniques to accelerate the execution of such parallel programs. We propose a hybrid TM system based on reader/writer locks that minimizes software overheads when acceleration hardware is present, while still allowing correct software-only execution. We develop a mechanism to guarantee fairness between hardware and software transactions. We introduce a low-cost distributed mechanism, called the Lock Control Unit, to handle fine-grained reader/writer locks. Finally, we propose an organization of a multiprocessor based on Kilo-Instruction Processors that guarantees Sequential Consistency while allowing speculation in critical sections.
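The fairness problem the hardware attacks can be shown in a short software sketch: a reader/writer lock in which a waiting writer blocks the admission of new readers, so neither side starves the other. The Lock Control Unit provides this style of arbitration in distributed hardware; the Python below is only a behavioural illustration, not the proposed mechanism.

import threading

class FairRWLock:
    def __init__(self):
        self._lock = threading.Lock()
        self._readers_ok = threading.Condition(self._lock)
        self._writers_ok = threading.Condition(self._lock)
        self._readers = 0           # readers currently inside
        self._writer = False        # a writer currently inside
        self._waiting_writers = 0   # queued writers; they block new readers

    def acquire_read(self):
        with self._lock:
            while self._writer or self._waiting_writers:
                self._readers_ok.wait()
            self._readers += 1

    def release_read(self):
        with self._lock:
            self._readers -= 1
            if self._readers == 0:
                self._writers_ok.notify()

    def acquire_write(self):
        with self._lock:
            self._waiting_writers += 1
            while self._writer or self._readers:
                self._writers_ok.wait()
            self._waiting_writers -= 1
            self._writer = True

    def release_write(self):
        with self._lock:
            self._writer = False
            if self._waiting_writers:
                self._writers_ok.notify()
            else:
                self._readers_ok.notify_all()

Usage mirrors any reader/writer lock: readers bracket shared reads with acquire_read()/release_read() and writers use acquire_write()/release_write(); under a continuous stream of readers, an arriving writer is admitted as soon as the readers already inside drain, instead of waiting indefinitely.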
|