131 |
EN JÄMFÖRELSE AV BERÄKNINGSNODER AVSEENDE ENERGIEFFEKTIVITET OCH FÖRMÅGAN ATT BERÄKNA FLYTTALSOPERATIONER I ETT MICROSOFT HPC-KLUSTER / A Comparison of Compute Nodes with Respect to Energy Efficiency and the Ability to Compute Floating-Point Operations in a Microsoft HPC Cluster
Kronlund, Marcus January 2012
Compute clusters are used, for example, for weather simulations or product simulation. A Microsoft HPC cluster provides two types of compute node: the Compute node, which runs Windows Server 2008 R2, and the Workstation node, which runs Windows 7. The aim of this work is to compare the operating systems Windows 7 and Windows Server 2008 R2 to see whether they perform similarly as compute nodes. This is assessed with respect to energy efficiency and performance in Linpack. Linpack is a benchmarking tool that measures a compute cluster's computing capacity in floating-point operations per second. The study is carried out using an experimental method. No studies exist showing that Windows 7 and Windows Server 2008 R2 perform similarly. This motivates verifying the hypothesis that they should perform similarly in compute clusters. Because both operating systems are built on Windows NT 6.1, they should perform similarly (Microsoft msdn, 2012). Studies by Narayan and Shi (2009, 2010) show that operating systems perform differently with the TCP and UDP protocols. They also show that operating systems perform differently at the application layer. Another study, by Abouelhoda and Mohamed (2009), shows that the choice of operating system affects the results of their test tool, WinBioinfTools. The tool was evaluated on Linux clusters and Microsoft HPC clusters. Sottile and Minnich (2004) show in their study that computing capacity is affected by the operating system. The contribution of this study is that administrators can use the results as a basis when justifying the choice of compute node type to the management of an organization or company. The results show that the operating systems perform roughly equally after certain processes have been shut down in Windows 7. The processes that were shut down run only on Windows 7, not on Windows Server 2008 R2.
One conclusion drawn is that the running processes affect the results. The processes should therefore be shut down if they are not necessary for the company or organization. Shutting down unnecessary processes increases the energy efficiency and performance of the compute cluster, which reduces its contribution to global warming because the energy goes to computing the tasks rather than to unnecessary processes.
|
132 |
Serverklustring / Server Clustering
Fendell, Robert, Nordström, Philemond January 2014
Clustering means that several servers work together and can thereby handle a task that a single server could not. Clustering can also be used to secure operations by having one or more servers standing by in case the active server providing the service goes down. This thesis work began by investigating which cluster solutions were available. Interviews were then held with companies that used different types of clustering. Lab experiments were carried out to further examine some of the software used by the interviewed companies. The variety of solutions among the interviewed companies turned out to be smaller than expected before the work began. The literature study and the interview material were then used to give recommendations on which solutions example companies should choose, based on their requirements and criteria.
|
133 |
Instalace a konfigurace Octave výpočetního clusteru / Installation and configuration of Octave computation cluster
Mikulka, Zdeněk January 2014
This diploma thesis contains a detailed design of a high-performance cluster, primarily focused on parallel computing in the Octave application. Each component of this cluster is described along with instructions for installation and configuration. The cluster is based on the GNU/Linux operating system and the Message Passing Interface. The design allows this cluster to be implemented on computers in a classroom with active lessons.
|
134 |
Espaces grossiers pour les méthodes de décomposition de domaine avec conditions d'interface optimisées / Coarse spaces for domain decomposition method with optimized transmission conditions
Haferssas, Ryadh Mohamed 23 November 2016
The objective of this thesis is to design, analyze, and implement an efficient domain decomposition method for problems in solid and fluid mechanics. To this end, optimized Schwarz methods (OSM) are considered and revisited. Optimized Schwarz methods were introduced by P.L. Lions. They improve on the classical Schwarz methods by replacing the Dirichlet interface conditions with Robin interface conditions, and they can be applied to both overlapping and non-overlapping subdomains. Robin conditions provide a lever for moving toward optimality of the Schwarz methods and for designing a domain decomposition method that is robust for complex mechanical problems of an almost incompressible nature.
In this thesis, a new theoretical framework is introduced that equips optimized Schwarz methods (e.g. Lions' algorithm) with a theory similar to the one that already exists for additive Schwarz methods. We define an adaptive coarse space for which the convergence rate of the two-level method can be prescribed independently of any heterogeneities in the problem. A formulation of the two-level method as a preconditioner is then proposed, which allows the parallel simulation of a broad spectrum of mechanical problems, such as nearly incompressible elasticity, the incompressible Stokes problem, and the unsteady Navier-Stokes problem. Numerical results from large-scale parallel simulations on several thousand processors are presented. They show the effectiveness and robustness of the proposed approach.
|
135 |
Simulace šíření ultrazvuku v kostech / Simulation of Ultrasound Propagation in Bones
Kadlubiak, Kristián January 2017
It is estimated that a mind-boggling 14.1 million new cases of cancer occurred worldwide in 2012 alone. This number is alarming. Although a healthy lifestyle may reduce the risk of developing cancer, there is always some probability that cancer will develop even in an absolutely fit individual. There are two main conditions for successful treatment of cancer. Firstly, early diagnosis is absolutely crucial. Secondly, there is a need for suitable surgical methods for removing affected tissue. Ultrasound has great potential to be used for both purposes as a non-invasive method. Photoacoustic spectroscopy is an imaging method for tumor detection that makes use of ultrasound, while High-Intensity Focused Ultrasound (HIFU) is a non-invasive surgical method. These methods would be impossible without precise ultrasound propagation simulations. k-Wave is an open source MATLAB toolbox implementing such simulations. So why are these methods not already deployed in treatment? Unfortunately, the simulation of ultrasound propagation is a very time-consuming task, which makes it ineffective for medical purposes. However, there are a few options for accelerating these simulations. The use of GPUs is a very promising one. The main topic of this thesis is the acceleration of the simulation of sound wave propagation in bones and hard tissue. The implementation developed as a part of this thesis was benchmarked on various supercomputers, including Anselm in Ostrava and Piz Daint in Lugano. The implemented solution provides remarkable acceleration compared to the original MATLAB prototype. It was able to accelerate the simulation around 160 times in the best case. This means that a simulation which would otherwise last for 6.5 days can now be computed in one hour. This acceleration was achieved using an NVIDIA Tesla P100 to run the simulation with a domain size of 416x416x416 grid points.
The thesis includes performance benchmarks on different GPUs to give a comprehensive picture of the acceleration capabilities of the developed implementation, and it discusses memory usage and numerical accuracy. Thanks to the implemented solution harnessing the power of modern GPUs, doctors and researchers all around the world have a powerful tool in their hands.
|
136 |
Efektivní komunikace v multi-GPU systémech / Efficient Communication in Multi-GPU Systems
Špeťko, Matej January 2018
After the introduction of CUDA by Nvidia, GPUs became devices capable of accelerating any general-purpose computation. GPUs are designed as parallel processors which possess huge computational power. Modern supercomputers are often equipped with GPU accelerators. Sometimes the performance or the memory capacity of a single GPU is not enough for a scientific application, and the application needs to be scaled across multiple GPUs. During the computation, the GPUs need to exchange partial results. This communication represents computation overhead. For this reason, it is important to research methods of effective communication between GPUs. This means less CPU involvement, lower latency, and shared system buffers. Both inter-node and intra-node communication are examined. The main focus is on GPUDirect technologies from Nvidia and CUDA-Aware MPI. Subsequently, the k-Wave toolbox for simulating the propagation of acoustic waves is introduced. This application is accelerated by using CUDA-Aware MPI.
|
137 |
Neblokující vstup/výstup pro projekt k-Wave / Non-Blocking Input/Output for the k-Wave Toolbox
Kondula, Václav January 2020
This thesis deals with the implementation of a non-blocking I/O interface for the k-Wave project, which is designed for time-domain simulation of ultrasound propagation. The main focus is on large-domain simulations that, due to high computing power requirements, must run on supercomputers and produce tens of GB of data in a single simulation step. In this thesis, I have designed and implemented a non-blocking interface for storing data using dedicated threads, which allows simulation calculations to overlap with disk operations in order to speed up the simulation. An acceleration of up to 33% was achieved compared to the current implementation of the k-Wave project, which, among other things, also reduces the cost of the simulation.
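The idea of overlapping computation with disk writes through a dedicated thread can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `run_simulation`, the queue size, and the stand-in "simulation step" are all assumptions chosen for illustration, using only Python's standard `threading` and `queue` modules.

```python
import queue
import threading


def run_simulation(n_steps, path):
    """Compute steps on the main thread while a dedicated thread writes results.

    Hypothetical example: the 'simulation' is just a running sum.
    """
    # Bounded queue: limits how much unwritten data can pile up in memory.
    buffers = queue.Queue(maxsize=4)

    def writer():
        # Dedicated I/O thread: drains the queue and writes to disk.
        with open(path, "w") as f:
            while True:
                item = buffers.get()
                if item is None:  # sentinel: no more data will arrive
                    break
                f.write(item)

    t = threading.Thread(target=writer)
    t.start()

    state = 0
    for step in range(n_steps):
        state += step  # stand-in for one simulation step
        # Hand the result off; this blocks only when the queue is full.
        buffers.put(f"{step} {state}\n")

    buffers.put(None)  # tell the writer to finish
    t.join()           # wait until all data is on disk
    return state
```

The main loop continues with the next step as soon as the result has been queued, so the write cost is hidden as long as the writer keeps up. The bounded queue is one possible design choice for capping memory use when it does not.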
|
138 |
Event Sequence Identification and Deep Learning Classification for Anomaly Detection and Predication on High-Performance Computing Systems
Li, Zongze 12 1900
High-performance computing (HPC) systems continue to grow in both scale and complexity. These large-scale, heterogeneous systems generate tens of millions of log messages every day. Effective log analysis for understanding system behaviors and identifying system anomalies and failures is highly challenging. Existing log analysis approaches use line-by-line message processing. They are not effective for discovering subtle behavior patterns and their transitions, and thus may overlook some critical anomalies. In this dissertation research, I propose a system log event block detection (SLEBD) method which can accurately and automatically extract the log messages that belong to a component or system event into an event block (EB). At the event level, we can discover new event patterns, the evolution of system behavior, and the interaction among different system components. To find critical event sequences, existing sequence mining methods are mostly based on the Apriori algorithm, which is compute-intensive and runs for a long time. I develop a novel, topology-aware sequence mining (TSM) algorithm which efficiently generates sequence patterns from the extracted event block lists. I also train a long short-term memory (LSTM) model to cluster sequences before specific events. With the generated sequence patterns and the trained LSTM model, we can predict whether an event is going to occur normally or not. To accelerate such predictions, I propose a design flow by which we can convert recurrent neural network (RNN) designs into register-transfer level (RTL) implementations which are deployed on FPGAs. Owing to its high parallelism and low power consumption, the FPGA achieves a greater speedup and better energy efficiency than the CPU and GPU according to our experimental results.
|
140 |
Akcelerace částicových rojů PSO pomocí GPU / Particle Swarm Optimization on GPUs
Záň, Drahoslav January 2013
This thesis deals with PSO (Particle Swarm Optimization), a population-based stochastic optimization technique, and its acceleration. This simple but very effective technique is designed for solving difficult multidimensional problems in a wide range of applications. The aim of this work is to develop a parallel implementation of this algorithm with an emphasis on finding a solution faster. For this purpose, a graphics card (GPU) providing massive performance was chosen. To evaluate the benefits of the proposed implementation, CPU and GPU implementations were created for solving a problem derived from the well-known NP-hard Knapsack problem. The GPU application shows an average speedup of 5 times and a maximum speedup of almost 10 times compared to the optimized CPU application on which it is based.
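For readers unfamiliar with PSO, the basic algorithm can be sketched in a few lines. This is a plain sequential, global-best variant on a continuous test function, written under assumptions of my own: the function names `pso` and `sphere`, the parameter values, and the test problem are illustrative and are not taken from the thesis, which targets a Knapsack-derived problem on a GPU.

```python
import random


def pso(objective, dim, bounds, n_particles=30, n_iters=200, seed=0):
    """Minimize `objective` over a box with basic global-best PSO."""
    rng = random.Random(seed)  # seeded for reproducible runs
    lo, hi = bounds
    # Commonly used inertia and acceleration coefficients (an assumption here).
    w, c1, c2 = 0.7298, 1.49618, 1.49618

    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]             # each particle's best position
    best_val = [objective(p) for p in pos]     # and the value found there
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward own best + pull toward swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val


def sphere(x):
    """Simple test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)
```

A call such as `pso(sphere, dim=3, bounds=(-5.0, 5.0))` returns the best position found and its objective value. The per-particle velocity and position updates are the kind of work a parallel implementation would typically spread across GPU threads.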
|