Global ETD Search

231	Reducing Inter-Process Communication Overhead in Parallel Sparse Matrix-Matrix Multiplication Ahmed, Salman, Houser, Jennifer, Hoque, Mohammad A., Raju, Rezaul, Pfeiffer, Phil 01 July 2017 Parallel sparse matrix-matrix multiplication algorithms (PSpGEMM) spend most of their running time on inter-process communication. In the case of distributed matrix-matrix multiplications, much of this time is spent on interchanging the partial results that are needed to calculate the final product matrix. This overhead can be reduced with a one-dimensional distributed algorithm for parallel sparse matrix-matrix multiplication that uses a novel accumulation pattern whose cost grows logarithmically with the number of processors (i.e., O(log p), where p is the number of processors). This algorithm's MPI communication overhead and execution time were evaluated on an HPC cluster, using randomly generated sparse matrices with dimensions up to one million by one million. The results showed a reduction of inter-process communication overhead for matrices with larger dimensions compared to another one-dimensional parallel algorithm that takes O(p) time to accumulate the results. communication overhead MPI communication parallel computing performance analysis scalability sparse matrix-matrix multiplication Computing
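The abstract for record 231 does not describe the accumulation pattern in detail. As a rough illustration of why a logarithmic pattern needs fewer sequential steps than a linear one, the sketch below simulates a generic binary-tree reduction next to a one-by-one accumulation. Scalars stand in for partial product matrices, and the function names are hypothetical; this is not the authors' algorithm.

```python
def linear_accumulate(partials):
    """Rank 0 absorbs every other rank's partial result in turn: p - 1 steps."""
    total = partials[0]
    steps = 0
    for value in partials[1:]:
        total += value
        steps += 1
    return total, steps


def tree_accumulate(partials):
    """Pairwise (binary-tree) accumulation: ceil(log2(p)) rounds.

    Within a round, all pairs could exchange data concurrently, so the
    number of rounds is what bounds the sequential communication depth.
    """
    values = list(partials)
    rounds = 0
    stride = 1
    while stride < len(values):
        # Rank r (a multiple of 2 * stride) absorbs the partial of rank r + stride.
        for r in range(0, len(values) - stride, 2 * stride):
            values[r] += values[r + stride]
        stride *= 2
        rounds += 1
    return values[0], rounds


if __name__ == "__main__":
    partials = list(range(1, 9))  # eight simulated ranks
    print(linear_accumulate(partials))
    print(tree_accumulate(partials))
```

With eight simulated ranks, both functions produce the same total, but the tree version finishes in three rounds where the linear version needs seven steps.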
232	Paralelizace sledování paprsku / Parallelization of Ray Tracing Čižek, Martin January 2009 Ray tracing is a widely used technique for realistic rendering of computer scenes. Its major drawback is the time needed to compute the image, so it is usually parallelized. This thesis describes ray tracing and its parallelization in general. It explains how ray tracing can be parallelized and identifies the problems that may occur in the process. The result is a parallel rendering application built on selected ray tracing software, together with measurements of how well the application performs.
233	Accelerator-enabled Communication Middleware for Large-scale Heterogeneous HPC Systems with Modern Interconnects Chu, Ching-Hsiang January 2020 No description available. Computer Engineering Computer Science
234	A C++ based MPI-enabled Tasking Framework to Efficiently Parallelize Fast Multipole Methods for Molecular Dynamics Haensel, David 31 August 2018 Today's supercomputers gain their performance through a rapidly increasing number of cores per node. To tackle the issues arising from those developments, new parallelization approaches guided by modern software engineering are inevitable. The concept of task-based parallelization is a promising candidate to overcome many of those challenges. However, for latency-critical applications, like molecular dynamics, available tasking frameworks introduce considerable overheads. In this work, a lightweight task engine for latency-critical applications is proposed. The main contributions of this thesis are a static data-flow dispatcher, a type-driven priority scheduler and an extension for communication-enabled tasks. The dispatcher allows a user-configurable mapping of algorithmic dependencies in the task engine at compile time. Resolving these dependencies at compile time reduces the run-time overhead. The scheduler enables the prioritized execution of the critical path of an algorithm. The priorities are likewise deduced from the task type at compile time. Furthermore, the task engine supports inter-node communication via message passing. The provided communication interface drastically simplifies the user interface of inter-node communication without introducing additional performance penalties. This is only possible by distinguishing two developer roles: the library developer and the algorithm developer. All proposed components follow a strict guideline to increase the maintainability for library developers and the usability for algorithm developers. To reach this goal, a high level of abstraction and encapsulation is required in the software stack. As a proof of concept, the communication-enabled task engine is used to parallelize the fast multipole method (FMM) for molecular dynamics.
info:eu-repo/classification/ddc/004 ddc:004
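The type-driven priority scheduler in record 234 derives each task's priority from its type at compile time in C++. The sketch below only illustrates the general idea at run time in Python: a fixed table maps task types to priorities, and a heap always hands out the most urgent task first. The type names, the priority table, and the class are hypothetical and do not come from the thesis.

```python
import heapq
import itertools

# Hypothetical task types; a lower number means the task runs earlier.
PRIORITY_BY_TYPE = {"critical": 0, "regular": 1, "background": 2}


class TypePriorityScheduler:
    """Runs tasks in order of the priority assigned to their type."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so equal-priority tasks keep submission order.
        self._counter = itertools.count()

    def submit(self, task_type, fn):
        priority = PRIORITY_BY_TYPE[task_type]
        heapq.heappush(self._heap, (priority, next(self._counter), task_type, fn))

    def run_all(self):
        """Execute every queued task and return the task types in run order."""
        order = []
        while self._heap:
            _, _, task_type, fn = heapq.heappop(self._heap)
            fn()
            order.append(task_type)
        return order


if __name__ == "__main__":
    scheduler = TypePriorityScheduler()
    for kind in ["background", "critical", "regular", "critical"]:
        scheduler.submit(kind, lambda: None)
    print(scheduler.run_all())
```

In a compile-time design the lookup in `PRIORITY_BY_TYPE` would disappear from the running program, which is the overhead reduction the abstract refers to; this sketch keeps it as a dictionary lookup for readability.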
235	Användande av myocardial performance index vid bedömning av vänster och höger kammares systoliska och diastoliska funktion / Assessment of systolic and diastolic function in both ventricles with myocardial performance index Lundqvist, Michelle January 2023 An echocardiographic examination focuses mainly on valve function, cardiac chamber size, and left-sided systolic function. Left-sided diastolic function has begun to gain importance, but it is often perceived as difficult to assess. The right ventricle has a complex anatomy with a trabeculated myocardium and is inaccessibly located in the chest, which makes it harder to assess than the left ventricle. In 1995, an index for assessing the heart's combined systolic and diastolic function was published: the myocardial performance index (MPI). The aim of the study was to investigate whether MPI can be a useful, complementary method for assessing systolic and diastolic function in the right and left ventricles. The study included 33 people aged 21–80. MPI was calculated using pulsed tissue Doppler over one cardiac cycle. MPI was compared with traditional echocardiographic measurements that reflect systolic and diastolic function of the left and right ventricles. Normality, correlation, and agreement analyses were performed. For left ventricular function, a significant correlation was seen between MPI and MAPSE. No or poor agreement was seen between MPI and all traditional measures of systolic function. For right ventricular function, a strongly significant correlation was seen between MPI and both FAC and TAPSE. Less-than-good agreement was seen between right-sided MPI and FAC and TAPSE. For MPI and E/e', no significant correlation was seen in either the left or the right ventricle, and agreement was worse than if the classification had been made purely at random. Based on this study, the usefulness of MPI for assessing left ventricular function is considered low. MPI may be useful for assessing right ventricular systolic function, but not diastolic function. Diastolic function echocardiography right ventricle myocardial performance index systolic function TDI-MPI Cardiac and Cardiovascular Systems Cardiology
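The abstract for record 235 does not spell out how the index is computed. As commonly defined (it is also known as the Tei index), the myocardial performance index is the sum of the isovolumic contraction time and the isovolumic relaxation time divided by the ejection time. The sketch below assumes that definition; the function name, parameter names, and the example intervals are illustrative and are not taken from the study.

```python
def myocardial_performance_index(ivct_ms, ivrt_ms, et_ms):
    """Return (IVCT + IVRT) / ET for time intervals given in milliseconds.

    ivct_ms: isovolumic contraction time
    ivrt_ms: isovolumic relaxation time
    et_ms:   ejection time
    """
    if et_ms <= 0:
        raise ValueError("ejection time must be positive")
    return (ivct_ms + ivrt_ms) / et_ms


if __name__ == "__main__":
    # Illustrative intervals only, not measurements from the study.
    print(myocardial_performance_index(40, 60, 250))
```

Because both isovolumic intervals sit in the numerator, the index rises when either systolic or diastolic timing is prolonged, which is why it is described as a combined measure.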
236	Approximate Bayesian Inference based on Dense Matrices and New Features using INLA Abdul Fattah, Esmail 30 July 2023 The Integrated Nested Laplace Approximations (INLA) method has become a commonly used tool for researchers and practitioners to perform approximate Bayesian inference for various fields of applications. It has become essential to incorporate more complex models and expand the method’s capabilities with more features. In this dissertation, we contribute to the INLA method in different aspects. First, we present a new framework, INLA$^+$, based on dense matrices to perform approximate Bayesian inference. An application of the new approach is fitting disease-mapping models for count data with complex interactions. When the precision matrix is dense, the new approach scales better than the existing INLA method and utilizes the power of multiprocessors on shared and distributed memory architectures in today’s computational resources. Second, we propose an adaptive technique to improve gradient estimation for the convex gradient-based optimization framework in INLA. We propose a simple limited-memory technique for improving the accuracy of the numerical gradient of the marginal posterior of the hyperparameter by exploiting a coordinate transformation of the gradient and the history of previously taken descent directions. Third, we extend the commonly utilized Bayesian spatial model in disease mapping, known as the Besag model, into a non-stationary spatial model. This new model considers variations in spatial dependency among a predetermined number of sub-regions. The model incorporates multiple precision parameters, which enable different intensities of spatial dependence in each sub-region. To avoid overfitting and enhance generalization, we derive a joint penalized complexity prior for these parameters.
These contributions expand the capabilities of the INLA method, improving its scalability, accuracy, and flexibility for a wider range of applications. INLA Non-Stationary Dense Matrices Bayesian Inference Smart Gradient fbesag INLAPLUS MPI OpenMP Spatial Model
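Record 236 mentions estimating a numerical gradient through a coordinate transformation informed by earlier descent directions, without giving the details. The sketch below shows only the general idea under stated assumptions: build an orthonormal basis from previous descent directions (completed with unit axes), take central differences along those basis vectors, and map the directional slopes back to the original coordinates. The function names, step size, and test function are hypothetical; this is not the dissertation's implementation.

```python
import math


def orthonormal_basis(directions, dim):
    """Gram-Schmidt on past descent directions, completed with unit axes."""
    unit_axes = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    basis = []
    for candidate in [list(d) for d in directions] + unit_axes:
        w = list(candidate)
        for b in basis:
            dot = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - dot * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > 1e-10:  # skip candidates already spanned by the basis
            basis.append([wi / norm for wi in w])
        if len(basis) == dim:
            break
    return basis


def gradient_in_basis(f, x, basis, h=1e-4):
    """Central differences along each basis vector, mapped back to x-coordinates."""
    grad = [0.0] * len(x)
    for b in basis:
        forward = f([xi + h * bi for xi, bi in zip(x, b)])
        backward = f([xi - h * bi for xi, bi in zip(x, b)])
        slope = (forward - backward) / (2.0 * h)
        grad = [gi + slope * bi for gi, bi in zip(grad, b)]
    return grad


if __name__ == "__main__":
    f = lambda x: x[0] ** 2 + 3.0 * x[1]
    basis = orthonormal_basis([[1.0, 1.0]], dim=2)  # one remembered direction
    print(gradient_in_basis(f, [1.0, 2.0], basis))
```

Because the basis is orthonormal, summing each directional slope times its basis vector recovers the gradient in the original coordinates; the choice of basis changes where the finite-difference error falls, not the quantity being estimated.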
237	A Study of Improving the Parallel Performance of VASP. Baker, Matthew Brandon 13 August 2010 This thesis involves a case study in the use of parallelism to improve the performance of an application for computational research on molecules. The application, VASP, was migrated from a machine with 4 nodes and 16 single-threaded processors to a machine with 60 nodes and 120 dual-threaded processors. When initially migrated, VASP's performance deteriorated after about 17 processing elements (PEs), due to network contention. Subsequent modifications that restrict communication amongst VASP processes, together with additional support for threading, allowed VASP to scale up to 112 PEs, the maximum number that was tested. Other performance-enhancing optimizations that were attempted included replacing old libraries, which produced improvements of about 10%, and prefetching, which degraded, rather than enhanced, VASP performance. cluster parallel performance openmp mpi HPC VASP Computer Sciences Physical Sciences and Mathematics Systems Architecture
238	Effective Data Redistribution and Load Balancing for Sort-Last Volume Rendering Using a Group Hierarchy / Effektiv datadistribution och belastningsutjämning för sort-last volumetrisk rendering med hjälp av en grupphierarki Walldén, Marcus January 2018 Volumetric rendering is used to visualize volume data from, for example, scientific simulations. Many advanced applications use large gigabyte- or terabyte-sized data sets, which typically means that multiple compute nodes need to take part in the rendering process to achieve interactive frame rates. Load balancing is generally used to optimize the rendering performance. In existing load balancing techniques, nodes generally only render directly connected data and handle load balancing based on data locality in kd-trees. This approach can result in redundant data transfers and unbalanced data distribution, which affect the frame rate and increase the hardware requirements of all nodes. In this thesis we present a novel load balancing technique for sort-last volume rendering which utilizes a group hierarchy. The technique allows nodes to render data from arbitrary positions in the volume, without inducing a costly image compositing stage. The technique is compared to a static load balancing technique as well as a dynamic kd-tree based load balancing technique. Our testing demonstrated that the presented technique performed better than or equal to the kd-tree based technique while also lowering the worst-case memory usage complexity of all nodes. Utilizing a group hierarchy effectively helped to lower the compositing time of the presented technique. volume rendering cuda load balancing sort-last mpi Computer Sciences Datavetenskap (datalogi)
239	Desarrollo y verificación de una plataforma multifísica de altas prestaciones para análisis de seguridad en ingeniería nuclear / Development and Verification of a High-Performance Multiphysics Platform for Safety Analysis in Nuclear Engineering Abarca Giménez, Agustín 02 October 2017 In recent years, in parallel with advances in computer technology, a multitude of computer tools have been developed through which it is possible to obtain a detailed description of the phenomena occurring in the core of nuclear reactors. The final objective of these new tools is to perform safety analysis using best estimate techniques. The best estimate techniques, as opposed to the conservative ones, allow the operation of the reactor with narrower safety margins, and thus greater core economy. In this context, this work develops a multiphysics computer platform that integrates simulation codes covering most of the physics that take place in nuclear reactors. For the integration of the different feedback phenomena between thermal-hydraulics, neutronics and heat transfer, a series of couplings have been developed between the codes that compose the platform. All the developments carried out are intended to realistically represent the design and behavior of the nuclear facility, including the control system, fuel elements and fuel rods. The computer platform includes some of the state-of-the-art codes for reactor analysis. The thermal-hydraulics is covered with a coupled code developed in this work, consisting of the semi-implicit coupling between the TRACE system code and the subchannel code COBRA-TF (CTF), whose parallel version has also been created in this work. For transients where three-dimensional neutron calculations are necessary, the explicit coupling between the three-dimensional PARCS core simulator and the subchannel code CTF has been developed. For the analysis of the integrity of the fuel rods, the FRAPCON and FRAPTRAN codes are used, coupling the latter explicitly with CTF. All the developed tools have been included in the same computer platform, which encompasses them and coordinates the simulations under the user's guidelines. The platform has enough flexibility to perform safety studies in a multitude of operational or accidental scenarios, and it is hoped that in the future it may be used for supporting license calculations. The developed tools have been verified through a series of practical applications in different transient and accidental scenarios in light water reactors. The results obtained have been compared with actual plant measurements and with the results of other simulation codes, showing adequate predictive capacity. The work carried out in this doctoral thesis is part of the research line financed by the Ministerio de Economía y Competitividad in the NUC-MULTPHYS project (ENE2012-34585) and the interdisciplinary collaboration projects of the Universitat Politècnica de València COBRA_PAR (PAID-05-11-2810) and Open-NUC (PAID-05-12). / Abarca Giménez, A. (2017). Desarrollo y verificación de una plataforma multifísica de altas prestaciones para análisis de seguridad en ingeniería nuclear [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/88399 Multiphysics platform Nuclear safety Coupling codes COBRA-TF Trace Parcs Parallelization MPI Nuclear engineering
240	Emerging Paradigms in the Convergence of Cloud and High-Performance Computing Araújo De Medeiros, Daniel January 2023 Traditional HPC scientific workloads are tightly coupled, while emerging scientific workflows exhibit even more complex patterns, consisting of multiple characteristically different stages that may be IO-intensive, compute-intensive, or memory-intensive. New high-performance computer systems are evolving to adapt to these new requirements and are motivated by the need for performance and efficiency in resource usage. On the other hand, cloud workloads are loosely coupled, and their systems have matured technologies under different constraints from HPC. In this thesis, the use of cloud technologies designed for loosely coupled dynamic and elastic workloads is explored, repurposed, and examined in the landscape of HPC in three major parts. The first part deals with the deployment of HPC workloads in cloud-native environments through the use of containers and analyses the feasibility and trade-offs of elastic scaling. The second part relates to the use of workflow management systems in HPC workflows; in particular, a molecular docking workflow executed through Airflow is discussed. Finally, object storage systems, a cost-effective and scalable solution widely used in the cloud, and their usage in HPC applications through MPI I/O, including an interface between the S3 standard and MPI I/O, are discussed in the third part of this thesis. High-performance computing Kubernetes airflow elastic scaling MPI S3 Computer Sciences Datavetenskap (datalogi)
