Using Task Parallelism for Distributed Parallel Skeleton Programming : Implementing a StarPU Back-End to SkePU 2 / Distribuerade parallellprogrammeringsskelett genom uppgiftsparallellism : Implementation av en StarPU-baserad SkePU 2 backend

Henrik Henriksson, January 2024
We extended the parallel skeleton programming framework SkePU 2 with a new back-end utilizing StarPU, a task programming framework for hybrid and distributed architectures. The aim was to allow SkePU to run on distributed clusters, using MPI through StarPU. The implemented back-end distributes data and work across the participating ranks. While we did not implement the full SkePU API, the Map and Reduce1D skeletons were successfully implemented. During the implementation, we discovered some differences in API design between SkePU and StarPU. We combined the type-safe templates used in the SkePU API with the C-style, void*-heavy API of StarPU, which required the implementation to use more complex templates than would normally be desired. While we could preserve most of the SkePU 2 API when moving to a distributed-memory setting, some parts had to change. In particular, we needed to change the semantics of SkePU 2 containers with regard to iterators and random access. We benchmarked the performance of the implemented back-end against an MPI+OpenMP reference implementation on two problems: n-body and a simple reduction. While the n-body problem demonstrates promising scaling properties, reductions do not scale well to larger numbers of ranks. The comparison against the MPI+OpenMP reference implementation reveals that, aside from the higher communication overhead, there may also be some overhead in the work performed between communications, with the new back-end potentially performing at below 60-70% of the reference. In most cases, the new back-end to SkePU exhibits significantly lower performance than the reference. Extending the implemented solution to cover the full API and improving its performance could provide a high-level interface to distributed programming for application programmers. Indeed, subsequent developments of SkePU 3 extend and improve our StarPU back-end.
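
The tension between type-safe templates and a void*-based C API mentioned in the abstract can be illustrated with a minimal C++ sketch. It is not taken from the thesis and uses neither the SkePU nor the StarPU API: run_task, raw_kernel, MapArgs, map_trampoline and typed_map are hypothetical names, and the kernel signature is only loosely modelled on C-style task runtimes that pass untyped buffers plus an untyped argument. The sketch shows how a templated "trampoline" can recover the element types and the user function on the far side of such an interface.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a C-style task runtime. It only ever sees
// untyped pointers. This is an assumption for illustration, not StarPU.
using raw_kernel = void (*)(void* buffers[], void* arg);

void run_task(raw_kernel kernel, void* buffers[], void* arg) {
    // A real runtime would schedule the task; here it runs immediately.
    kernel(buffers, arg);
}

// Bundle carried through the untyped argument pointer.
template <typename Func>
struct MapArgs {
    Func func;          // user function applied to each element
    std::size_t count;  // number of elements in each buffer
};

// Typed trampoline: one instantiation per (In, Out, Func) combination.
// It casts the untyped pointers back to the types chosen at compile time.
template <typename In, typename Out, typename Func>
void map_trampoline(void* buffers[], void* arg) {
    auto* args = static_cast<MapArgs<Func>*>(arg);
    const In* in = static_cast<const In*>(buffers[0]);
    Out* out = static_cast<Out*>(buffers[1]);
    for (std::size_t i = 0; i < args->count; ++i) {
        out[i] = args->func(in[i]);
    }
}

// Type-safe front end: callers never touch void*.
template <typename Out, typename In, typename Func>
std::vector<Out> typed_map(const std::vector<In>& input, Func func) {
    std::vector<Out> output(input.size());
    MapArgs<Func> args{func, input.size()};
    // The input buffer is only read by the trampoline; the const_cast is
    // needed solely to fit the untyped buffer array.
    void* buffers[] = {const_cast<In*>(input.data()), output.data()};
    run_task(&map_trampoline<In, Out, Func>, buffers, &args);
    return output;
}
```

Under these assumptions, a call such as typed_map<double>(xs, [](int x) { return x * 0.5; }) on a std::vector<int> keeps the user-facing code fully typed, while every cast is confined to the trampoline. The cost is one template instantiation per combination of types, which hints at why templates in such a back-end tend to grow more complex than one would like.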
