Efficient use of hardware resources is vital to achieving good performance in high-performance computing. This thesis explores how predictable the optimal distribution of CPU cores is between two tasks running in parallel on a shared-memory machine, with the goal of minimizing total runtime. The predictions are based on the weight and speedup of each task, taking into account the decrease in CPU frequency that accompanies a growing number of active cores in modern CPUs. The weight of a task is the number of floating-point operations needed to compute it to completion. The Intel oneAPI Math Kernel Library is used to create a set of different tasks, where each task consists of a single call to a dgemm routine. Two prediction algorithms for the optimal core distribution are presented and evaluated. Their predictions are compared to the fastest distribution observed when either running the tasks back-to-back, with each using all available cores, or running the tasks simultaneously in two parallel regions. Experimental results suggest that the method has merit: the better of the two algorithms predicted the core distribution resulting in the fastest run in 14 of 15 cases.
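To make the quantities in the abstract concrete, the following is a minimal sketch of how a task's weight and a candidate core distribution might be evaluated. It is not either of the thesis's two prediction algorithms, which the abstract does not describe. The weight uses the conventional approximation of 2·m·n·k floating-point operations for a dgemm call. Everything else is an assumption made for illustration: the Amdahl-style speedup model, the linear frequency drop in `frequency_ghz`, the `flops_per_cycle` value, and all function names.

```python
def dgemm_weight(m, n, k):
    """Approximate flop count of C = alpha*A*B + beta*C, with A (m x k) and B (k x n)."""
    return 2 * m * n * k


def amdahl_speedup(cores, parallel_fraction):
    """Assumed speedup model (Amdahl's law); the thesis may measure speedup instead."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def frequency_ghz(active_cores, base=3.5, drop_per_core=0.05):
    """Hypothetical linear frequency decrease as more cores become active."""
    return base - drop_per_core * (active_cores - 1)


def runtime(weight, cores, active_cores, parallel_fraction, flops_per_cycle=16):
    """Estimated runtime of one task on `cores` cores while `active_cores` are busy."""
    single_core_rate = frequency_ghz(active_cores) * 1e9 * flops_per_cycle
    return weight / (single_core_rate * amdahl_speedup(cores, parallel_fraction))


def best_distribution(w1, f1, w2, f2, total_cores):
    """Compare back-to-back execution against every simultaneous core split.

    Returns (mode, cores_task1, cores_task2, estimated_runtime).
    Simplification: the simultaneous runtime is the slower task's runtime,
    ignoring that the frequency may rise once the faster task finishes.
    """
    back_to_back = (runtime(w1, total_cores, total_cores, f1)
                    + runtime(w2, total_cores, total_cores, f2))
    best = ("back-to-back", total_cores, total_cores, back_to_back)
    for p in range(1, total_cores):
        q = total_cores - p
        t = max(runtime(w1, p, total_cores, f1),
                runtime(w2, q, total_cores, f2))
        if t < best[3]:
            best = ("simultaneous", p, q, t)
    return best


if __name__ == "__main__":
    w = dgemm_weight(2000, 2000, 2000)
    print(best_distribution(w, 0.5, w, 0.5, total_cores=8))
```

In this sketch both strategies keep all cores active, so the frequency term cancels out of the comparison. It is included only to show where a measured frequency curve would enter if some splits left cores idle.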
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:umu-197539 |
Date | January 2022 |
Creators | Eriksson, Rasmus |
Publisher | Umeå universitet, Institutionen för datavetenskap |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | UMNAD ; 1355 |