1.
Analyzing Symbiosis on SMT Processors Using a Simple Co-scheduling Scheme. Lundmark, Elias; Persson, Chris. January 2017.
Simultaneous Multithreading (SMT) allows for more efficient processor utilization by co-executing multiple threads on a single processing core, increasing system efficiency and throughput. Co-executing threads share the functional units of a processing core, and if they contend for the same functional units, efficiency decreases: this is a scenario where SMT cannot convert thread-level parallelism (TLP) into instruction-level parallelism (ILP). In previous work, de Blanche and Lundqvist propose a simple co-scheduling principle: co-scheduling multiple instances of the same job should be considered a bad co-schedule, since identical instances are likely to use the same resources. In this thesis, we apply their principle to SMT processors, with the rationale that identical threads use the same functional units within a processing core. We demonstrate that disallowing jobs to co-execute with instances of themselves enables SMT to convert TLP into ILP more often, particularly when jobs cannot exploit ILP on their own. Intuitively, we also show that doing the opposite, which slows the conversion to ILP, can alleviate stress on the memory system.
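The principle amounts to a simple admission rule for an SMT-aware scheduler. Below is a minimal sketch of such a rule; the greedy placement, the two-way SMT assumption, and the program names are illustrative assumptions, not the scheduling scheme the thesis actually evaluates.

```python
def co_schedule(jobs, num_cores):
    """Greedily place jobs (given as program names) onto 2-way SMT
    cores, avoiding pairing a job with another instance of the same
    program (de Blanche and Lundqvist's principle applied to SMT)."""
    cores = [[] for _ in range(num_cores)]
    for job in jobs:
        # Prefer a half-filled core running a different program,
        # then an empty core; as a last resort, pair identical jobs
        # rather than leave a hardware thread idle.
        target = next((c for c in cores if len(c) == 1 and c[0] != job), None)
        if target is None:
            target = next((c for c in cores if len(c) == 0), None)
        if target is None:
            target = min(cores, key=len)
        target.append(job)
    return cores

print(co_schedule(["gcc", "gcc", "blender", "blender"], 2))
# -> [['gcc', 'blender'], ['gcc', 'blender']]
```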
2.
Co-scheduling for large-scale applications: memory and resilience / Ordonnancement concurrent d'applications à grande échelle : mémoire et résilience. Pottier, Loïc. 18 September 2018.
This thesis explores co-scheduling problems in the context of large-scale applications, with two main focuses: the memory side, in particular the cache memory, and the resilience side. With the recent advent of many-core architectures such as chip multiprocessors (CMPs), the number of processing units is increasing. In this context, the benefits of co-scheduling techniques have been demonstrated. The main idea behind co-scheduling is to execute applications concurrently rather than in sequence, in order to improve the global throughput of the platform. But sharing resources often generates interference. With a rising number of processing units accessing the same last-level cache, interference among co-scheduled applications becomes critical. In addition, with that increasing number of processors, the probability of a failure increases too. Resilience aspects must be taken into account, especially for co-scheduling, because failure-prone resources might be shared between applications. On the memory side, we focus on interference in the last-level cache; one solution used to reduce this interference is cache partitioning. Through a theoretical model, extensive simulations, and experiments on an existing platform, we demonstrate the usefulness of co-scheduling when our efficient cache-partitioning strategies are deployed. We also investigate the same problem on a real cache-partitioned chip multiprocessor, using the Cache Allocation Technology recently provided by Intel. Still on the memory side, we then study how to model and schedule task graphs on new many-core architectures, such as the Knights Landing architecture. These architectures offer a new level in the memory hierarchy through on-package high-bandwidth memory. Current approaches usually do not take this new memory level into account, yet new scheduling algorithms and data-partitioning schemes are needed to take advantage of this deep memory hierarchy. On the resilience side, we explore the impact of failures on co-scheduling performance. The co-scheduling approach has been demonstrated in a fault-free context, but large-scale computer systems are confronted with frequent failures, and resilience techniques must be employed for large applications to execute efficiently. Indeed, failures may create severe imbalance between applications and significantly degrade performance. We aim at minimizing the expected completion time of a set of co-scheduled applications in a failure-prone context by redistributing processors. We study the complexity of the problem with a theoretical model, design heuristics, and perform a comprehensive set of simulations with a fault simulator, which demonstrates the efficiency of the proposed heuristics.
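Intel's Cache Allocation Technology, which the thesis uses for its real-machine experiments, is exposed on Linux through the resctrl filesystem. Below is a minimal sketch of how two co-scheduled applications could be given disjoint last-level-cache partitions; the group names, capacity bitmasks, and PIDs are illustrative assumptions, and the thesis's actual partitioning strategies are more involved than this fixed split.

```python
import os

RESCTRL = "/sys/fs/resctrl"  # assumes resctrl is mounted and the CPU supports CAT

def partition_llc(group, cbm, pids):
    """Create a resctrl group, restrict it to the L3 ways in `cbm`
    (a contiguous capacity bitmask), and move the given tasks into it."""
    path = os.path.join(RESCTRL, group)
    os.makedirs(path, exist_ok=True)
    # Restrict this group to the given ways on cache domain 0.
    with open(os.path.join(path, "schemata"), "w") as f:
        f.write(f"L3:0={cbm:x}\n")
    # resctrl accepts one PID per write, so write each separately.
    for pid in pids:
        with open(os.path.join(path, "tasks"), "w") as f:
            f.write(str(pid))

# Illustrative split of an 8-way L3: each application gets half of
# the ways, so the two no longer evict each other's cache lines.
partition_llc("app_a", 0x0f, [1234])  # hypothetical PID of application A
partition_llc("app_b", 0xf0, [5678])  # hypothetical PID of application B
```

Writing the schemata entry `L3:0=f` confines the group to the low four ways of cache domain 0; giving the second group the complementary mask removes last-level-cache interference between the two applications.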