Global ETD Search

51	Du prototypage à l’exploitation d’overlays FPGA / From prototyping to exploitation of FPGA overlays Bollengier, Théotime 15 January 2018 (has links) Owing to their reconfigurability and the performance they offer, FPGAs are good candidates for accelerating applications in the Cloud. However, FPGAs have some characteristics that hinder their use in the Cloud and their adoption by customers. First, FPGAs are programmed at a low level, which requires expertise that typical Cloud customers do not necessarily have. Second, FPGAs have no native mechanisms that let them fit easily into the dynamic management model of a Cloud infrastructure. In this work, we propose to use overlay architectures to facilitate the adoption, integration, and operation of FPGAs in the Cloud. Overlays are reconfigurable architectures that are themselves implemented on an FPGA. As hardware abstraction layers placed between the FPGA and the applications, overlays raise the abstraction level of the execution model presented to applications and users, and make it possible to implement mechanisms that ease their integration and operation in a Cloud infrastructure. This work presents a vertical approach addressing all aspects of deploying overlays in the Cloud as accelerators reconfigurable by tenants: from the design and implementation of overlays, through their integration on commercial FPGA platforms and the set-up of their operating mechanisms, to the development of their programming tools. The resulting environment is complete, modular, and extensible; it relies in part on several existing tools and demonstrates the feasibility of our approach. Reconfigurable architecture; FPGA overlay; Hardware virtualization; Bitstream compatibility; Hardware task migration; Hardware task scheduling
52	Mapping Concurrent Applications to Multiprocessor Systems with Multithreaded Processors and Network on Chip-Based Interconnections Pop, Ruxandra January 2011 (has links) Network on Chip (NoC) architectures provide scalable platforms for designing Systems on Chip (SoC) with a large number of cores. Developing products and applications using an NoC architecture offers many challenges and opportunities. A tool that can map an application or a set of applications to a given NoC architecture will be essential. In this thesis we first survey current techniques and then present our proposals for mapping and scheduling concurrent applications to NoCs with multithreaded processors as computational resources. NoC platforms are basically a special class of Multiprocessor Embedded Systems (MPES). Conventional MPES architectures are mostly bus-based and are thus exposed to potential difficulties regarding scalability and reusability. There has been a lot of research on MPES development, including work on mapping and scheduling of applications. Many of these results can also be applied to NoC platforms. Mapping and scheduling are known to be computationally hard problems. A large range of exact and approximate optimization algorithms have been proposed for solving these problems. The methods include Branch-and-Bound (BB), constructive and transformative heuristics such as List Scheduling (LS), Genetic Algorithms (GA), and various types of Mathematical Programming algorithms. Concurrent applications are able to capture a typical embedded system, which is multifunctional. Concurrent applications can be executed on an NoC, which provides large computational power with multiple on-chip computational resources. 
Improving the time performance of concurrent applications running on Network on Chip (NoC) architectures depends mainly on the ability of mapping and scheduling methodologies to exploit the Thread Level Parallelism (TLP) of concurrent applications through the available NoC parallelism. Matching the architectural parallelism to the application concurrency to obtain good performance-cost tradeoffs is another aspect of the problem. Multithreading is a technique for hiding long memory access latencies through the overlapped execution of several threads. Recently, Multi-Threaded Processors (MTPs) have been designed that provide the architectural infrastructure to execute multiple threads concurrently at the hardware level, which usually results in a very low context-switching overhead. Simultaneous Multi-Threaded Processors (SMTPs) are superscalar processor architectures that adaptively exploit the coarse-grain and fine-grain parallelism of applications by simultaneously executing instructions from several thread contexts. In this thesis we make a case for using SMTPs and MTPs as NoC resources and show that such a multiprocessor architecture provides better time performance than an NoC with solely General-purpose Processors (GP). We have developed a methodology for task mapping and scheduling to an NoC with mixed SMTP, MTP, and GP resources, which aims to maximize the time performance of concurrent applications and to satisfy their soft deadlines. The developed methodology was evaluated on many configurations of NoC-based platforms with SMTP, MTP, and GP resources. The experimental results demonstrate that the use of SMTPs and MTPs in NoC platforms can significantly speed up applications. Network on Chip; Multiprocessor Embedded Systems; Task Mapping; Task Scheduling; Multithreading; Simultaneous Multithreading; Response Time Estimation; Genetic Algorithms; List Scheduling; Soft Deadline; Task Graphs; Engineering and Technology
53	Realisierung einer Schedulingumgebung für gemischt-parallele Anwendungen und Optimierung von layer-basierten Schedulingalgorithmen / Implementation of a scheduling environment for mixed-parallel applications and optimization of layer-based scheduling algorithms Kunis, Raphael 20 January 2011 (has links) One challenge of parallel processing is achieving scalability of large parallel applications across different parallel systems. The central problem is that an application may run very well on one parallel system, whereas porting it to another system usually leads to poor results. Using the programming model of parallel tasks with dependencies can significantly improve scalability for many parallel algorithms. Programming with parallel tasks leads to task graphs with dependencies that represent a parallel application, which is also called a mixed-parallel application. The basis for executing a mixed-parallel application efficiently is a suitable schedule that prescribes an efficient mapping of the parallel tasks onto the processors of the parallel system. Scheduling algorithms are used to compute such a schedule. A central problem in determining a schedule for mixed-parallel applications is that scheduling is NP-hard even for single-processor tasks with dependencies and a parallel system with two processors. Therefore, only approximation algorithms and heuristics exist for computing a schedule. One way to compute a schedule is to use layer-based scheduling algorithms. These algorithms first form layers of independent parallel tasks and compute the schedule for each layer separately. A weakness of these scheduling algorithms is the assembly of the individual schedules into the global schedule. The presented Move-blocks algorithm offers an elegant way to improve this assembly. 
It does so by merging the schedules of consecutive layers. Although a large number of scheduling algorithms for mixed-parallel applications exist, there has so far been no comprehensive support for scheduling through programming tools. In particular, there is no scheduling environment that combines a large number of scheduling algorithms. The presentation of the flexible, component-based, and extensible scheduling environment SEParAT is the second focus of this dissertation. SEParAT supports various usage scenarios that go far beyond pure scheduling, e.g., comparing scheduling algorithms and extending and implementing new scheduling algorithms. In addition to the usage scenarios, both the internal processing of a scheduling run and the component-based software architecture are presented in detail. info:eu-repo/classification/ddc/004 ddc:004
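The layer-forming step that layer-based scheduling algorithms start from can be sketched roughly as follows. This is only an illustration of grouping a task graph into layers of mutually independent tasks; it is not the Move-blocks algorithm and not part of SEParAT, and the function name `build_layers` and the example graph are hypothetical.

```python
def build_layers(tasks, deps):
    """Partition a task graph into layers of mutually independent tasks.

    tasks: iterable of task names (hypothetical example data).
    deps:  list of (before, after) pairs meaning 'after' depends on 'before'.
    """
    # Record the direct predecessors of every task.
    preds = {t: set() for t in tasks}
    for before, after in deps:
        preds[after].add(before)

    layers = []
    done = set()
    remaining = set(preds)
    while remaining:
        # A task joins the current layer once all its predecessors
        # were placed in earlier layers.
        layer = sorted(t for t in remaining if preds[t] <= done)
        if not layer:
            raise ValueError("dependency cycle detected")
        layers.append(layer)
        done.update(layer)
        remaining.difference_update(layer)
    return layers


example_layers = build_layers(
    ["A", "B", "C", "D", "E"],
    [("A", "C"), ("B", "C"), ("C", "D"), ("A", "E")],
)
```

Each layer could then be scheduled on its own; how the per-layer schedules are merged into a global schedule is exactly the step the abstract says Move-blocks improves, and it is not modeled here.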
54	Optimization and Scheduling on Heterogeneous CPU/FPGA Architecture with Communication Delays / Optimisation et ordonnancement sur une architecture hétérogène CPU/FPGA avec délais de communication Abdallah, Fadel 21 December 2017 (has links) The field of embedded systems has grown considerably in recent years with the development of increasingly computation-intensive applications, whose performance requirements traditional processor-based architectures (single- or multi-core) cannot always meet. While multiprocessor or multi-core architectures are now widespread, it is often necessary to complement them with dedicated processing circuits, based in particular on reconfigurable circuits, to meet specific needs and strong constraints, especially when real-time processing is required. This work studies scheduling problems in reconfigurable heterogeneous architectures based on general-purpose processors (CPUs) and programmable circuits (FPGAs). The main objective is to run an application, presented in the form of a precedence graph (Data Flow Graph, DFG), on a heterogeneous CPU/FPGA architecture so as to minimize the total execution time, or makespan (Cmax). In this thesis, we consider two case studies: a scheduling case that takes into account the intercommunication delays and in which the FPGA can execute a single task at a time, and another case that takes into account parallelism in the FPGA, which can execute several tasks in parallel while respecting the area constraint. 
For the first case, we propose two new optimization approaches based on genetic algorithms, GAA (Genetic Algorithm Approach) and MGAA (Modified Genetic Algorithm Approach). We also compare these algorithms with a Branch & Bound method. Unlike the proposed Branch & Bound method and the other exact methods found in the literature, GAA and MGAA offer a very good compromise between the quality of the solutions obtained (makespan criterion) and the computation time required to solve large-scale problems. For the second case, we first proposed and implemented a method based on genetic algorithms to solve the temporal partitioning problem in an FPGA using dynamic reconfiguration. This method provides good solutions in a reasonable running time. We then improved our MGAA approach to obtain a new approach called MGA (Multithreaded Genetic Algorithm), which provides solutions to the partitioning problem. In addition, we proposed an algorithm based on simulated annealing, called MSA (Multithreaded Simulated Annealing). These two approaches, both based on metaheuristics, provide approximate solutions to the scheduling and partitioning problems on a heterogeneous computing system within a reasonable time. Genetic algorithm; Task scheduling problem; Combinatorial optimization; Metaheuristics; Makespan (schedule length); Linear programming; Heterogeneous system; MPSoC; Parallelism; Embedded reconfigurable systems; 006.22
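To make the makespan criterion (Cmax) of the first case study concrete, the sketch below evaluates one candidate schedule on a CPU/FPGA pair where each unit runs a single task at a time. It is a simplified illustration under stated assumptions, not the GAA, MGAA, or Branch & Bound method of the thesis: the function name `makespan`, the uniform `comm_delay`, and all task data are hypothetical. A metaheuristic could use an evaluator of this kind as its fitness function.

```python
def makespan(order, assign, duration, preds, comm_delay):
    """Return Cmax for one candidate schedule.

    order:      tasks listed in an order that respects precedence.
    assign:     task -> "CPU" or "FPGA".
    duration:   (task, unit) -> execution time on that unit.
    preds:      task -> list of predecessor tasks.
    comm_delay: delay paid when a predecessor ran on the other unit
                (assumed uniform here for simplicity).
    """
    unit_free = {"CPU": 0, "FPGA": 0}  # time at which each unit becomes idle
    finish = {}
    for task in order:
        unit = assign[task]
        ready = 0
        for p in preds.get(task, []):
            delay = comm_delay if assign[p] != unit else 0
            ready = max(ready, finish[p] + delay)
        # The task starts once its inputs have arrived and its unit is idle.
        start = max(ready, unit_free[unit])
        finish[task] = start + duration[(task, unit)]
        unit_free[unit] = finish[task]
    return max(finish.values())


# Hypothetical three-task graph: t1 -> t3 and t2 -> t3.
duration = {
    ("t1", "CPU"): 4, ("t1", "FPGA"): 2,
    ("t2", "CPU"): 5, ("t2", "FPGA"): 2,
    ("t3", "CPU"): 3, ("t3", "FPGA"): 1,
}
preds = {"t3": ["t1", "t2"]}
order = ["t1", "t2", "t3"]

mixed = makespan(order, {"t1": "CPU", "t2": "FPGA", "t3": "CPU"},
                 duration, preds, comm_delay=1)
all_cpu = makespan(order, {"t1": "CPU", "t2": "CPU", "t3": "CPU"},
                   duration, preds, comm_delay=1)
```

Comparing `mixed` with `all_cpu` shows how offloading one task to the FPGA can shorten the schedule even after paying the communication delay.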
55	Projeto e avaliação de um broker como agente de intermediação e QoS em uma nuvem computacional híbrida / Design and evaluation of a broker as QoS and intermediation agent in hybrid cloud computing Pardo, Mario Henrique de Souza 16 June 2016 (has links) This doctoral thesis proposes a cloud broker architecture for hybrid cloud computing environments. A cloud broker mediates between clients and providers, receiving client requests and forwarding them to the provider service that best fits the requested quality of service (QoS) requirements. The proposed QoS-aware service broker architecture is called QBroker. Implementation characteristics of its mode of operation, as well as its interaction with the virtual resources of a cloud environment, are presented. The cloud model considered is a hybrid cloud characterized as a service-oriented architecture (SOA) in which remote services are made available to clients. The task scheduling policy developed for QBroker is service intermediation, which considers QoS negotiation, differentiation of service instances (SOA), and dynamic allocation of services. Moreover, the entire characterization of QBroker's mode of operation is based on the intermediation concept of the NIST cloud reference model. The QBroker component was introduced into the BEQoS (Bursty, Energy and Quality of Service) cloud computing architecture, developed at the Laboratory of Distributed Systems and Concurrent Programming of ICMC-USP in São Carlos. Performance evaluations of the QBroker architecture implementation were conducted through simulation programs using the CloudSim simulator API and the CloudSim-BEQoS architecture. 
Three experimental scenarios were evaluated and, according to the analysis of the results, it was possible to validate that the architectural features implemented in QBroker had a significant impact on the response variables considered. Thus, it was possible to confirm that using QBroker as a mediation mechanism in hybrid cloud environments with SOA yielded performance gains for the cloud system and improved the quality of the services offered. Algorithm; Cloud computing; CloudSim; Dynamic allocation; Hybrid cloud; Intermediation; Performance evaluation; QoS; Service differentiation; Services; Simulation; SOA; Task scheduling
57	An I/O-aware scheduler for containerized data-intensive HPC tasks in Kubernetes-based heterogeneous clusters / En I/O-medveten schemaläggare för containeriserade dataintensiva HPC-uppgifter i Kubernetes-baserade heterogena kluster Wu, Zheyun January 2022 (has links) Cloud-native is a new computing paradigm that takes advantage of key characteristics of cloud computing, where applications are packaged as containers. The lifecycle of containerized applications is typically managed by container orchestration tools such as Kubernetes, the most popular container orchestration system that automates the containers’ deployment, maintenance, and scaling. Kubernetes has become the de facto standard for container orchestrators in the cloud-native era. Meanwhile, with the increasing demand for High-Performance Computing (HPC) over the past years, containerization is being adopted by the HPC community and various processors and special-purpose hardware are utilized to accelerate HPC applications. The architecture of cloud systems has been gradually shifting from homogeneous to heterogeneous with different processors and hardware accelerators, which raises a new challenge: how to exploit different computing resources efficiently? Much effort has been devoted to improving the use efficiency of computing resources in heterogeneous systems from the perspective of task scheduling, which aims to match different types of tasks to optimal computing devices for execution. Existing proposals do not take into account the variation in I/O performance between heterogeneous nodes when scheduling tasks. However, I/O performance is an important but often overlooked factor that can be a potential performance bottleneck for HPC tasks. This thesis proposes an I/O-aware scheduler named cmio-scheduler for containerized data-intensive HPC tasks in Kubernetes-based heterogeneous clusters, which is aware of the I/O throughput of compute nodes when making task placement decisions. 
In principle, cmio-scheduler assigns data-intensive HPC tasks to the node that fulfills the tasks’ requirements for CPU, memory, and GPU and has the highest I/O throughput. The experimental results demonstrate that cmio-scheduler reduces the execution time by 19.32% for the overall workflow and by 15.125% for parallelizable tasks on average. Cloud-native; Containers; Kubernetes; High-performance computing (HPC); Data-intensive computing; Task scheduling; Heterogeneous systems; Computer and Information Sciences
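The placement principle stated in the abstract (keep only the nodes that satisfy the task's CPU, memory, and GPU requirements, then prefer the one with the highest I/O throughput) might look roughly like the sketch below. It is a plain-Python illustration, not the cmio-scheduler implementation and not the Kubernetes scheduler API; the `Node` and `Task` classes, their field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical view of a compute node's free resources.
    name: str
    free_cpu: float        # cores
    free_mem_gib: float    # GiB
    free_gpu: int          # devices
    io_mib_s: float        # measured I/O throughput, MiB/s


@dataclass
class Task:
    cpu: float
    mem_gib: float
    gpu: int = 0


def pick_node(task: Task, nodes: List[Node]) -> Optional[Node]:
    """Filter nodes by resource fit, then choose the highest I/O throughput."""
    feasible = [
        n for n in nodes
        if n.free_cpu >= task.cpu
        and n.free_mem_gib >= task.mem_gib
        and n.free_gpu >= task.gpu
    ]
    if not feasible:
        return None  # no node can host the task right now
    return max(feasible, key=lambda n: n.io_mib_s)


nodes = [
    Node("a", free_cpu=8, free_mem_gib=32, free_gpu=0, io_mib_s=900),
    Node("b", free_cpu=4, free_mem_gib=16, free_gpu=1, io_mib_s=500),
    Node("c", free_cpu=16, free_mem_gib=64, free_gpu=1, io_mib_s=700),
]
chosen = pick_node(Task(cpu=4, mem_gib=8, gpu=1), nodes)
unplaceable = pick_node(Task(cpu=32, mem_gib=8), nodes)
```

Node "a" has the best throughput in this sample but no GPU, so the filter step removes it before throughput is compared; this ordering (fit first, I/O second) follows the abstract's wording.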
58	Δρομολόγηση και αποδοτική ανάθεση χωρητικότητας σε ευρυζωνικά οπτικά δίκτυα Χριστοδουλόπουλος, Κωνσταντίνος 19 August 2009 (has links) Τα οπτικά δίκτυα αποτελούν την αποδοτικότερη επιλογή όσον αφορά την εγκατάσταση ευρυζωνικών δικτύων κορμού, καθώς παρουσιάζουν μοναδικά χαρακτηριστικά μετάδοσης. Διαθέτουν τεράστιο εύρος ζώνης, υψηλή αξιοπιστία, ενώ επίσης έχουν μειωμένο κόστος μετάδοσης ανά bit πληροφορίας σε σχέση με τα υπόλοιπα ενσύρματα δίκτυα. Σημαντικές ερευνητικές προσπάθειες έχουν επικεντρωθεί στις προοπτικές μετάβασης από τα παραδοσιακά στατικά δίκτυα κυκλωμάτων, στα οποία χρησιμοποιείται από-σημείο-σε-σημείο οπτική μετάδοση, σε δίκτυα μετάδοσης δεδομένων που προσφέρουν δυναμική και γρήγορη επαναρύθμιση των οπτικών μονοπατιών και πρόσβαση σε χωρητικότητες κάτω του ενός μήκους κύματος, ανάλογα με τις απαιτήσεις των χρηστών και των εκάστοτε εφαρμογών. Τα τελευταία χρόνια υπάρχει η τάση για δημιουργία δυναμικών και επαναρυθμιζόμενων οπτικών δικτύων μεταγωγής κυκλώματος (Optical Circuit Switching), τα οποία θα βασίζονται σε διαφανείς κόμβους μεταγωγής. Η μονάδα μεταγωγής των δικτύων οπτικής μεταγωγής κυκλώματος είναι τα οπτικά μονοπάτια (lightpaths) και το βασικό πρόβλημα βελτιστοποίησης που σχετίζεται με την αποδοτική εκμετάλλευση της χωρητικότητας τέτοιων δικτύων είναι το πρόβλημα της δρομολόγησης και ανάθεσης μήκους κύματος (Routing and Wavelength Assignment - RWA). Στα αμιγώς διαφανή (transparent) οπτικά δίκτυα κυκλώματος η μετάδοση του σήματος υποβαθμίζεται από μια σειρά φυσικών εξασθενήσεων (physical impairments), σε σημείο που η εγκατάσταση ενός οπτικού μονοπατιού να μην είναι αποδεκτή. Για την αντιμετώπιση αυτού του προβλήματος στην παρούσα διατριβή προτείνουμε αλγόριθμους οι οποίοι λαμβάνουν υπόψη τους τις φυσικές εξασθενήσεις (Impairment Aware RWA ή ΙΑ-RWA algorithms) τόσο για στατική όσο και για δυναμική κίνηση. 
Specifically, we present an IA-RWA algorithm for static traffic that is based on the LP-relaxation technique and uses efficient methods to produce integer solutions. We express the physical impairments through additional constraints in the LP formulation of the RWA problem, achieving cross-layer optimization across the physical layer and the network layer. We then propose a multi-cost IA-RWA algorithm for dynamic traffic. We define a vector of costs for each link and the operations for combining them, so that we can compute the cost vector of a path and use it to evaluate the transmission quality of the path's available wavelengths. To serve a new connection request, the multi-cost algorithm computes the set of non-dominated paths from the source to the requested destination and then applies a policy to select the optimal lightpath. We propose and evaluate the performance of a series of selection policies, each of which essentially corresponds to a different dynamic IA-RWA algorithm. We then turn our attention to Optical Burst Switching (OBS) networks, which are considered the next stage after optical circuit-switched networks, where capacity is reserved for a shorter period of time. In OBS networks, packets that have the same destination and similar quality-of-service requirements are aggregated into bursts. The bursts are transmitted over all-optical paths, which are configured using control packets that are transmitted ahead of the corresponding bursts and processed electronically by the intermediate nodes. 
We focus on two basic elements of an optical burst-switched network, the burst assembly process and the signaling protocols, and we present two proposals for the efficient allocation of capacity in these networks. Specifically, we propose and evaluate a new burst assembly algorithm based on the average delay of the packets that make up a burst. We show that the proposed burst assembly algorithm reduces the variance of the packet delay (packet delay jitter), which is important for a range of applications. We then propose a new two-way signaling protocol based on in-advance and relaxed timed capacity reservations. In the proposed protocol, during the connection establishment phase capacity is reserved for a period longer than the burst transmission time, in order to increase the probability of successful establishment on the subsequent links of the path. We compare the proposed protocol with typical protocols proposed in the literature and show that it can be used to provide differentiated quality of service (QoS differentiation) to the users of the OBS network. Next, we examine the problem of routing and scheduling connections with a flexible, unspecified start time, a problem that appears in slightly different forms in optical circuit-switched, optical burst-switched, and packet-switched networks. These connections are served through advance capacity reservations, which is a typical way of providing guaranteed quality of service (QoS) to the users of a network. 
We assume that we are given a connection with known source and destination, known or unknown data volume, and known transmission rate, and we are asked to decide the path the data will follow and the time at which transmission will start. We discretize time and use appropriate vectors as data structures to represent the availability of the network's links as a function of time. We use these vectors in a multi-cost algorithm for routing and scheduling the connections. First, we present a multi-cost algorithm of non-polynomial complexity, based on the concept of non-dominated paths. We then propose two heuristic algorithms of polynomial complexity, defining appropriate pseudo-domination relations that reduce the solution space. We also propose a branch-and-bound mechanism that can reduce the solution space when a specific optimization function is used for all connections. The performance of the proposed algorithms was evaluated in an optical burst-switched network; however, the conclusions and the applicability of the proposed algorithm extend to other kinds of optical networks as well. Finally, we examine the problem of jointly scheduling the network and computational resources required to execute a task in a Grid Network. Grid Networks are considered the next step in distributed systems, introducing the concept of "shared" use of geographically distributed and heterogeneous resources (computational, storage, network, etc.). We assume that executing a task consists of two successive stages: (a) transferring the task's input data from a storage unit to a computer cluster, and (b) executing the task on the cluster. 
We extend the multi-cost algorithm for routing and scheduling connections described earlier so that it handles network and computational resources jointly for task execution. The proposed algorithm returns: (i) the cluster where the task will be executed, (ii) the path that the input data will follow, (iii) the time at which transmission starts, and (iv) the time at which the task starts executing on the cluster. We begin by presenting a non-polynomial-time algorithm and then, after suitably reducing the solution space, give a heuristic algorithm of polynomial complexity. / Optical networks have developed rapidly over the last ten years and are widely used in core networks due to their superior transmission characteristics. Optical networks provide huge capacity, which can be used efficiently through wavelength division multiplexing (WDM), and high reliability at the lowest cost per bit compared with other wired and wireless networking solutions. Much research has focused on ways to evolve from the typical point-to-point opaque WDM networks that are currently employed in the core to optical networks that are dynamically and quickly reconfigurable and can provide on-demand services to users at subwavelength granularity according to users’ requirements. The most common architecture for establishing communication in WDM optical networks is wavelength routing, which falls in the general category of Optical Circuit Switched (OCS) networks. The switched entities in OCS networks are the lightpaths, and the basic optimization problem related to the efficient allocation of bandwidth is the routing and wavelength assignment (RWA) problem. 
The current optical technology employed in core networks is point-to-point transmission, where the signal is regenerated at every intermediate node via optical-electronic-optical (OEO) conversion. In recent years, the trend has clearly been towards low-cost, high-capacity all-optical transparent networks that do not use OEO conversion. In transparent OCS networks the signal of a lightpath remains in the optical domain, and its quality deteriorates due to a series of physical layer impairments (PLIs). These PLIs may degrade the received signal quality to the extent that the bit-error rate (BER) at the receiver is so high that signal detection is infeasible for some lightpaths. To address this problem we propose algorithms that take the PLIs into account, usually referred to in the literature as Impairment Aware RWA or IA-RWA algorithms, for both offline (static) and online (dynamic) traffic. In particular, we propose an IA-RWA algorithm for static traffic that is based on an LP-relaxation formulation and uses various efficient methods to obtain integer solutions. The physical layer impairments are included as additional constraints in the LP formulation of the RWA problem, yielding a cross-layer optimization solution between the network and the physical layers. We then propose a multi-cost IA-RWA algorithm for dynamic traffic. We define a cost vector per link and associative operators to combine these vectors so as to calculate the cost vector of a path. The parameters of these cost vectors are chosen so as to enable the quick and efficient calculation of the quality of transmission of candidate lightpaths. To serve a connection request, the proposed multi-cost algorithm calculates the set of so-called non-dominated paths from the given source to the given destination, and then applies an optimization policy to choose the optimal lightpath. 
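The notion of non-dominated paths over per-link cost vectors can be sketched as follows. This is only an illustration under stated assumptions, not the thesis algorithm: the two-component cost vectors, component-wise addition as the associative operator, and the brute-force enumeration of simple paths are all hypothetical choices made to keep the example small.

```python
Cost = tuple[float, ...]


def combine(a: Cost, b: Cost) -> Cost:
    """Associative operator; here component-wise addition (an assumption)."""
    if not a:  # the empty tuple acts as the identity for the start node
        return b
    return tuple(x + y for x, y in zip(a, b))


def dominates(a: Cost, b: Cost) -> bool:
    """a dominates b if it is no worse in every component and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_paths(graph, src, dst):
    """Enumerate simple paths src -> dst and keep the non-dominated ones.

    graph maps a node to a list of (neighbor, link_cost_vector) pairs.
    The enumeration is exponential in the worst case.
    """
    found = []  # list of (path, cost_vector)

    def dfs(node, path, cost):
        if node == dst:
            found.append((path, cost))
            return
        for nxt, link_cost in graph.get(node, []):
            if nxt not in path:  # keep paths simple (no repeated nodes)
                dfs(nxt, path + [nxt], combine(cost, link_cost))

    dfs(src, [src], ())
    return [(p, c) for p, c in found
            if not any(dominates(c2, c) for _, c2 in found)]
```

A selection policy would then pick one element of the returned set, for example the path with the smallest first cost component.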
We propose and evaluate various optimization policies that correspond to different online IA-RWA algorithms. We then turn our attention to Optical Burst Switched (OBS) networks, which are regarded as the next step from the OCS paradigm towards a more dynamic core network that can provide on-demand subwavelength services to users. In OBS networks, packets that have the same destination and similar quality-of-service requirements are aggregated into bursts at the ingress nodes. When a burst is aggregated, a control packet is transmitted and electronically processed at the intermediate nodes so as to configure them for the burst, which passes through transparently afterwards. We focus on two key elements of an OBS network, the burst aggregation (or burstification) process and the signaling protocol, and we propose two solutions for the efficient allocation of bandwidth in OBS networks. We propose and evaluate a novel burst assembly algorithm that is based on the average delay of the packets that comprise a burst. We show that the proposed algorithm decreases the packet delay jitter, which is important for a number of applications, including real-time, video and audio streaming, and TCP applications. Next we propose a two-way reservation signaling protocol that uses in-advance and relaxed timed reservation of the bandwidth. In the connection establishment phase of the proposed protocol, bandwidth reservations can exceed the duration of the burst transmission (thus relaxing the timed reservations), so as to increase the acceptance probability for the rest of the path. By controlling the degree of relaxation of the timed reservations, the protocol can also provide service differentiation to the users. Next we examine the problem of routing and scheduling connections with flexible starting times in networks that support advance reservations. 
This problem can arise in slightly different settings in Optical Circuit Switched, Optical Burst Switched, and Optical Packet Switched networks. Such connection requests are served through advance reservations, a process that is used to provide quality of service to users. We assume that for a connection request we are given the source, the destination, and the size of the data to be transferred at a given rate, and we are asked to provide the path and the time at which the transmission should start so as to optimize a certain performance metric. We discretize time and use appropriate data structures (in the form of vectors) to map the utilization of the links as a function of time. We use these vectors as cost parameters in a multi-cost algorithm. We initially present a multi-cost algorithm of non-polynomial complexity that uses a full domination relation between paths. We then propose two mechanisms to prune the solution space in order to obtain polynomial-complexity algorithms. In the first mechanism we define pseudo-domination relations that are weaker than the full domination relation. We also propose a branch-and-bound extension to the optimum algorithm that can be used for a given specific optimization function. The performance of the multi-cost algorithm and its variations is evaluated in an OBS network, but this does not limit the applicability of the algorithm, and the conclusions can be extended to the other optical networking paradigms. Finally, we examine the problem of joint reservation of the communication and computation resources that are required by a task in a Grid Network. Grid Networks are considered the next step in distributed systems, introducing the concept of shared usage of geographically distributed and heterogeneous resources (computation, storage, communication, etc.). 
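The discretized link-utilization vectors used for routing and scheduling connections can be illustrated with a small sketch. It assumes, hypothetically, that each link is described by a list of booleans (one per time slot, `True` meaning busy) and that propagation delay between links is ignored; the function names are made up for this example.

```python
from typing import Optional


def path_busy(link_busy: list[list[bool]]) -> list[bool]:
    """A slot is busy on the path if it is busy on any link of the path."""
    return [any(slots) for slots in zip(*link_busy)]


def earliest_start(busy: list[bool], duration: int) -> Optional[int]:
    """Return the first slot from which `duration` consecutive slots are free."""
    if duration <= 0:
        raise ValueError("duration must be at least one slot")
    run = 0  # length of the current run of free slots
    for t, is_busy in enumerate(busy):
        run = 0 if is_busy else run + 1
        if run == duration:
            return t - duration + 1
    return None  # no window of the requested length exists
```

Combining the per-link vectors slot by slot plays the role of the operator that builds the cost vector of a path from the cost vectors of its links.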
We assume that the task execution consists of two phases: (a) the transfer of the input data from a data storage resource, or from the scheduler, to a computation resource (cluster), and (b) the execution of a program at the cluster. We extend the multi-cost algorithm for the routing and scheduling of connections, outlined above, so as to handle the reservation of computation resources as its last leg. In this way the proposed algorithm performs a joint optimization of the communication and computation parts required by a task and returns: (i) the cluster to execute the task, (ii) the path to route the input data, (iii) the time to start the transmission of data, and (iv) the time to start the execution of the task. We start by presenting an algorithm of non-polynomial complexity and then, by appropriately pruning the solution space, we give a heuristic algorithm of polynomial complexity. We show that in a Grid network where the tasks are CPU- and data-intensive, important performance benefits can be obtained by jointly optimizing the use of the communication and computation resources. 621.382 7 Optical networks Optical burst switched networks Burst assembly process
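The four outputs of the joint reservation described in this abstract (cluster, path, transmission start, execution start) can be illustrated with a simplified sketch. It is not the multi-cost algorithm of the thesis: the candidate list, the boolean busy vectors per time slot, and the "earliest completion time" objective are assumptions made for the example, and all names are hypothetical.

```python
from typing import NamedTuple, Optional


class Plan(NamedTuple):
    cluster: str
    path: tuple[str, ...]
    tx_start: int    # slot at which the data transfer starts
    exec_start: int  # slot at which the task starts on the cluster


def first_free_window(busy, duration, not_before=0) -> Optional[int]:
    """First slot >= not_before that starts `duration` consecutive free slots."""
    run = 0
    for t in range(not_before, len(busy)):
        run = 0 if busy[t] else run + 1
        if run == duration:
            return t - duration + 1
    return None


def best_plan(candidates, tx_slots, exec_slots) -> Optional[Plan]:
    """Pick the candidate with the earliest task completion time.

    candidates: list of (cluster, path, path_busy, cluster_busy) tuples,
    where the busy vectors hold one boolean per time slot.
    """
    best, best_finish = None, None
    for cluster, path, path_busy, cluster_busy in candidates:
        tx = first_free_window(path_busy, tx_slots)
        if tx is None:
            continue
        # Execution can only start once the input data has arrived.
        ex = first_free_window(cluster_busy, exec_slots, not_before=tx + tx_slots)
        if ex is None:
            continue
        finish = ex + exec_slots
        if best_finish is None or finish < best_finish:
            best, best_finish = Plan(cluster, tuple(path), tx, ex), finish
    return best
```

In this simplified model a cluster reachable over an idle path can still lose to one behind a busier path if its own computation slots free up later, which is the kind of trade-off joint optimization is meant to capture.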
59	Mechanismy plánování RT úloh při nedostatku výpočetních a energetických zdrojů / Mechanisms for Scheduling RT Tasks during Lack of Computational and Energy Sources Pokorný, Martin January 2012 (has links) This term project deals with the problem of scheduling real-time tasks in overload conditions and with techniques for lowering power consumption. For each of these two areas, it presents the mechanisms and the reasons for using them. It also describes specific algorithms that are implemented in the uC/OS-II operating system and compared in the next phase of the master's thesis.
