Global ETD Search

1	GRID teikiamų servisų efektyvumo analizė / GRID QoS efficiency analysis Sandonavičius, Donatas 16 August 2007 (has links) GRID servisų kokybės parametrai labai aktualus klausimas šiais laikais. Šiame darbe pristatoma sukurta metodika GRID Servisų kokybės efektyvumo tyrimui. Taip pat pristatoma metodika kaip atlikti preliminarūs GRID tinklų parametrų skaičiavimai. Bei atliktas eksperimentas ir eksperimento rezultatai. / Nowadays GRID computing issues become urgent. GRID quality of service (QoS) importance grows together with the development of GRID technology. The importance of QoS of the grid lies in at least three aspects: the concept of grid service, the multiplicity of users’ demands, and the heterogeneity of grid resources (Pu Juhua, Xiong Zhang and Wu Zhenxing. The Research On QoS For Grid Computing.). Besides discussing GRID QoS parameters, there are also suggestions about the methods to obtain them. The scope of this document is to research the GRID QoS aspects and the description of developed methodology for estimating QoS efficiency evaluating network and clusters parameters. The results of efficiency experiments done using the developed methodology are presented there. Informatics GRID tinklai Programinė įranga Kompiuterių tinklai GRID networks Software Computer networks
2	Pasaulinio tinklo paslaugos (Web-Servisai) Grid tinkluose / Web-Services and Grid`s Sabestinas, Remigijus 28 August 2009 (has links) Šis darbas aprašo pagrindinius Grid paslaugų kūrimo metodus, kurie bando pasiūlyti operacinius suderinamumus, stabilumą ir produktyvumą, nepaisant to, kad pasaulinio tinklo paslaugos yra jauna technologija su keliais standartais. Tikimės, kad Grid paslaugos sukurtos iš pasaulinio tinklo paslaugų parodys vykdymo problemas, bent jau pradžioje, ir kad eksperimentavimas ir alternatyvų kūrimas, pareikalaus efektyvesnių pasaulinio tinklo paslaugų realizacijų. Pasiūlytas modelis sukurtas Grid architektūros pagrindu, taip išvengiant rizikos perkurti visą infrastuktūrą. / This work has described an approach to building Grid Services that attempts to promote interoperability, stability and productivity, despite the fact that Web Services are a young technology with few standards. We anticipate that Grid services built from Web Services will show performance problems, at least initially, and that experimentation and development of alternate, more efficient implementations of some of these Web Services will be necessary. Proposed model is based on existing Grid architecture, without the risk of rebuilding the whole infrastructure. Informatics Pasaulinio tinklo paslaugos Grid tinklai Grid paslaugos Web Services Grid networks Grid Services
3	Optimisation of large scale network problems Grigoleit, Mark Ted January 2008 (has links) The Constrained Shortest Path Problem (CSPP) consists of finding the shortest path in a graph or network that satisfies one or more resource constraints. Without these constraints, the shortest path problem can be solved in polynomial time; with them, the CSPP is NP-hard and thus far no polynomial-time algorithms exist for solving it optimally. The problem arises in a number of practical situations. In the case of vehicle path planning, the vehicle may be an aircraft flying through a region with obstacles such as mountains or radar detectors, with an upper bound on the fuel consumption, the travel time or the risk of attack. The vehicle may be a submarine travelling through a region with sonar detectors, with a time or risk budget. These problems all involve a network which is a discrete model of the physical domain. Another example would be the routing of voice and data information in a communications network such as a mobile phone network, where the constraints may include maximum call delays or relay node capacities. This is a problem of current economic importance, and one for which time-sensitive solutions are not always available, especially if the networks are large. We consider the simplest form of the problem, large grid networks with a single side constraint, which have been studied in the literature. This thesis explores the application of Constraint Programming combined with Lagrange Relaxation to achieve optimal or near-optimal solutions of the CSPP. The following is a brief outline of the contribution of this thesis. Lagrange Relaxation may or may not achieve optimal or near-optimal results on its own. Often, large duality gaps are present. We make a simple modification to Dijkstra’s algorithm that does not involve any additional computational work in order to generate an estimate of path time at every node. / We then use this information to constrain the network along a bisecting meridian. The combination of Lagrange Relaxation (LR) and a heuristic for filtering along the meridian provide an aggressive method for finding near-optimal solutions in a short time. Two network problems are studied in this work. The first is a Submarine Transit Path problem in which the transit field contains four sonar detectors at known locations, each with the same detection profile. The side constraint is the total transit time, with the submarine capable of 2 speeds. For the single-speed case, the initial LR duality gap may be as high as 30%. The first hybrid method uses a single centre meridian to constrain the network based on the unused time resource, and is able to produce solutions that are generally within 1% of optimal and always below 3%. Using the computation time for the initial Lagrange Relaxation as a baseline, the average computation time for the first hybrid method is about 30% to 50% higher, and the worst case CPU times are 2 to 4 times higher. The second problem is a random valued network from the literature. Edge costs, times, and lengths are uniform, randomly generated integers in a given range. Since the values given in the literature problems do not yield problems with a high duality gap, the values are varied and from a population of approximately 100,000 problems only the worst 200 from each set are chosen for study. These problems have an initial LR duality gap as high as 40%. A second hybrid method is developed, using values for the unused time resource and the lower bound values computed by Dijkstra’s algorithm as part of the LR method. The computed values are then used to position multiple constraining meridians in order to allow LR to find better solutions. / This second hybrid method is able to produce solutions that are generally within 0.1% of optimal, with computation times that are on average 2 times the initial Lagrange Relaxation time, and in the worst case only about 5 times higher. The best method for solving the Constrained Shortest Path Problem reported in the literature thus far is the LRE-A method of Carlyle et al. (2007), which uses Lagrange Relaxation for preprocessing followed by a bounded search using aggregate constraints. We replace Lagrange Relaxation with the second hybrid method and show that optimal solutions are produced for both network problems with computation times that are between one and two orders of magnitude faster than LRE-A. In addition, these hybrid methods combined with the bounded search are up to 2 orders of magnitude faster than the commercial CPlex package using a straightforward MILP formulation of the problem. Finally, the second hybrid method is used as a preprocessing step on both network problems, prior to running CPlex. This preprocessing reduces the network size sufficiently to allow CPlex to solve all cases to optimality up to 3 orders of magnitude faster than without this preprocessing, and up to an order of magnitude faster than using Lagrange Relaxation for preprocessing. Chapter 1 provides a review of the thesis and some terminology used. Chapter 2 reviews previous approaches to the CSPP, in particular the two current best methods. Chapter 3 applies Lagrange Relaxation to the Submarine Transit Path problem with 2 speeds, to provide a baseline for comparison. The problem is reduced to a single speed, which demonstrates the large duality gap problem possible with Lagrange Relaxation, and the first hybrid method is introduced. / Chapter 4 examines a grid network problem using randomly generated edge costs and weights, and introduces the second hybrid method. Chapter 5 then applies the second hybrid method to both network problems as a preprocessing step, using both CPlex and a bounded search method from the literature to solve to optimality. The conclusion of this thesis and directions for future work are discussed in Chapter 6.
4	Χρονοπρογραμματισμός και δρομολόγηση σε δίκτυα πλέγματος και δίκτυα δεδομένων Κόκκινος, Παναγιώτης 05 January 2011 (has links) Τα δίκτυα πλέγματος (grid networks) αποτελούνται από ένα σύνολο ισχυρών υπολογιστικών, αποθηκευτικών και άλλων πόρων. Οι πόροι αυτοί είναι συνήθως γεωγραφικά αλλά και διοικητικά διασκορπισμένοι και συνδέονται με ένα δίκτυο δεδομένων. Τα δίκτυα πλέγματος το τελευταίο καιρό έχουν αποκτήσει μία δυναμική, η οποία εντάσσεται μέσα σε ένα γενικότερο πλαίσιο, αυτό της κατανεμημένης επεξεργασίας και αποθήκευσης δεδομένων. Επιστήμονες, ερευνητές αλλά και απλοί χρήστες χρησιμοποιούν από κοινού τους κατανεμημένους πόρους για την εκτέλεση διεργασιών ή τη χρήση εφαρμογών, για τις οποίες δεν μπορούν να χρησιμοποιήσουν τους τοπικά διαθέσιμους υπολογιστές τους λόγω των περιορισμένων δυνατοτήτων τους. Στην παρούσα διδακτορική διατριβή εξετάζουμε ζητήματα που σχετίζονται με το χρονοπρογραμματισμό (scheduling) των διεργασιών στους διαθέσιμους πόρους, καθώς και με τη δρομολόγηση (routing) των δεδομένων που οι διεργασίες χρειάζονται. Εξετάζουμε τα ζητήματα αυτά είτε χωριστά, είτε σε συνδυασμό, μελετώντας έτσι τις αλληλεπιδράσεις τους. Αρχικά, προτείνουμε ένα πλαίσιο παροχής ποιότητας υπηρεσιών στα δίκτυα πλέγματος, το οποίο μπορεί να εγγυηθεί σε ένα χρήστη μία μέγιστη χρονική καθυστέρηση εκτέλεσης των διεργασιών του. Με τον τρόπο αυτό, ένας χρήστης μπορεί να επιλέξει με απόλυτη βεβαιότητα εκείνον τον υπολογιστικό πόρο που μπορεί να εκτελέσει τη διεργασία του πριν τη λήξη της προθεσμίας της. Το προτεινόμενο πλαίσιο δεν στηρίζεται στην εκ των προτέρων δέσμευση των υπολογιστικών πόρων, αλλά στο ότι οι χρήστες μπορούν να αυτό-περιορίσουν το ρυθμό δημιουργίας διεργασιών τους, ο οποίος συμφωνείται ξεχωριστά με κάθε πόρο κατά τη διάρκεια μίας φάσης εγγραφής τους. Πραγματοποιούμε έναν αριθμό πειραμάτων προσομοίωσης που αποδεικνύουν ότι το προτεινόμενο πλαίσιο μπορεί πράγματι να παρέχει στους χρήστες εγγυημένο μέγιστο χρόνο καθυστέρησης εκτέλεσης των διεργασιών τους, ενώ με τις κατάλληλες επεκτάσεις το πλαίσιο μπορεί να χρησιμοποιηθεί ακόμα και όταν το φορτίο των διεργασιών δεν είναι εκ των προτέρων γνωστό. Στη συνέχεια εξετάζουμε το πρόβλημα της ``Συγκέντρωσης Δεδομένων'' (ΣΔ), που εμφανίζεται όταν μία διεργασία χρειάζεται περισσότερα του ενός τμήματα δεδομένων να μεταφερθούν σε έναν υπολογιστικό πόρο, πριν η διεργασία ξεκινήσει την εκτέλεσή της σε αυτόν. Μελετάμε τα υπό-προβλήματα της επιλογής των αντιγράφων των δεδομένων, του χρονοπρογραμματισμού της διεργασίας και της δρομολόγησης των δεδομένων της και προτείνουμε έναν αριθμό πλαισίων ``Συγκέντρωσης Δεδομένων''. Μερικά πλαίσια εξετάζουν μόνο τις υπολογιστικές ή μόνο τις επικοινωνιακές απαιτήσεις των διεργασιών, ενώ άλλα εξετάζουν και τα δύο είδη απαιτήσεων. Επιπλέον, προτείνονται πλαίσια ``Συγκέντρωσης Δεδομένων'' τα οποία βασίζονται στην κατασκευή ελαχίστων γεννητικών δέντρων(Minimum Spanning Tree - MST), με σκοπό τη μείωση της συμφόρησης στο δίκτυο δεδομένων, που εμφανίζεται κατά την ταυτόχρονη μεταφορά των δεδομένων μίας διεργασίας. Στα πειράματα προσομοίωσης μας αξιολογούμε τα προτεινόμενα πλαίσια και δείχνουμε ότι αν η διαδικασία της ``Συγκέντρωση Δεδομένων'' πραγματοποιηθεί σωστά, τότε η απόδοση του δικτύου πλέγματος, όσον αφορά τη χρήση των πόρων και την εκτέλεση των διεργασιών, μπορεί να βελτιωθεί. Επιπλέον, ερευνούμε την εφαρμογή τεχνικών σύνοψης της πληροφορίας των χαρακτηριστικών των πόρων στα δίκτυα πλέγματος. Προτείνουμε ένα σύνολο μεθόδων και τελεστών σύνοψης, προσπαθώντας να μειώσουμε τον όγκο των πληροφοριών πόρων που μεταφέρονται πάνω από το δίκτυο, ενώ παράλληλα επιθυμούμε οι συνοπτικές πληροφορίες που παράγονται να βοηθούν το χρονοπρογραμματιστή να παίρνει αποδοτικές αποφάσεις ανάθεσης διεργασιών στους διαθέσιμους πόρους. Οι τεχνικές αυτές μπορούν να συνδυαστούν και με τις αντίστοιχες τεχνικές που εφαρμόζονται στα ιεραρχικά δίκτυα δεδομένων για τη δρομολόγηση, εξασφαλίζοντας έτσι τη διαλειτουργικότητα μεταξύ διαφορετικών δικτύων πλέγματος καθώς και το απόρρητο των πληροφοριών που ανήκουν σε διαφορετικούς παρόχους πόρων. Στα πειράματα προσομοίωσης μας χρησιμοποιούμε σαν μετρική της ποιότητας / αποδοτικότητας των αποφάσεων του χρονοπρογραμματιστή τον Stretch Factor (SF), που ορίζεται ως ο λόγος της μέσης καθυστέρησης εκτέλεσης των διεργασιών όταν αυτές χρονοπρογραμματίζονται με βάση ακριβείς πληροφορίες πόρων, προς τη μέση καθυστέρηση τους όταν χρησιμοποιούνται συνοπτικές πληροφορίες. Ακόμα, μετράμε τη συχνότητα με την οποία ο χρονοπρογραμματιστής ενημερώνεται για τις αλλαγές στην κατάσταση των πόρων καθώς και τον όγκο των πληροφοριών πόρων που μεταφέρονται. Μελετάμε, ακόμα, ζητήματα που προκύπτουν από την υλοποίηση αλγορίθμων χρονοπρογραμματισμού που έχουν αρχικά μελετηθεί σε περιβάλλοντα προσομοίωσης, σε πραγματικά συστήματα ενδιάμεσου λογισμικού (middleware) για δίκτυα πλέγματος, όπως το gLite. Το πρώτο ζήτημα που εξετάζουμε είναι το γεγονός ότι οι πληροφορίες που παρέχονται στους αλγορίθμους χρονοπρογραμματισμού στα συστήματα αυτά δεν είναι πάντα έγκυρες, ενώ το δεύτερο ζήτημα είναι ότι δεν υπάρχει ευελιξία στο διαμοιρασμό των πόρων μεταξύ διαφορετικών διεργασιών. Η μελέτη μας δείχνει ότι με απλές αλλαγές στους μηχανισμούς διαχείρισης διεργασιών ενός συστήματος ενδιάμεσου λογισμικού, αυτά αλλά και άλλα ζητήματα μπορούν να αντιμετωπιστούν, επιτυγχάνοντας σημαντικές βελτιώσεις στην απόδοση των δικτύων πλέγματος. Στα πλαίσια αυτά μάλιστα, εξετάζουμε τη χρήση της τεχνολογίας της εικονικοποίησης (virtualization). Υλοποιούμε και αξιολογούμε τους προτεινόμενους μηχανισμούς σε ένα μικρό δοκιμαστικό δίκτυο πλέγματος. Τέλος, προτείνουμε έναν αλγόριθμο πολλαπλών κριτηρίων για τη δρομολόγηση και ανάθεση μήκους κύματος υπό την παρουσία φυσικών εξασθενήσεων (Impairment-Aware Routing and Wavelength Assignment, IA-RWA) για οπτικά δίκτυα δεδομένων. Τα οπτικά δίκτυα είναι η δικτυακή τεχνολογία που χρησιμοποιείται σήμερα για τη διασύνδεση των υπολογιστικών και αποθηκευτικών πόρων των δικτύων πλέγματος, ενώ οι διάφορες φυσικές εξασθενήσεις τείνουν να μειώνουν την ποιότητα μετάδοσης (Quality of Transmission - QoT) των οπτικών σημάτων. Κύριο χαρακτηριστικό του προτεινόμενου αλγορίθμου είναι ότι υπολογίζει την ποιότητα μετάδοσης (Quality of Transmission - QoT) ενός υποψήφιου οπτικού μονοπατιού (lightpath) μη βασιζόμενο σε πραγματικές μετρήσεις ή εκτιμήσεις μέσω αναλυτικών μοντέλων των διαφόρων φυσικών εξασθενήσεων, αλλά μετρώντας τις αιτίες στις οποίες αυτά οφείλονται. Με τον τρόπο αυτό ο αλγόριθμος γίνεται πιο γενικός και εφαρμόσιμος σε διαφορετικές συνθήκες (μέθοδοι διαμόρφωσης του οπτικού σήματος, ρυθμοί μετάδοσης, τιμές διαφόρων φυσικών παραμέτρων, κ.α.). Τα πειράματα προσομοίωσης μας δείχνουν ότι ο προτεινόμενος αλγόριθμος μπορεί να εξυπηρετήσει τις περισσότερες δυναμικές αιτήσεις σύνδεσης, υπολογίζοντας γρήγορα, μονοπάτια με καλή ποιότητα μετάδοσης σήματος. Γενικά, η παρούσα διδακτορική διατριβή παρουσιάζει έναν αριθμό σημαντικών και καινοτόμων μεθόδων, πλαισίων και αλγορίθμων που αφορούν τα δίκτυα πλέγματος. Παράλληλα ωστόσο αποκαλύπτει το εύρος των ζητημάτων και ως ένα βαθμό και τις αλληλεπιδράσεις τους, που σχετίζονται με την αποδοτική λειτουργία των δικτύων πλέγματος, τα οποία απαιτούν τη σύνθεση και τη συνεργασία ερευνητών, μηχανικών και επιστημόνων από διάφορα πεδία. / Grid networks consist of several high capacity, computational, storage and other resources, which are geographically distributed and may belong to different administrative domains. These resources are usually connected through high capacity optical networks. The grid networks evolution follows the current trend of distributedly performed computation and storage. This trend provides several new possibilities to scientists, researchers and to simple users around the world, so as to use the shared resources for executing their tasks and running their applications. These operations are not always possible to perform in local, limited capacity, resources. In this thesis we study issues related to the scheduling of tasks and the routing of their datasets. We study these issues both separately and jointly, along with their interactions. Initially, we present a Quality of Service (QoS) framework for grids that guarantees to users an upper bound on the execution delay of their submitted tasks. Such delay guarantees imply that a user can choose, with absolute certainty, a resource to execute a task before its deadline expires. Our framework is not based on the advance reservation of resources, instead, the users follow a self constrained task generation pattern, which is agreed separately with each resource during a registration phase. We validate experimentally the proposed Quality of Service (QoS) framework for grids, verifying that it satisfies the delay guarantees promised to users. In addition, when the proposed extensions are used, the framework also provides delay guarantees without exact a-priori knowledge of the task workloads. Next, we examine a task scheduling and data migration problem for grid networks, which we refer to as the Data Consolidation (DC) problem. Data Consolidation arises when a task requests concurrently multiple pieces of data, possibly scattered throughout the grid network that have to be present at a selected site before the task's execution starts. In such a case, the scheduler must select the data replicas to be used, the site where these data will be gathered for the task to be executed, and the routing paths to be followed. We propose and experimentally evaluate several Data Consolidation schemes. Some consider only the computational or only the communication requirements of the tasks, while others consider both kinds of requirements. We also propose Data Consolidation (DC) schemes, which are based on Minimum Spanning Trees (MST) that route concurrently the datasets so as to reduce the congestion that may appear in the future, due to these transfers. In our simulation experiments we validate the proposed schemes and show that if the Data Consolidation operation is performed efficiently, then significant benefits can be achieved, in terms of the resources' utilization and task delay. We also consider the use of resource information aggregation in grid networks. We propose a number of aggregation schemes and operators for reducing the information exchanged in a grid network and used by the resource manager in order to make efficient scheduling decisions. These schemes can be integrated with the schemes utilized in hierarchical data networks for data routing, providing interoperability between different grid networks, while the sensitive or detailed information of resource providers is kept private. We perform a large number of experiments to evaluate the proposed aggregation schemes and the used operators. As a metric of the quality of the aggregated information we introduce the Stretch Factor (SF), defined as the ratio of the task delay when the task is scheduled using complete resource information over the task delay when an aggregation scheme is used. We also measure the number of resource information updates triggered by each aggregation scheme and the amount of resource information transferred. In addition, we are interested in the difficulties encountered and the solutions provided in order to develop and evaluate scheduling policies, initially implemented in a simulation environment, in the gLite grid middleware. We identify two important such implementation issues, namely the inaccuracy of the information provided to the scheduler by the information system, and the inflexibility in the sharing of a resource among different jobs. Our study indicates that simple changes in the gLite's scheduling procedures can solve these and other similar issues, yielding significant performance gains. We also investigate the use of the virtualization technology in the gLite middleware. We implement and evaluate the proposed mechanisms in a small gLite testbed. Finally, we propose a multicost impairment-aware routing and wavelength assignment (IA-RWA) algorithm in optical networks. In general, physical impairments tend to degrade the optical signal quality. Also, optical networks is the main networking technology used today for the interconnection of the grid's, computational and storage, resources around the world. The main characteristic of the proposed algorithm is that it calculates the quality of transmission (QoT) of a candidate lightpath by measuring several impairment-generating source parameters and not by using complex formulas to directly account for the effects of physical impairments. In this way, this approach is more generic and more easily applicable to different conditions (modulation formats, bit rates). Our results indicate that the proposed impairment-aware routing and wavelength assignment (IA-RWA) algorithm can efficiently serve the online traffic in an optical network and to guarantee the transmission quality of the found lightpaths, with low running times. In general, in this thesis we present several novel mechanisms and algorithms for grid networks. At the same time, this Thesis reveals the variety of the issues that relate to the efficient operation of the grid networks and their interdependencies. For handling all these issues the cooperation of researches, scientists and engineers from various fields, is required. Δίκτυα πλέγματος 004.6 Grid networks Data networks Task scheduling Routing Quality of service gLite middleware
5	One To Mant And Many To Many Collective Communication Operations On Grids Gupta, Rakhi 12 1900 (has links) Collective Communication Operations are widely used in MPI applications and play an important role in their performance. Hence, various projects have focused on optimization of collective communications for various kinds of parallel computing environments including LAN settings, heterogeneous networks and most recently Grid systems. The distinguishing factor of Grids from all the other environments is heterogeneity of hosts and network, and dynamically changing resource characteristics including load and availability. The ﬁrst part of the thesis develops a solution for MPI broadcast (one-to-many) on Grids. Some current strategies take into consideration static information about network topology for determining an efficient broadcast tree for Grids. Some other strategies take into account only transient network characteristics. We combined both these strategies and cluster the network dynamically on the basis of link bandwidths. Given a set of network parameters we use Simulated Annealing (SA) to obtain the best schedule. Also, we can time tune individual. SAs, to adapt the solution ﬁnding process, on the basis of estimated available times before next broadcast invocations in the application. We also developed software architecture for updation of schedules. We compared our algorithm with the earlier approaches under loaded network conditions, and obtained average performance improvement of 20%. The second part of the thesis extends the work for MPI all gather (many-to-many) operation. Current popular techniques consider strict hierarchical schemes for this operation, wherein from each cluster a representative (or coordinator) node is chosen, and inter cluster communication is done through these representative nodes. This is non optimal as inter cluster communication is usually on high capacity links that can sustain more than one transfer with the same through- put. We developed a cluster based and incremental heuristic algorithm for allgather on Grids. We compared the time taken by allgather schedules determined by this algorithm with current popular implementations. We also compared our algorithm with a strategy where allgather is constructed from a set of broadcast trees. We obtained average performance improvement of 67% over existing strategies. Collective Communication Operations Message Passing Interface Grid Networks Grids MPI Allgather - Algorithms One-To-Many Collective Communication Many-To-Many Collective Communication Computer Science
6	Μελέτη και ανάπτυξη τεχνικών για την αποτελεσματική διαχείριση πόρων σε δίκτυα πλέγματος και υποδομές υπολογιστικών νεφών Κρέτσης, Αριστοτέλης 25 February 2014 (has links) Οι τεχνολογίες κατανεμημένου υπολογισμού, όπως τα δίκτυα πλέγματος και οι υποδομές Νέφους, έχουν διαμορφώσει πλέον ένα καινούργιο περιβάλλον σχετικά με τον τρόπο που εκτελούνται οι εργασίες των χρηστών, αποθηκεύονται τα δεδομένα και γενικότερα χρησιμοποιούνται οι εφαρμογές. Τα δίκτυα πλέγματος αποτέλεσαν το επίκεντρο της σχετικής ερευνητικής δραστηριότητας για μεγάλο χρονικό διάστημα, με βασικό στόχο τη δημιουργία υποδομών για την εκτέλεση ερευνητικών εφαρμογών με πολύ υψηλές υπολογιστικές και αποθηκευτικές απαιτήσεις. Ωστόσο είναι πλέον προφανές ότι υπάρχει μια στροφή προς τις υποδομές Νέφους που προσφέρουν υπηρεσίες κατανεμημένου υπολογισμού και αποθήκευσης μέσω πλήρως διαχειρίσιμων πόρων. Η συγκεκριμένη μετάβαση έχει ως αποτέλεσμα μια μετατόπιση από το μοντέλο των πολλών και ισχυρών πόρων που βρίσκονται κατανεμημένοι σε διάφορες περιοχές του κόσμου (όπως στα δίκτυα πλέγματος) προς σχετικά λιγότερα αλλά πολύ μεγαλύτερα ως προς το μέγεθος κέντρα δεδομένων τα οποία αποτελούνται από χιλιάδες υπολογιστικούς πόρους οι οποίοι φιλοξενούν ακόμη περισσότερες εικονικές μηχανές. Η έρευνα που διεξάγαμε ακολούθησε αυτή την αλλαγή, μελετώντας αλγοριθμικά θέματα για δίκτυα πλέγματος και υποδομές Νεφών και αναπτύσσοντας μια σειρά από εργαλεία και εφαρμογές που διαχειρίζονται, παρακολουθούν και αξιοποιούν τους πόρους που προσφέρουν οι συγκεκριμένες υποδομές. Αρχικά, μελετούμε τα ζητήματα που προκύπτουν κατά την υλοποίηση αλγορίθμων χρονοπρογραμματισμού, που είχαν προηγουμένως μελετηθεί σε περιβάλλοντα προσομοίωσης, σε ένα πραγματικό σύστημα ενδιάμεσου λογισμικού για δίκτυα πλέγματος, και συγκεκριμένα το gLite. Το πρώτο ζήτημα που αντιμετωπίσαμε είναι το γεγονός ότι οι πληροφορίες που παρέχει το ενδιάμεσο λογισμικό gLite στους αλγορίθμους χρονοπρογραμματισμού δεν είναι πάντα έγκυρες, γεγονός που επηρεάζει την αποδοσή τους. Για την αντιμετώπιση του προβλήματος αναπτύξαμε ένα εσωτερικό, στο χρονοπρογραμματιστή, μηχανισμό που καταγράφει τις αποφάσεις του σχετικά με ποιές εργασίες ανατέθηκαν σε ποιούς υπολογιστικούς πόρους και λειτουργεί συµπληρωµατικά µε την υπηρεσία πληροφοριών του gLite. Επιπλέον, εξετάζουμε το ζήτημα του δίκαιου διαμοιρασμού της υπολογιστικής χωρητικότητας ενός πόρου στις εργασίες που έχουν ανατεθεί σε αυτόν. Για το σκοπό αυτό, επεκτείνουμε το ενδιάμεσο λογισμικό gLite ώστε να περιλαμβάνει ένα νέο μηχανισμό που μέσω της αξιοποίησης της τεχνολογίας εικονικοποίησης επιτρέπει τον ταυτόχρονο διαμοιρασμό της υπολογιστικής χωρητικότητας ενός κόμβου σε πολλές εργασίες. Στην συνέχεια εξατάζουμε το πρόβλημα της συνδυασμένης μεταφοράς πολλαπλών εικονικών μηχανών σε σύγχρονες υπολογιστικές υποδομές. Πιο συγκεκριμένα, προτείνουμε μια μεθοδολογία που στοχεύει στην καλύτερη χρησιμοποίηση των διαθέσιμων υπολογιστικών και δικτυακών πόρων, λαμβάνοντας υπόψη στις αποφάσεις σχετικά με τη συνδυασμένη μεταφορά εικονικών μηχανών τις αλληλεξαρτήσεις που δημιουργούνται από την επικοινωνία τους. Η προτεινόμενη μεθοδολογία χρησιμοποιεί την προσέγγιση πολλαπλών κριτηρίων για την επιλογή των εικονικών μηχανών που θα μετακινηθούν, αναθέτοντας διαφορετικά βάρη στα διάφορα κριτήρια ενδιαφέροντος. Επιπλέον, επιλέγει τους υπολογιστικούς κόμβους όπου οι μετακινούμενες εικονικές μηχανές θα φιλοξενηθούν, λαμβάνοντας υπόψη τον τρόπο με τον οποίο οι μετακινήσεις επηρεάζουν τις λογικές (ή εικονικές) τοπολογίες που σχηματίζονται από την επικοινωνία τους και αντιμετωπίζοντας τη συγκεκριμένη επιλογή ως ένα πρόβλημα αναδιάρθρωσης λογικών τοπολογιών. Η αξιολόγηση επιβεβαίωσε τη δυνατότητα της μεθοδολογίας να επιλύει, μέσω των κατάλληλων μετακινήσεων, ένα σημαντικό αριθμό προβλημάτων που οφείλονται σε ελλείψεις υπολογιστικών ή επικοινωνιακών πόρων, ελαχιστοποιώντας παράλληλα τον αριθμό των μετακινήσεων και την προκαλούμενη επιβάρυνση του δικτύου. Το επόμενο θέμα που εξετάζουμε αφορά το πρόβλημα της ανάλυσης δεδομένων επικοινωνίας μεταξύ εικονικών μηχανών οι οποίες φιλοξενούνται σε ένα κέντρο δεδομένων. Προτείνουμε και αξιολογούμε, μέσω της ανάλυσης δεδομένων από ένα πραγματικό κέντρο δεδομένων, την εφαρμογή μετρικών και τεχνικών από τη θεωρία ανάλυσης κοινωνικών δικτύων για τον προσδιορισμό σημαντικών εικονικών μηχανών, για παράδειγμα εικονικές μηχανές οι οποίες απαιτούν περισσότερο εύρος ζώνης σε σχέση με άλλες, και ομάδων εικονικών μηχανών που συσχετίζονται με κάποιο τρόπο μεταξύ τους. Μέσω της συγκεκριμένης προσέγγισης έχουμε τη δυνατότητα να εξάγουμε σημαντικές πληροφορίες οι οποίες μπορούν να αξιοποιηθούν για τη λήψη καλύτερων αποφάσεων σχετικά με τη διαχείριση του πολύ μεγάλου πλήθους των εικονικών μηχανών που φιλοξενούνται στα σύγχρονα κέντρα δεδομένων. Στη συνέχεια προσδιορίζουμε τρόπους με τους οποίους οι πληροφορίες παρακολούθησης που συλλέγονται από τη λειτουργία μιας δημόσιας υποδομής Υπολογιστικού Νέφους, και ιδίως από την υπηρεσία Amazon Web Services (AWS), μπορούν να χρησιμοποιηθούν με ένα αποδοτικό τρόπο προκειμένου να εξάγουμε πολύτιμες πληροφορίες, που μπορούν να αξιοποιηθούν από τους τελικούς χρήστες για την αποτελεσματικότερη διαχείριση των εικονικών πόρων τους. Πιο συγκεκριμένα, παρουσιάζουμε το σχεδιασμό και την υλοποίηση ενός εργαλείου ανοιχτού κώδικα, του SuMo, στο όποιο έχουμε υλοποίησει όλη την απαραίτητη λειτουργικότητα για τη συλλογή και ανάλυση δεδομένων παρακολούθησης από την υπηρεσία AWS. Επιπλέον, προτείνουμε ένα μηχανισμό για τη βελτιστοποίηση του κόστους και της αξιοποίησης (Cost and Utilization Optimization - CUO) των εικονικών υπολογιστικών πόρων της υπηρεσίας AWS. Ο μηχανισμός CUO χρησιμοποιεί πληροφορίες (πλήθος, ακριβή χαρακτηριστικά, ποσοστό αξιοποίησης) για τους διαθέσιμους εικονικούς πόρους ενός χρήστη και προτείνει ένα νέο (βέλτιστο) σύνολο πόρων που θα μπορούσαν να χρησιμοποιηθούν για την αποδοτικότερη εξυπηρέτηση του ίδιου φορτίου εργασίας με μειωμένο κόστος. Τέλος, παρουσιάζουμε την υλοποίηση ενός ολοκληρωμένου εργαλείου, που ονομάζουμε Mantis, για το σχεδιασμό και τη λειτουργία των μελλοντικών ευέλικτων (flex-grid) οπτικών δικτύων που υποστηρίζει επιπλέον οπτικά δίκτυα σταθερού πλέγματος τόσο μοναδικού ρυθμού μετάδοσης όσο και πολλαπλών ρυθμών μετάδοσης. Οι χρήστες έχουν τη δυνατότητα να καθορίζουν δικτυακές τοπολογίες, απαιτήσεις κίνησης, παραμέτρους για το κόστος απόκτησης και λειτουργίας των δικτυακών συσκευών, ενώ επιπλέον έχουν πρόσβαση σε αρκετούς αλγορίθμους για το σχεδιασμό, λειτουργία και αξιολόγηση διαφόρων οπτικών δικτύων. Το εργαλείο έχει σχεδιαστεί ώστε να μπορεί να λειτουργεί είτε ως υπηρεσία (Software as a Service) είτε ως κλασσική εφαρμογή (Desktop Application). Λειτουργώντας ως υπηρεσία παρέχει κλιμάκωση με βάση τις απαιτήσεις των χρηστών, αξιοποιώντας τα πλεονεκτήματα των υποδομών Υπολογιστικού Νέφους, εκτελώντας γρήγορα και αποτελεσματικά τις εργασίες των χρηστών. Για τη λειτουργία αυτή, μπορεί να χρησιμοποιεί τόσο δημόσιες υποδομές Υπολογιστικού Νέφους όπως η υπηρεσία Amazon Web Services (AWS) και η υπηρεσία της ΕΔΕΤ (~okeanos), όσο και ιδιωτικές που βασίζονται στο OpenStack. Επιπλέον, η αρθρωτή αρχιτεκτονική και η υλοποίηση των διαφόρων λειτουργικών τμημάτων επιτρέπουν την εύκολη επέκταση του εργαλείου ώστε να υποστηρίζει μελλοντικά περισσότερες υποδομές Υπολογιστικού Νέφους. / Distributed computing technologies, like grids and clouds, shape today a new environment, regarding the way tasks are executed, data are stored and retrieved, and applications are used. Though grids and desktop grids have been the focus of the research community for a long time, a shift has become evident today towards cloud and virtualization related technologies in general, which are supported by large computing factories, namely the data centers. As a result there is also a shift from the model of several powerful resources distributed at various locations in the world (as in grids) towards fewer huge data centers consisting of thousands of “simple” computers that host Virtual Machines. The research performed over the course of my PhD followed this shift, investigating algorithmic issues in the context of grids and then of clouds and developing a number of tools and applications that manage, monitor and utilize these kinds of resources. Initially, we describe the steps followed, the difficulties encountered, and the solutions provided in developing and evaluating a scheduling policy, initially implemented in a simulation environment, in the gLite grid middleware. Our focus is on a scheduling algorithm that allocates in a fair way the available resources among the requested users or jobs. During the actual implementation of this algorithm in gLite, we observed that the validity of the information used by the scheduler for its decisions affects greatly its performance. To improve the accuracy of this information, we developed an internal feedback mechanism that operates along with the scheduling algorithm. Also, a Grid computation resource cannot be shared concurrently between different users or jobs, making it difficult to provide actual fairness. For this reason we investigated the use of virtualization technology in the gLite middleware. We implement and evaluate our scheduling algorithm and the proposed mechanisms in a small gLite testbed. Next, we present a methodology, called communication-aware virtual infrastructures (COMAVI), for the concurrent migration of multiple Virtual Machines (VMs) in computing infrastructures, which aims at the optimum use of the available computational and network resources, by capturing the interdependencies between the communicating VMs. This methodology uses multiple criteria for selecting the VMs that will migrate, with different weights assigned to each of them. COMAVI also selects the computing sites where the migrating VMs will be hosted, by accounting for the way migration affects the logical (or virtual) topologies formed by the communicating VMs and viewing this selection as a logical topology reconfiguration problem. We apply COMAVI to two basic computing infrastructures that exhibit different constraints/criteria and characteristics: a grid infrastructure operating over a wide area network (WAN) and a data center infrastructure operating over a local area network (LAN). Through the presented methodology different communication-aware VM migration algorithms can be tailored to the needs of the resource provider. The algorithms presented resolve the maximum possible number of VM violations (due to computing or communication resource shortages), while tending to minimize the number of migrations performed, the induced network overhead, the logical topology reconfigurations required, and the corresponding service interruptions. We evaluate the proposed methods through simulations in realistic computing environments, and we exhibit their performance benefits. We also consider the use of social network analysis methods on communication traces, collected from Virtual Machines (VMs) located in computing infrastructures, like a data center. Our aim is to identify important VMs, for example VMs that require more bandwidth than other VMs or VMs that communicate often with other VMs. We believe that this approach can handle the large number of VMs present in computing infrastructures and their interactions in the same way social interactions of millions of people are analyzed in today’s social networks. We are interested in identifying measures that can locate these important VMs or groups of interacting VMs, missed through other usual metrics and also capture the time-dynamicity of their interactions. In our work we use real traces and evaluate the applicability of the considered methods and measures. In addition, we consider the analysis and optimization of public clouds. For this reason, we identify important algorithmic operations that should be part of a cloud analysis and optimization tool, including resource profiling, performance spike detection and prediction, resource resizing, and others, and we investigate ways in which the collected monitoring information can be processed towards these purposes. The analyzed information is valuable since it can drive important virtual resource management decisions. We also present an open-source tool we developed, called SuMo, which contains the necessary functionalities for collecting monitoring data from Amazon Web Services (AWS), analyzing them and providing resource optimization suggestions. We also present a Cost and Utilization Optimization (CUO) mechanism for optimizing the cost and the utilization of a set of running Amazon EC2 instances, which is formulated as an Integer Linear Programming (ILP) problem. This CUO mechanism receives information regarding the current set of instances used (their number, type, utilization) and proposes a new set of instances for serving the same load, so as to minimize cost and maximize utilization and performance efficiency. Finally, we present a network planning and operation tool, called Mantis, for designing the next generation optical networks, supporting both flexible and mixed line rate WDM networks. Through Mantis, the user is able to define the network topology, current and forecasted traffic matrices, CAPEX/OPEX parameters, set up basic configuration parameters, and use a library of algorithms to plan, operate, or run what-if scenarios for an optical network of interest. Mantis is designed to be deployed either as a cloud service or as a desktop application. Using the cloud infrastructures features Mantis can scale according to the user demands, executing fast and efficiently the scenarios requested. Mantis supports different cloud platforms either public such as Amazon Elastic Compute Cloud (Amazon EC2) and ~okeanos the GRNET’s cloud service or private based on OpenStack, while its modular architecture allows other cloud infrastructures to be adopted in the future with minimum effort. Υποδομές πλέγματος Ενδιάμεσο λογισμικό 004.36 Grid networks Cloud computing Glite Optical networks Amazon web services Analysis public clouds Virtual machine migration

1

Page generated in 0.0803 seconds