Global ETD Search

11	Metrics, Models and Methodologies for Energy-Proportional Computing
Subramaniam, Balaji (21 August 2015)
Massive data centers housing thousands of computing nodes have become commonplace in enterprise computing, and the power consumption of such data centers is growing at an unprecedented rate. Exacerbating such costs, data centers are often over-provisioned to avoid costly outages associated with the potential overloading of electrical circuitry. However, such over-provisioning is often unnecessary, since a data center rarely operates at its maximum capacity. It is imperative that we realize effective strategies to control the power consumption of servers and improve the energy efficiency of data centers. Adding to the problem is the inability of servers to exhibit energy proportionality, which diminishes the overall energy efficiency of the data center. In this dissertation, we therefore investigate whether it is possible to achieve energy proportionality at the server and cluster levels through efficient power and resource provisioning. To this end, we provide a thorough analysis of energy proportionality at both levels and offer insight into the power-saving opportunities and the mechanisms that improve energy proportionality.
Specifically, we make the following contributions at the server level using enterprise-class workloads. We (1) analyze the average power consumption of the full system and its subsystems and describe the energy proportionality of these components; (2) characterize the instantaneous power profile of enterprise-class workloads using on-chip energy meters; (3) design a runtime system, based on a load prediction model and an optimization framework, that sets appropriate power constraints to meet specific performance targets; and (4) present the effects of this runtime system on the energy proportionality, average power, performance and instantaneous power consumption of enterprise applications.
We then make the following contributions at the cluster level. Using data serving, web search and data caching as representative workloads, we first analyze the component-level power distribution on a cluster. Second, we characterize how these workloads utilize the cluster. Third, we analyze the potential of power provisioning techniques (i.e., active low-power, turbo and idle low-power modes) to improve energy proportionality. We then describe the ability of active low-power modes to provide trade-offs between power and latency. Finally, we compare and contrast power provisioning and resource provisioning techniques. This thesis sheds light on mechanisms for tuning the power provisioned to a system under strict performance targets, and on opportunities to improve energy proportionality and instantaneous power consumption through efficient power and resource provisioning at the server and cluster levels. / Ph. D.
Keywords: Energy Proportionality; Resource Provisioning; Power Provisioning; Running Average Power Limit (RAPL); Scale-Out Workloads; Enterprise Workloads; Green Computing
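One way to make the notion of energy proportionality concrete is an area-based metric, sketched below. The abstract does not state which metric the dissertation uses, so the formula (one minus the normalized gap between the measured power curve and an ideal linear curve), the function names and the sample readings are all assumptions made for illustration.

```python
from typing import List, Tuple

Curve = List[Tuple[float, float]]  # (utilization in [0, 1], power in watts)


def area_under_curve(points: Curve) -> float:
    """Trapezoidal area under a piecewise-linear (utilization, power) curve."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def energy_proportionality(points: Curve) -> float:
    """Assumed metric: EP = 1 - (A_actual - A_ideal) / A_ideal.

    A_ideal is the area under a line from 0 W at idle to peak power at
    full load. EP is 1.0 for a perfectly proportional server and 0.0 for
    one that draws peak power at every load level.
    """
    pts = sorted(points)
    if len(pts) < 2 or pts[0][0] != 0.0 or pts[-1][0] != 1.0:
        raise ValueError("samples must span utilization 0.0 to 1.0")
    peak_power = pts[-1][1]
    if peak_power <= 0.0:
        raise ValueError("peak power must be positive")
    ideal_area = 0.5 * peak_power
    actual_area = area_under_curve(pts)
    return 1.0 - (actual_area - ideal_area) / ideal_area


# Hypothetical readings: 50 W at idle rising linearly to 100 W at full load.
SAMPLE_CURVE: Curve = [(0.0, 50.0), (0.5, 75.0), (1.0, 100.0)]
```

Under this definition a server whose idle power is half its peak power scores 0.5, which illustrates why high idle power limits proportionality.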
12	Service-based applications provisioning in the cloud / Déploiement des applications à base de services dans le cloud
Yangui, Sami (02 October 2014)
Cloud Computing is an emerging paradigm for delivering and consuming large-scale distributed IT services that run at geographically dispersed locations. It is increasingly used for hosting and executing applications in general and service-based applications in particular. Service-based applications are described according to Service Oriented Architecture (SOA) and consist of an assembly of elementary, heterogeneous services, linked using appropriate service composition specifications such as Service Component Architecture (SCA) or Business Process Execution Language (BPEL). Provisioning an application in the Cloud consists of (1) allocating the resources it needs from a Cloud provider, (2) deploying its sources on the allocated resources and (3) starting the application. However, existing Cloud solutions are limited to static programming frameworks and runtimes. They cannot always meet application requirements, especially when the components are as heterogeneous as those of service-based applications. To address these issues, application provisioning mechanisms in the Cloud must be reconsidered. They must be flexible enough to support strongly heterogeneous application components without requiring any modification or adaptation on the Cloud provider side, and they should also support automatic provisioning. If the application to deploy is mono-block (e.g. a one-tier application), provisioning is performed automatically and in the same way, through generic operations, whatever the target Cloud provider. If the application is service-based, developers must be given appropriate features so that they can themselves dynamically define and create the resources its components require, using generic operations, before deploying it on the target provider.
In this work, we propose an approach called SPD to provision service-based applications in the Cloud. The SPD approach consists of three steps: (1) slicing the service-based application into a set of elementary and autonomous services, (2) packaging the services in micro-containers and (3) deploying the micro-containers in the Cloud. Slicing is carried out by formal algorithms that we have defined, and we establish proofs that they preserve application semantics. For packaging, we built prototypes of service containers that provide only the minimal functionality needed to host and execute services and manage their life cycle. For deployment, we treat two cases: deployment on a Cloud infrastructure (IaaS) and deployment on a Cloud platform (PaaS). To automate deployment, we defined (i) a unified description model, based on the Open Cloud Computing Interface (OCCI) standard, that represents an application and its required resources independently of the target platform, and (ii) a generic application provisioning and management API, called COAPS, that implements this model.
Keywords: Cloud Computing; Cloud resource modeling; Cloud resource provisioning; Service-based application; Service micro-container
13	Contrôle des applications fondé sur la qualité de service pour les plate-formes logicielles dématérialisées (Cloud) / Control of applications based on quality of service in Cloud software platforms
Li, Ge (21 July 2015)
Cloud computing is a new computing model in which infrastructure, applications and data are moved from local machines to the Internet and provided as services. Cloud users, such as application owners, can save substantially through the elasticity of cloud services, that is, their “pay as you go” and on-demand characteristics. The goal of this thesis is to manage the Quality of Service (QoS) of applications running in cloud environments.
Cloud services give application owners great flexibility to assign a “suitable” amount of resources as needs change, for example because of a fluctuating request rate. What counts as “suitable” must be clearly documented in a Service Level Agreement (SLA) when the application is hosted by a third party, such as a Platform as a Service (PaaS) provider. In this thesis, we propose and formally describe PSLA, an SLA description language for PaaS. PSLA is based on WS-Agreement, which is extensible and widely accepted as an SLA description language.
Negotiations are unavoidable before an SLA is signed. During them, the PaaS provider must evaluate whether the SLA drafts are feasible. These evaluations, for instance of response time or of the maximum throughput of served requests, rest on an analysis of the behavior of the application deployed in the cloud infrastructure, so application-dependent analysis such as benchmarking is needed. Benchmarks are relatively costly, and a precise feasibility study usually requires a large number of them. We propose a benchmark-based SLA feasibility study method that evaluates whether an SLA expressed in PSLA, including its QoS targets, resource constraints, cost constraints and workload constraints, can be achieved. This method trades off the accuracy of the feasibility study against benchmark costs. The intermediate results of this process serve as the initial workload-resource mapping model of our runtime control method.
When an application runs in a cloud infrastructure, the scalability of the infrastructure allows resources to be allocated and released as needs change. We call these resource provisioning activities runtime control. We propose the Runtime Control method based on Schedule, REactive and PROactive methods (RCSREPRO). For the majority of applications running in the cloud, changing needs are mainly caused by fluctuating workload. Detailed workload information, for example the request arrival rates at scheduled points in time, is difficult to know before the application runs, and the workload information listed in PSLA is too coarse to yield a well-fitted resource provisioning schedule ahead of time. Runtime control decisions must therefore be made in real time. Since resource provisioning actions usually take several minutes, RCSREPRO performs proactive runtime control: it predicts future needs and assigns resources in advance so that they are ready when needed. Proactive runtime control thus involves two problems, workload prediction and workload-resource mapping. The workload-resource mapping model, initially derived from the benchmarks of the SLA feasibility study, is continuously improved through feedback at runtime, which increases the accuracy of the control.
To sum up, we contribute to the QoS management of applications running in the cloud in three respects: the creation of PSLA, a PaaS-level SLA description language; a benchmark-based SLA feasibility study method; and a runtime control method, RCSREPRO, that ensures the SLA is met while the application runs. The work described in this thesis was carried out in the context of the FSN OpenCloudware project (www.opencloudware.org) and was funded in part by it.
Keywords: Platform as a Service (PaaS); Quality of service; Service Level Agreement (SLA); SLA feasibility study; Resource provisioning; Runtime control; 004
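The proactive control loop described in this abstract (predict the workload, map it to resources, refine the mapping from runtime feedback) can be sketched as follows. This is not RCSREPRO itself: the class, its parameters, the naive one-step extrapolation and the exponential-moving-average feedback rule are all hypothetical choices made only to illustrate the idea.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ProactiveController:
    """Illustrative proactive scaler; every name here is an assumption."""

    capacity_per_instance: float  # requests/s one instance sustains (initially from benchmarks)
    smoothing: float = 0.2        # weight given to each new runtime measurement
    min_instances: int = 1

    def predict_rate(self, history: Sequence[float]) -> float:
        """Naive forecast: extend the last observed change one step ahead."""
        if not history:
            raise ValueError("history must contain at least one sample")
        if len(history) == 1:
            return history[-1]
        return max(0.0, history[-1] + (history[-1] - history[-2]))

    def instances_for(self, rate: float) -> int:
        """Workload-resource mapping: instances needed for a request rate."""
        needed = math.ceil(rate / self.capacity_per_instance)
        return max(self.min_instances, needed)

    def plan(self, history: Sequence[float]) -> int:
        """Instances to request now so they are ready for the next interval."""
        return self.instances_for(self.predict_rate(history))

    def update_mapping(self, measured_capacity: float) -> None:
        """Feedback step: blend a runtime capacity measurement into the model."""
        if measured_capacity <= 0.0:
            raise ValueError("measured capacity must be positive")
        self.capacity_per_instance = (
            (1.0 - self.smoothing) * self.capacity_per_instance
            + self.smoothing * measured_capacity
        )
```

In this sketch, a controller that starts from a benchmarked capacity of 100 requests/s plans 2 instances for a load rising from 100 to 150 requests/s. If runtime measurements later show a lower per-instance capacity, `update_mapping` raises the plan for the same forecast.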
14	Predictive vertical CPU autoscaling in Kubernetes based on time-series forecasting with Holt-Winters exponential smoothing and long short-term memory / Prediktiv vertikal CPU-autoskalning i Kubernetes baserat på tidsserieprediktion med Holt-Winters exponentiell utjämning och långt korttidsminne
Wang, Thomas (January 2021)
Private and public clouds require users to specify requests for resources such as CPU and memory (RAM) to be provisioned for their applications. The values of these requests do not necessarily relate to the application’s run-time requirements; they only help the cloud infrastructure resource manager map requested virtual resources to physical resources. If an application exceeds these values, it might be throttled or even terminated. To avoid such disruptions, requested values are often overestimated, resulting in poor resource utilization in the cloud infrastructure. Autoscaling is a technique used to overcome these problems. In this research, we formulated two new predictive CPU autoscaling strategies for Kubernetes containerized applications using time-series analysis, one based on Holt-Winters exponential smoothing (HW) and one on long short-term memory (LSTM) recurrent neural networks. The two approaches were analyzed, and their performance was compared with that of the default Kubernetes Vertical Pod Autoscaler (VPA). Efficiency was evaluated in terms of wasted CPU resources and of the percentage and amount of insufficient CPU, for container workloads from Alibaba Cluster Trace 2018 and other sources. In our experiments, VPA tended to perform poorly on workloads that change periodically. Our results showed that, compared to VPA, the predictive methods based on HW and LSTM can decrease CPU wastage by over 40% while avoiding CPU insufficiency for various CPU workloads. Furthermore, LSTM generated more stable predictions than HW, which allowed for more robust scaling decisions.
Keywords: Kubernetes; Docker; Container; Cloud Native; Cloud Computing; Resource Provisioning; Autoscaling; Predictive scaling; CPU Usage; Seasonality; Exponential Smoothing; Long short-term memory; Time-series Analysis; Computer and Information Sciences
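To illustrate the Holt-Winters side of this approach, the sketch below implements additive triple exponential smoothing in plain Python and derives a CPU request from the forecast. It is not the thesis implementation: the initialization scheme, the smoothing parameters, the `recommend_cpu_request` helper and its headroom factor are assumptions, and no Kubernetes or VPA API is involved.

```python
from typing import List, Sequence


def holt_winters_additive(series: Sequence[float], season_len: int,
                          alpha: float, beta: float, gamma: float,
                          horizon: int) -> List[float]:
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing."""
    m = season_len
    if m < 1 or len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Simple initialization from the first two seasons (one of several common schemes).
    level = sum(series[:m]) / m
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / (m * m)
    seasonals = [series[i] - level for i in range(m)]

    for t, value in enumerate(series):
        season = seasonals[t % m]
        prev_level = level
        # Level: deseasonalized observation blended with the previous projection.
        level = alpha * (value - season) + (1 - alpha) * (prev_level + trend)
        # Trend: smoothed change in level.
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Seasonal component: smoothed deviation of the observation from the level.
        seasonals[t % m] = gamma * (value - level) + (1 - gamma) * season

    n = len(series)
    return [level + h * trend + seasonals[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


def recommend_cpu_request(forecast: Sequence[float], headroom: float = 0.1) -> float:
    """Hypothetical policy: peak forecast usage plus a relative safety margin."""
    if not forecast:
        raise ValueError("forecast must not be empty")
    return max(forecast) * (1.0 + headroom)
```

For a usage trace that repeats every `season_len` samples, the forecast reproduces the seasonal pattern, so the request can follow periodic peaks instead of being fixed at a single static value.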
