  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Managing Microservices with a Service Mesh : An implementation of a service mesh with Kubernetes and Istio

Mara Jösch, Ronja January 2020
The adoption of microservices makes it easier to extend computer systems in size, complexity, and distribution. Alongside these benefits, microservices introduce the possibility of partial failures. Besides focusing on the business logic, developers have to tackle cross-cutting concerns of service-to-service communication, which now determines an application's reliability and performance. Currently, developers use libraries embedded into the application code to address these concerns. However, this increases the complexity of the code and requires the maintenance and management of various libraries. The service mesh is a relatively new technology that may enable developers to stay focused on their business logic. This thesis investigates one of the available service meshes, Istio, to identify its benefits and limitations. The main benefits found are that Istio adds resilience and security, allows features that are currently difficult to implement, and enables a cleaner structure and a standard implementation of features within and across teams. The drawbacks are that it decreases performance by adding CPU usage, memory usage, and latency. Furthermore, the main disadvantage of Istio is its limited testing tools. Based on the findings, the Webcore Infra team of the company can make a more informed decision on whether or not to introduce Istio.
12

An I/O-aware scheduler for containerized data-intensive HPC tasks in Kubernetes-based heterogeneous clusters / En I/O-medveten schemaläggare för containeriserade dataintensiva HPC-uppgifter i Kubernetes-baserade heterogena kluster

Wu, Zheyun January 2022
Cloud-native is a new computing paradigm that takes advantage of key characteristics of cloud computing, where applications are packaged as containers. The lifecycle of containerized applications is typically managed by container orchestration tools such as Kubernetes, the most popular container orchestration system, which automates the deployment, maintenance, and scaling of containers. Kubernetes has become the de facto standard for container orchestrators in the cloud-native era. Meanwhile, with the increasing demand for High-Performance Computing (HPC) over the past years, containerization is being adopted by the HPC community, and various processors and special-purpose hardware are used to accelerate HPC applications. The architecture of cloud systems has been gradually shifting from homogeneous to heterogeneous, with different processors and hardware accelerators, which raises a new challenge: how to exploit different computing resources efficiently? Much effort has been devoted to improving the utilization of computing resources in heterogeneous systems from the perspective of task scheduling, which aims to match different types of tasks to the optimal computing devices for execution. Existing proposals do not take into account the variation in I/O performance between heterogeneous nodes when scheduling tasks. However, I/O performance is an important but often overlooked factor that can become a performance bottleneck for HPC tasks. This thesis proposes an I/O-aware scheduler named cmio-scheduler for containerized data-intensive HPC tasks in Kubernetes-based heterogeneous clusters, which is aware of the I/O throughput of compute nodes when making task placement decisions. In principle, cmio-scheduler assigns data-intensive HPC tasks to the node that fulfills the tasks' requirements for CPU, memory, and GPU and has the highest I/O throughput. The experimental results demonstrate that cmio-scheduler reduces the execution time by 19.32% for the overall workflow and 15.125% for parallelizable tasks on average.
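The placement rule stated in this abstract (keep only the nodes that satisfy a task's CPU, memory, and GPU requirements, then prefer the one with the highest I/O throughput) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not cmio-scheduler's implementation: the `Node` and `Task` types, their field names, and the throughput unit are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    # All fields are hypothetical; a real scheduler would read them from the cluster.
    name: str
    free_cpu: float             # free CPU cores
    free_memory_gib: float      # free memory in GiB
    free_gpus: int              # free GPU devices
    io_throughput_mbps: float   # measured I/O throughput (assumed unit: MB/s)


@dataclass(frozen=True)
class Task:
    cpu: float
    memory_gib: float
    gpus: int


def pick_node(task: Task, nodes: List[Node]) -> Optional[Node]:
    """Filter nodes that fit the task, then choose the highest I/O throughput."""
    feasible = [
        n for n in nodes
        if n.free_cpu >= task.cpu
        and n.free_memory_gib >= task.memory_gib
        and n.free_gpus >= task.gpus
    ]
    if not feasible:
        return None  # no node can host the task right now
    return max(feasible, key=lambda n: n.io_throughput_mbps)
```

A node with very fast I/O but too few free cores or GPUs is filtered out before throughput is compared, which mirrors the "fulfills the requirements and has the highest I/O throughput" wording of the abstract.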
13

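Security als komplexe Anforderung an agile Softwareentwicklung: Erarbeitung eines Anwendungsmusters zur Betrachtung der IT-Security in agilen Entwickungszyklen anhand eines metadatengestützen Testing-Verfahrens / Security as a complex requirement for agile software development: developing an application pattern for addressing IT security in agile development cycles using a metadata-driven testing procedure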

Matkowitz, Max 26 April 2022
With its principles, agile software development stands for open collaboration, lightweight frameworks, and rapid adaptation to change. These characteristics made it possible to address problems and dissatisfaction in traditional software development. On the IT security side, however, a variety of challenges have emerged. Static Application Security Testing (SAST) and Dynamic Application Security Testing (DAST) provided first approaches to a solution. They do not, however, offer a satisfactory way to integrate security testing into agile software development, particularly in a cloud context. This thesis addresses the following question: how can a practical concept for examining the security of application code, containers, and clusters within agile development cycles be realized when a metadata-based testing procedure is to be used? The goal thus divides into the design and realization of two aspects: metadata-based security testing of code, containers, and clusters, and the development workflow for applying the testing procedure. A web development case study was used for the qualitative evaluation of a prototype implemented with Python and GitLab. After the general conditions were explained, concrete scenarios of a development process were run through. The qualitative investigation showed successful detection of vulnerabilities in different categories (e.g. Broken Access Control). Overall, the procedure was observed to embed well into the exemplary development workflow. The effort of maintaining the metadata is not negligible, but it should not be weighted too heavily, because the metadata follow the established OpenAPI schema. This holds in particular when the metadata generate added value (feasibility, speed, convenience).
Contents: 1 Introduction; 1.1 Problem description; 1.2 Objectives; 1.3 State of the art and development methods; 1.4 Methodology; 2 Theoretical and technical foundations; 2.1 Foundations of agile software development; 2.2 GitLab; 2.3 Foundations of metadata-driven security testing; 3 Design; 3.1 Low-level model (test procedure); 3.2 Synthesis of the example test cases; 3.3 Description file; 3.4 High-level model (development workflow); 4 Implementation; 4.1 Test procedure; 4.2 CI/CD pipeline; 4.3 Case study of agile software development; 5 Evaluation and outlook
14

High-Performing Cloud Native SW Using Key-Value Storage or Database for Externalized States / Högpresterande moln-nativ mjukvara med användning av nyckelvärdeslagring eller databas för externaliserade tillstånd

Sikh, Ahmed, Axén, Joel January 2023
To meet the demands of 5G and what comes after, telecommunications companies will need to replace their old embedded systems with new technology. One solution could be to develop cloud-native applications, which offer many benefits but are less reliable than embedded systems. Having the different units in the 5G system store their state, or operational context, in cloud-based databases could reduce downtime when processes fail, but different database systems have their own advantages and disadvantages. Thus, the choice of implementation must be carefully considered. This study primarily aims to create a simulator that can measure latency, that is, the time it takes to write or read dummy data to or from one of two different kinds of databases. Its secondary aim is to produce use cases that mimic situations that a database for state data would need to handle, and to collect measurements from them with the help of the simulator. The simulator was implemented in C++17 and contains a simulator object and separate database clients. The actors representing the units interacting with a 5G network were created by the clients, and their state data was stored in either Redis or PostgreSQL databases. Various use cases were designed, following instructions from Ericsson, to simulate real-life scenarios and to measure latencies. Quantitative data analysis was performed on the collected data to compare the performance of the Redis and PostgreSQL databases in the different use cases. The study found that Redis was on average the fastest and that its latency was largely the same regardless of data size, while PostgreSQL's latencies, and thus the differences between the databases, varied more depending on the scenario. The results show that, of the two databases, Redis operates more consistently and predictably, which may partly be explained by the fact that it is mainly RAM-based, while PostgreSQL is mainly disk-based. Future work could involve testing the databases under higher workloads, exploring the impact of running simulations in environments with reduced RAM, where Redis cannot use memory to its full advantage, and analyzing more latency figures by creating and running new use cases. Future work could also include an investigation of the effect of Redis database crashes. Moreover, the simulator implementation allows for changing to other types of databases.
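The measurement idea in this abstract (time each write and read of dummy state data against a database client, then compare the distributions) can be illustrated with a short sketch. The thesis simulator was written in C++17 against Redis and PostgreSQL; the Python code below is only an analogy under stated assumptions. `InMemoryStore`, `measure`, the key format, and the reported statistics are all hypothetical, and the in-memory dictionary stands in for a real database client.

```python
import statistics
import time
from typing import Callable, Dict, List


class InMemoryStore:
    """Hypothetical stand-in for a database client (not Redis or PostgreSQL)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def write(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def read(self, key: str) -> bytes:
        return self._data[key]


def time_call(fn: Callable[[], object]) -> float:
    """Return the wall-clock duration of one call, in seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def measure(store: InMemoryStore, payload_size: int, repetitions: int) -> Dict[str, float]:
    """Write and read dummy state `repetitions` times and summarize the latencies."""
    payload = b"x" * payload_size  # dummy state data of the requested size
    writes: List[float] = []
    reads: List[float] = []
    for i in range(repetitions):
        key = f"actor:{i}"  # one key per simulated actor (assumed naming)
        writes.append(time_call(lambda: store.write(key, payload)))
        reads.append(time_call(lambda: store.read(key)))
    return {
        "write_mean_s": statistics.mean(writes),
        "write_max_s": max(writes),
        "read_mean_s": statistics.mean(reads),
        "read_max_s": max(reads),
    }
```

Running `measure` with several payload sizes against clients for different stores would give the kind of per-scenario latency comparison the abstract describes; a real harness would also need warm-up runs and network round trips, which this sketch omits.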
15

Increasing Reproducibility Through Provenance, Transparency and Reusability in a Cloud-Native Application for Collaborative Machine Learning

Ekström Hagevall, Adam, Wikström, Carl January 2021
The purpose of this thesis paper was to develop new features in the cloud-native and open-source machine learning platform STACKn, aiming to strengthen the platform's support for conducting reproducible machine learning experiments through provenance, transparency and reusability. Adhering to the definition of reproducibility as the ability of independent researchers to exactly duplicate scientific results with the same material as in the original experiment, two concepts were explored as alternatives for this specific goal: 1) Increased support for standardized textual documentation of machine learning models and their corresponding datasets; and 2) Increased support for provenance to track the lineage of machine learning models by making code, data and metadata readily available and stored for future reference. We set out to investigate to what degree these features could increase reproducibility in STACKn, both when used in isolation and when combined.  When these features had been implemented through an exhaustive software engineering process, an evaluation of the implemented features was conducted to quantify the degree of reproducibility that STACKn supports. The evaluation showed that the implemented features, especially provenance features, substantially increase the possibilities to conduct reproducible experiments in STACKn, as opposed to when none of the developed features are used. While the employed evaluation method was not entirely objective, these features are clearly a good first initiative in meeting current recommendations and guidelines on how computational science can be made reproducible.
16

Cloud-native storage solutions for Kubernetes : A performance comparison

Andersson, Filip January 2023
Kubernetes is a container orchestration system that has been rising in popularity in recent years. The modular nature of Kubernetes allows the use of different storage solutions, and for cloud environments, cloud-native distributed storage solutions may be attractive due to their redundant nature. There are many tools for cloud-native distributed storage available on the market today, with differing features and performance. Choosing the correct one for an organisation can be difficult. Organisations utilising Kubernetes in cloud environments would like to be as performance-efficient as possible to save on costs and resources. This study aims to offer a benchmark and analysis of some of the most popular tools, to help organisations choose the ‘best’ solution for their operational needs from a performance perspective. The benchmarks compare three cloud-native distributed storage solutions, OpenEBS, Portworx, and Rook-Ceph, on both Amazon Elastic Kubernetes Service (EKS) and Azure Kubernetes Service (AKS). For a baseline comparison, the study also benchmarks the cloud providers' own solutions: Azure Disk Storage and Amazon Elastic Block Store. The study compares these solutions on three key metrics, bandwidth, latency, and IOPS, for both read and write performance. / There is other digital material (e.g. film, image, or audio files) or models/artifacts belonging to the thesis that needs to be archived.
17

Scalable Architecture for Automating Machine Learning Model Monitoring

de la Rúa Martínez, Javier January 2020
In recent years, due to the advent of more sophisticated tools for exploratory data analysis, data management, Machine Learning (ML) model training, and model serving in production, the concept of MLOps has gained popularity. As an effort to bring DevOps processes to the ML lifecycle, MLOps aims at more automation in the execution of diverse and repetitive tasks along the cycle and at smoother interoperability between the teams and tools involved. In this context, the main cloud providers have built their own ML platforms [4, 34, 61], offered as services in their cloud solutions. Moreover, multiple frameworks have emerged to solve concrete problems such as data testing, data labelling, distributed training, or prediction interpretability, and new monitoring approaches have been proposed [32, 33, 65]. Among all the stages in the ML lifecycle, model monitoring is one of the most commonly overlooked despite its relevance. Recently, cloud providers have presented their own tools for use within their platforms [4, 61], while work is ongoing to integrate existing frameworks [72] into open-source model serving solutions [38]. Most of these frameworks are either built as an extension of an existing platform (i.e., they lack portability), follow a scheduled batch processing approach at a minimum rate of hours, or present limitations for certain outlier and drift detection algorithms due to the architecture of the platform in which they are integrated. In this work, a scalable, automated, cloud-native architecture is designed and evaluated for ML model monitoring in a streaming approach. An experiment conducted on a 7-node cluster with 250,000 requests at different concurrency rates shows maximum latencies of 5.9, 29.92, and 30.86 seconds after request time for 75% of distance-based outlier detection, windowed statistics, and distribution-based data drift detection, respectively, using windows of 15 seconds in length and a watermark delay of 6 seconds.
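The "windows of 15 seconds" and "6 seconds of watermark delay" mentioned in this abstract are standard event-time streaming parameters. The sketch below shows one common interpretation in plain Python: tumbling windows whose results are emitted once the watermark (the largest event time seen minus the delay) passes the window end, with later arrivals for that window dropped. It is a generic illustration, not the architecture evaluated in the thesis; the function name, the per-window mean as the statistic, and the event format are assumptions.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

WINDOW_S = 15.0          # window length from the abstract
WATERMARK_DELAY_S = 6.0  # watermark delay from the abstract


def windowed_means(
    events: Iterable[Tuple[float, float]],
) -> Iterator[Tuple[float, float]]:
    """Yield (window_start, mean) once the watermark passes the window end.

    `events` holds (event_time_seconds, value) pairs in arrival order,
    which may differ from event-time order.
    """
    open_windows: Dict[float, List[float]] = {}
    watermark = float("-inf")
    for event_time, value in events:
        start = (event_time // WINDOW_S) * WINDOW_S
        if start + WINDOW_S <= watermark:
            continue  # too late: this window has already been emitted
        open_windows.setdefault(start, []).append(value)
        # The watermark trails the largest event time seen by the delay.
        watermark = max(watermark, event_time - WATERMARK_DELAY_S)
        ready = sorted(s for s in open_windows if s + WINDOW_S <= watermark)
        for s in ready:
            values = open_windows.pop(s)
            yield s, sum(values) / len(values)
```

Windows still open when the input ends are not flushed in this sketch. The delay is the trade-off the reported latencies reflect: a longer watermark delay tolerates more out-of-order data but postpones every window's result by at least that long.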
18

Predictive vertical CPU autoscaling in Kubernetes based on time-series forecasting with Holt-Winters exponential smoothing and long short-term memory / Prediktiv vertikal CPU-autoskalning i Kubernetes baserat på tidsserieprediktion med Holt-Winters exponentiell utjämning och långt korttidsminne

Wang, Thomas January 2021
Private and public clouds require users to specify requests for resources such as CPU and memory (RAM) to be provisioned for their applications. The values of these requests do not necessarily relate to the application's run-time requirements, but only help the cloud infrastructure resource manager to map requested virtual resources to physical resources. If an application exceeds these values, it might be throttled or even terminated. Consequently, requested values are often overestimated, resulting in poor resource utilization in the cloud infrastructure. Autoscaling is a technique used to overcome these problems. In this research, we formulated two new predictive CPU autoscaling strategies for Kubernetes containerized applications, using time-series analysis based on Holt-Winters exponential smoothing and long short-term memory (LSTM) artificial recurrent neural networks. The two approaches were analyzed, and their performance was compared to that of the default Kubernetes Vertical Pod Autoscaler (VPA). Efficiency was evaluated in terms of CPU resource wastage and the percentage and amount of insufficient CPU, for container workloads from Alibaba Cluster Trace 2018 and others. In our experiments, we observed that VPA tended to perform poorly on workloads that change periodically. Our results showed that, compared to VPA, predictive methods based on Holt-Winters exponential smoothing (HW) and LSTM can decrease CPU wastage by over 40% while avoiding CPU insufficiency for various CPU workloads. Furthermore, LSTM was shown to generate more stable predictions than HW, which allowed for more robust scaling decisions.
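Holt-Winters exponential smoothing, one of the two forecasting methods named in this abstract, tracks a level, a trend, and a repeating seasonal component, which is why it suits CPU usage that changes periodically. The sketch below implements the textbook additive variant in plain Python. It is not the thesis's autoscaler: the function name, the simple initialization from the first two seasons, and the smoothing parameters are assumptions, and turning a forecast into a CPU request is left out.

```python
from typing import List, Sequence


def holt_winters_additive(
    series: Sequence[float],
    season_len: int,
    alpha: float,
    beta: float,
    gamma: float,
    horizon: int,
) -> List[float]:
    """Forecast `horizon` future points with additive Holt-Winters smoothing."""
    m = season_len
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Simple initialization (an assumption; other schemes exist).
    level = sum(series[:m]) / m
    trend = sum(series[m + i] - series[i] for i in range(m)) / (m * m)
    seasonal = [series[i] - level for i in range(m)]

    for t in range(m, len(series)):
        y = series[t]
        prev_level = level
        s = seasonal[t % m]  # seasonal estimate from one season ago
        level = alpha * (y - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (y - level) + (1 - gamma) * s

    n = len(series)
    return [
        level + h * trend + seasonal[(n - 1 + h) % m]
        for h in range(1, horizon + 1)
    ]
```

For a usage series that repeats exactly every `season_len` samples with no trend, the forecast reproduces the next cycle. A predictive autoscaler could, for example, size the CPU request from the peak of such a forecast, but that policy is an illustration and not something the abstract specifies.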
19

Scaling cloud-native Apache Spark on Kubernetes for workloads in external storages

Mrowczynski, Piotr January 2018
The CERN Scalable Analytics Section currently offers shared YARN clusters to its users for monitoring, security, and experiment operations, as well as to other groups interested in processing data with Big Data techniques. YARN clusters with data in HDFS are difficult to provision and complex to manage and resize. This imposes new data and operational challenges in satisfying future physics data processing requirements. As of 2018, over 250 PB of physics data were stored in CERN's mass storage system, called EOS. The Hadoop-XRootD Connector allows data stored in CERN EOS to be read over the network. CERN's on-premise private cloud, based on OpenStack, allows compute resources to be provisioned on demand. The emergence of technologies such as Containers-as-a-Service in OpenStack Magnum, and support for Kubernetes as a native resource scheduler for Apache Spark, gives the opportunity to increase workflow reproducibility across different compute infrastructures through the use of containers, reduce the operational effort of maintaining computing clusters, increase resource utilization via elastic cloud resource provisioning, and share resources between different types of workloads with quotas and namespaces. This trades off these operational features against the data locality known from traditional systems such as Spark/YARN with data in HDFS. In the proposed architecture of cloud-managed Spark/Kubernetes with data stored in external storage systems such as EOS, Ceph S3, or Kafka, physicists and other CERN communities can spawn and resize Spark/Kubernetes clusters on demand, with fine-grained control of Spark applications. This work focuses on the Kubernetes CRD Operator for idiomatically defining and running Apache Spark applications on Kubernetes, with automated scheduling and on-failure resubmission of long-running applications. The Spark Operator was introduced with the design principle of making Spark on Kubernetes easy to deploy, scale, and maintain, with usability similar to Spark/YARN. An analysis of concerns related to non-cluster-local persistent storage and memory handling has been performed. The scalability of the architecture has been evaluated on the use case of a sustained workload, physics data reduction, with files in ROOT format stored in EOS. A series of microbenchmarks has been performed to evaluate the properties of the architecture, such as performance compared to the state-of-the-art Spark/YARN cluster at CERN and scalability for long-running data processing jobs. Finally, Spark on Kubernetes workload use cases have been classified, and possible bottlenecks and requirements identified.
20

Analysis Of Fastlane For Digitalization Through Low-Code ML Platforms

Raghavendran, Krishnaraj January 2022
Even a professional photographer sometimes uses the automatic default settings that come with the camera to take a photo. One can debate the quality of the outcome in manual versus automatic mode. Unless we have a professional level of competence in photography, keep our skills and knowledge up to date with the latest market trends, and have enough time to try out different settings manually, it is worthwhile to use auto mode. Camera manufacturers, after several iterations of testing, arrive at a list of ideal parameter values, which are embedded as factory defaults and applied when we choose auto mode. Non-professional photographers and amateurs are recommended to use the camera's auto mode so as not to miss the moment. Similarly, when developing machine learning models, unless we have the required data engineering and ML development competence and the time to train and test different ML models and tune different hyperparameter settings, it is worth trying the automatic machine learning (AutoML) features provided off the shelf by the cloud-based and cloud-agnostic ML platforms. This thesis evaluates the possibility of generating machine learning models automatically with the no-code/low-code experience provided by GCP, AWS, Azure, and Databricks. We compare the ML platforms on automatic ML model generation and present the results. The thesis also covers the lessons learnt from developing automatic ML models from a sample dataset on all four platforms. Finally, we outline the viewpoints of machine learning subject matter experts on using automatic machine learning models. From this research, we found that automatic machine learning can come in handy for many off-the-shelf analytical use cases. It can be highly beneficial for time-constrained projects, when resource competence or staffing is a bottleneck, and even when competent data scientists want a second opinion or want to compare AutoML results with a custom-built ML model.
