161 |
Val av hårdvara för cybersäker kommunikation på järnvägen / Hardware Selection for Cybersecure Communication on Railways
Hakkarainen, Mikko; Holmström, Linus. January 2024
Due to the increasing digitalization of the railway sector, the number of digital connections is also rising. This allows hostile actors to remotely affect the operational functioning and personal safety of the railway through unprotected connections. The purpose of this work is therefore to identify hardware solutions that enhance the cybersecurity of communication between interlocking computers and trackside objects via the object controllers in the interlocking systems. The study focuses on finding the most suitable processor unit (CPU) or Trusted Platform Module (TPM) for Alstom's object controller (OC950), with consideration of specific cybersecurity criteria according to the IEC 63442 standard. Using a Pugh matrix, five CPU solutions and four TPM solutions were compared. The results showed that the two best options were both CPU solutions, with the AM64x from Texas Instruments standing out as the best choice due to its strong cybersecurity features, processing capacity, and energy efficiency. This functionality allowed the solution to provide satisfactory cyber protection as well as operational advantages and future-proofing. In summary, processor units are preferred for improving performance and future-proofing the hardware of the OC950. TPM solutions can be a suitable alternative for handling cybersecurity functions but risk becoming a communication bottleneck. A CPU solution is therefore preferred, as it can increase the object controller's performance while allowing the implementation of satisfactory cyber protection. The work contributes to improving cybersecurity between object controllers and central interlocking computers, and also proposes a method for comparing different hardware solutions using Pugh matrices.
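As an illustration of the comparison method, here is a minimal Pugh-matrix sketch in Python; the criteria, weights, and scores are illustrative placeholders, not the thesis's actual evaluation data:

```python
# Pugh-matrix comparison: each criterion is scored -1/0/+1 against a
# reference (datum) concept, weighted, and summed; highest total wins.
candidates = {
    "TI AM64x (CPU)":    {"security": +1, "performance": +1, "power": +1},
    "CPU alternative B": {"security": +1, "performance":  0, "power": -1},
    "TPM alternative A": {"security": +1, "performance": -1, "power":  0},
}
weights = {"security": 3, "performance": 2, "power": 1}

totals = {
    name: sum(weights[c] * score for c, score in scores.items())
    for name, scores in candidates.items()
}
for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {total:+d}")
```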
|
162 |
A Unifying Interface Abstraction for Accelerated Computing in Sensor Nodes
Iyer, Srikrishna. 31 August 2011
Hardware-software co-design techniques are well suited to developing the next generation of sensornet applications, which have high computational demands. By making use of a low-power FPGA, the peak computational performance of a sensor node can be improved without a significant increase in standby power dissipation. In this contribution, we present a methodology and tool to enable hardware/software co-design for sensor node application development. We present the integration of nesC, a sensornet programming language, with GEZEL, an easy-to-use hardware description language. We describe the hardware/software interface at different levels of abstraction: at the level of the design language, at the level of the co-simulator, and in the hardware implementation. We use a layered, uniform approach that is particularly suited to dealing with the heterogeneous interfaces typically found on small embedded processors. We illustrate the strengths of our approach by means of a prototype application: the integration of a hardware-accelerated crypto-application in a nesC application. / Master of Science
|
163 |
Beräkningar med GPU vs CPU : En jämförelsestudie av beräkningseffektivitet med avseende på energi- och tidsförbrukning / Calculations with the GPU vs CPU: A Comparative Study of Computational Efficiency in Terms of Energy and Time Consumption
Löfgren, Robin; Dahl, Kristoffer. January 2010
The thesis is a comparative study of computational efficiency, in terms of energy and time consumption, of graphics cards and processors in personal computers and PlayStation 3 consoles. The problem is studied to make the public aware that part of the energy problem associated with computation can be addressed by increasing the energy efficiency of the computational units. The study was conducted in an exploratory way, examining the relationship between processors and graphics cards and which performs best in which context. Performance tests are carried out with the molecular computation program F@H and the file compression program WinRAR. The tests are performed on multi-core and single-core PCs and PS3s with different characteristics. In some tests, power consumption is measured in order to determine how energy-efficient certain systems are. The results clearly show how the average power consumption and energy efficiency of the various test systems differ under load, at idle, and across different types of calculations.
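As a sketch of the energy-efficiency metric implied here (useful work per unit of energy); the figures below are made-up examples, not measurements from the study:

```python
# Energy efficiency = work completed per watt-hour consumed.
def energy_efficiency(work_units: float, avg_power_w: float, seconds: float) -> float:
    """Work units per watt-hour, given average power draw and runtime."""
    energy_wh = avg_power_w * seconds / 3600.0
    return work_units / energy_wh

# e.g. F@H-style work units completed by two hypothetical systems
print(energy_efficiency(work_units=4.0, avg_power_w=250.0, seconds=7200.0))  # GPU run
print(energy_efficiency(work_units=1.0, avg_power_w=95.0,  seconds=7200.0))  # CPU run
```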
|
164 |
Optimisation de code Galerkin discontinu sur ordinateur hybride : application à la simulation numérique en électromagnétisme / Discontinuous Galerkin code optimization on hybrid computers: application to numerical simulation in electromagnetism
Weber, Bruno. 26 November 2018
In this thesis, we present the evolutions made to the Discontinuous Galerkin solver Teta-CLAC, resulting from the IRMA-AxesSim collaboration, during the HOROCH project (2015-2018). This solver solves the 3D Maxwell equations in parallel on a large number of OpenCL accelerators. The goal of the HOROCH project was to perform large-scale simulations on a complete digital human body model. This model is composed of 24 million hexahedral cells, for calculations in the frequency band of connected objects ranging from 1 to 3 GHz (Bluetooth). The applications are numerous: telephony and accessories, sport (connected shirts), medicine (probes: capsules, patches), and so on. The changes made include, among others: optimization of the OpenCL kernels for CPUs in order to make the best use of hybrid architectures; experimentation with the StarPU runtime; the design of an integration scheme using local time steps; and many optimizations allowing the solver to process simulations of several million cells.
|
165 |
Effekterna av brandväggsregler för FreeBSD PF & IPtables / The impact of firewall rule sets for FreeBSD PF & IPtables
Polnäs, Andreas. January 2018
Packet filtering is one of the key features of most of today's firewalls, and many packet filters are used daily in a system administrator's work. Since the advent of packet filtering, network complexity has increased drastically: many of today's services rely on several protocols to communicate, and the firewall must process a much larger amount of data than before to serve today's network topologies. This study explores whether there is any difference in performance between two modern iterations of popular UNIX firewalls, IPtables and FreeBSD PF, by subjecting them to rule sets of different sizes while testing them under a series of different packet flows. The two firewalls are compared on three attributes (CPU usage, throughput, and latency) at three bandwidths: 100, 500, and 1000 Mbit/s. The tests include longer runs that are repeated multiple times to increase the validity of the study, and they are performed on each firewall's native operating system: Linux Ubuntu 16 for IPtables and FreeBSD 11 for FreeBSD PF. The study concludes that the two firewalls perform equally in throughput and latency at lower rule counts. At higher rule counts, performance differs: PF is better suited to large rule sets, while IPtables is considered the better firewall for small rule sets due to its low CPU usage.
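One way such a rule-count benchmark could be set up, sketched in Python: generate N filler rules followed by a single matching rule, so the packet filter must traverse the whole chain. The subnets and port are illustrative assumptions, and an analogous generator would emit pf.conf syntax for FreeBSD PF:

```python
# Generate iptables commands for a rule-count scaling test: n_filler
# non-matching DROP rules, then one ACCEPT rule that the test traffic hits.
def iptables_rules(n_filler: int) -> list[str]:
    rules = [
        f"iptables -A FORWARD -s 10.{i // 256 % 256}.{i % 256}.0/24 -j DROP"
        for i in range(n_filler)
    ]
    rules.append("iptables -A FORWARD -p tcp --dport 5001 -j ACCEPT")  # test traffic
    return rules

for rule in iptables_rules(3):
    print(rule)
```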
|
166 |
FairCPU: Uma Arquitetura para Provisionamento de Máquinas Virtuais Utilizando Características de Processamento / FairCPU: An Architecture for Provisioning Virtual Machines Using Processing Features
Paulo Antonio Leal Rego. 02 March 2012
Fundação Cearense de Apoio ao Desenvolvimento Científico e Tecnológico
Resource scheduling is a key process for cloud computing platforms, which generally use virtual machines (VMs) as scheduling units. The use of virtualization techniques provides great flexibility, with the ability to instantiate multiple VMs on one physical machine (PM), migrate them between PMs, and dynamically scale a VM's resources. Techniques for consolidation and dynamic allocation of VMs have treated the impact of placement as independent of location: it is generally accepted that the performance of a VM will be the same regardless of which PM it is allocated to. This assumption is reasonable for a homogeneous environment, where the PMs are identical and the VMs run the same operating system and applications. Nevertheless, in a cloud computing environment, a set of heterogeneous resources is shared, and PMs vary both in resource capacity and in data affinity. The main objective of this work is to propose an architecture that standardizes the representation of processing power using processing units (PUs). In addition, limits on CPU usage are used to provide performance isolation and keep a VM's processing power at the same level regardless of the underlying PM. The proposed solution takes into account the heterogeneity of the PMs present in the cloud infrastructure and provides scheduling policies based on PUs. The proposed architecture, called FairCPU, was implemented to work with the KVM and Xen hypervisors. As a case study, it was incorporated into a private cloud, built with the OpenNebula middleware, where several experiments were conducted. The results prove the efficiency of the FairCPU architecture in using PUs to reduce variability in VM performance, as well as in providing a new way to represent and manage the processing power of the infrastructure's physical and virtual machines.
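A minimal sketch of the processing-unit idea: express each PM's capacity in PUs and translate a VM's PU allocation into a CPU-usage cap on whatever host it lands on. The calibration numbers and cap formula below are illustrative assumptions, not FairCPU's actual policy:

```python
# Map a VM's PU allocation to a host-specific CPU cap so its effective
# processing power stays constant across heterogeneous machines.
def cpu_cap_percent(vm_pus: float, host_pus_per_core: float, host_cores: int) -> float:
    """Cap as a percentage of one core (e.g. for a hypervisor scheduler cap)."""
    host_total_pus = host_pus_per_core * host_cores
    assert vm_pus <= host_total_pus, "VM does not fit on this host"
    return 100.0 * vm_pus / host_pus_per_core

# The same 2-PU VM gets a tighter cap on a faster host.
print(cpu_cap_percent(vm_pus=2.0, host_pus_per_core=2.0, host_cores=4))  # 100% of a core
print(cpu_cap_percent(vm_pus=2.0, host_pus_per_core=4.0, host_cores=4))  # 50% of a core
```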
|
168 |
Efficient Betweenness Centrality Computations on Hybrid CPU-GPU Systems
Mishra, Ashirbad. January 2016
Network analysis is of broad interest because networks can be interpreted for many purposes, and different features require different metrics to measure and interpret them. Measuring the relative importance of each vertex in a network is one of the most fundamental building blocks of network analysis. Betweenness Centrality (BC) is one such metric that plays a key role in many real-world applications. BC is an important graph analytics application for large-scale graphs; however, it is one of the most computationally intensive kernels to execute, and measuring centrality in billion-scale graphs is quite challenging.
While there are several existing efforts towards parallelizing BC algorithms on multi-core CPUs and many-core GPUs, in this work we propose a novel fine-grained CPU-GPU hybrid algorithm that partitions a graph into two partitions, one each for the CPU and the GPU. Our method performs BC computations for the graph on both the CPU and GPU resources simultaneously, resulting in a very small number of CPU-GPU synchronizations and hence less time spent on communication. The BC algorithm consists of two phases, a forward phase and a backward phase. In the forward phase, we initially find the paths that are needed by either partition, after which each partition is executed on its processor in an asynchronous manner. We first compute border matrices for each partition, which store the relative distances between each pair of border vertices in a partition. These matrices are used in the forward-phase calculations for all sources. In this way, our hybrid BC algorithm leverages the multi-source property inherent in the BC problem. We present a proof of correctness and bounds on the number of iterations for each source. We also perform a novel hybrid and asynchronous backward phase, in which each partition communicates with the other only when there is a path that crosses the partition boundary, so it performs minimal CPU-GPU synchronizations.
We use a variety of implementations in our work, including node-based and edge-based parallelism with both data-driven and topology-based techniques. We also show that our method works with a variable partitioning technique, which partitions the graph into unequal parts to account for the processing power of each processor; thanks to this technique, our implementations achieve nearly equal utilization on both processors. For large-scale graphs, the border matrix also becomes large, so we present various techniques to accommodate it. These techniques use properties inherent in the shortest-path problem for reduction. We discuss the drawbacks of performing shortest-path computations at large scale and provide several solutions.
Evaluations using a large number of graphs with different characteristics show that our hybrid approach, without variable partitioning and border matrix reduction, gives a 67% improvement in performance and 64-98.5% fewer CPU-GPU communications than the state-of-the-art hybrid algorithm based on the popular Bulk Synchronous Parallel (BSP) approach implemented in TOTEM. This demonstrates our algorithm's strength in reducing the need for large synchronizations. Adding variable partitioning, border matrix reduction, and backward-phase optimizations to our hybrid algorithm provides up to a 10x speedup. We compare our optimized implementation with standalone CPU and GPU codes based on our forward-phase and backward-phase kernels, showing around a 2-8x speedup over the CPU-only code while accommodating large graphs that do not fit in the GPU-only code. We also show that our method's performance is competitive with the state-of-the-art multi-core CPU implementation and 40-52% better than GPU implementations on large graphs. Finally, we highlight the drawbacks of CPU-only and GPU-only implementations and the challenges that graph algorithms face in large-scale computing, suggesting that a hybrid or distributed approach is a better way to overcome these hurdles.
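The forward/backward structure described here follows Brandes' algorithm; below is a minimal sequential Python sketch of the two phases (not the partitioned CPU-GPU version developed in the thesis):

```python
from collections import deque

def brandes_bc(adj):
    """Betweenness centrality via Brandes' two-phase algorithm.

    adj: dict mapping each vertex to an iterable of neighbours.
    Returns unnormalised BC scores.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # --- forward phase: BFS from s, counting shortest paths ---
        sigma = {v: 0 for v in adj}     # number of shortest s-v paths
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}    # predecessors on shortest paths
        sigma[s], dist[s] = 1, 0
        order = []                      # vertices in non-decreasing distance
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:         # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # --- backward phase: accumulate dependencies in reverse order ---
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc

graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
print(brandes_bc(graph))
```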
|
169 |
Predictive vertical CPU autoscaling in Kubernetes based on time-series forecasting with Holt-Winters exponential smoothing and long short-term memory / Prediktiv vertikal CPU-autoskalning i Kubernetes baserat på tidsserieprediktion med Holt-Winters exponentiell utjämning och långt korttidsminne
Wang, Thomas. January 2021
Private and public clouds require users to specify requests for resources such as CPU and memory (RAM) to be provisioned for their applications. The values of these requests do not necessarily relate to the application's run-time requirements; they only help the cloud infrastructure's resource manager map requested virtual resources to physical resources. If an application exceeds these values, it might be throttled or even terminated. Consequently, requested values are often overestimated, resulting in poor resource utilization in the cloud infrastructure. Autoscaling is a technique used to overcome these problems. In this research, we formulated two new predictive CPU autoscaling strategies for Kubernetes containerized applications, using time-series analysis based on Holt-Winters exponential smoothing and long short-term memory (LSTM) recurrent neural networks. The two approaches were analyzed, and their performance was compared to that of the default Kubernetes Vertical Pod Autoscaler (VPA). Efficiency was evaluated in terms of CPU resource wastage and the percentage and amount of insufficient CPU, for container workloads from Alibaba Cluster Trace 2018 and others. In our experiments, we observed that the VPA tended to perform poorly on workloads that change periodically. Our results showed that, compared to VPA, predictive methods based on Holt-Winters exponential smoothing (HW) and LSTM can decrease CPU wastage by over 40% while avoiding CPU insufficiency for various CPU workloads. Furthermore, LSTM was shown to generate more stable predictions than HW, which allowed for more robust scaling decisions.
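A minimal sketch of the Holt-Winters approach, assuming the statsmodels library is available; the synthetic workload, forecast horizon, and 15% safety margin are illustrative assumptions, not the thesis's parameters:

```python
import numpy as np
from statsmodels.tsa.holtwinters import ExponentialSmoothing

rng = np.random.default_rng(0)
period = 24                                   # hourly samples, daily seasonality
t = np.arange(period * 14)                    # two weeks of history
cpu = 200 + 120 * np.sin(2 * np.pi * t / period) + rng.normal(0, 10, t.size)
cpu = np.clip(cpu, 0, None)                   # CPU usage in millicores

# Triple exponential smoothing: level + additive trend + additive seasonality.
model = ExponentialSmoothing(
    cpu, trend="add", seasonal="add", seasonal_periods=period
).fit()

horizon = 6                                   # forecast the next six hours
forecast = model.forecast(horizon)

# Size the next CPU request from the predicted peak plus 15% headroom.
request_mcores = 1.15 * float(np.max(forecast))
print(f"recommended CPU request: {request_mcores:.0f}m")
```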
|
170 |
Méthode de type Galerkin discontinu en maillages multi-éléments pour la résolution numérique des équations de Maxwell instationnaires / High order non-conforming multi-element Discontinuous Galerkin method for time-domain electromagnetics
Durochat, Clément. 30 January 2013
This thesis is concerned with the study of a Discontinuous Galerkin Time-Domain (DGTD) method for the numerical resolution of the unsteady Maxwell equations on hybrid, non-conforming meshes, tetrahedral/hexahedral in 3D (triangular/quadrangular in 2D), denoted the DGTD-PpQk method. As in several studies of various hybrid time-domain methods (such as combinations of Finite Volume with Finite Difference methods, or Finite Element with Finite Difference methods), our general objective is to mesh objects with complex geometry with tetrahedra, for high precision, and to mesh the surrounding space with square elements, for simplicity and speed. In the discretization scheme of the DGTD method considered here, the electromagnetic field components are approximated by high-order nodal polynomials, using a centered approximation for the surface integrals. Time integration of the associated semi-discrete equations is achieved by a second- or fourth-order leap-frog scheme. After introducing the historical and physical context of the Maxwell equations, we present the details of the DGTD-PpQk method. We prove the L2 stability of this method by establishing the conservation of a discrete analog of the electromagnetic energy, and a sufficient CFL-like stability condition is exhibited. The theoretical convergence of the scheme is also studied, leading to an a priori error estimate that takes into account the hybrid nature of the mesh. Afterwards, we perform a complete numerical study in 2D (TMz waves), on several test problems, on hybrid and non-conforming meshes, and for homogeneous or heterogeneous media. We do the same for the 3D implementation, with more realistic simulations, for example the propagation of an electromagnetic wave in a heterogeneous human head model. We show the consistency between the mathematical and numerical results of this DGTD-PpQk method, and its contribution in terms of accuracy and CPU time.
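Schematically, a second-order leap-frog update for the semi-discrete Maxwell system takes the following generic form, with mass matrices $\mathbb{M}_E$, $\mathbb{M}_H$ and a discrete curl operator $\mathbb{K}$ gathering volume and centered-flux surface terms (a common textbook form, not necessarily the thesis's exact notation):

$$
\mathbb{M}_H\,\frac{H^{n+\frac{1}{2}} - H^{n-\frac{1}{2}}}{\Delta t} = -\,\mathbb{K}\,E^{n},
\qquad
\mathbb{M}_E\,\frac{E^{n+1} - E^{n}}{\Delta t} = \mathbb{K}^{T}\,H^{n+\frac{1}{2}},
$$

with stability subject to a CFL-type restriction on $\Delta t$ proportional to the smallest cell size, consistent with the sufficient condition mentioned above.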
|