Global ETD Search

1	Improving the flexibility of DPDK Service Cores / Förbättring av flexibiliteten hos DPDK Service Cores Blazevic, Denis Ivan, Jansson, Magnus January 2019 (has links) Data Plane Development Kit is a highly used library for creating network applications that can be run on all hardware. Data Plane Development Kit has a component called Service Cores, which allows the main applications to create services that will run independently. These services are manually mapped to specific CPU cores, and are scheduled in a round-robin method. Because of the manual mapping, and the scheduling, the different load for each service can impact the start time for each service. By having services not run when supposed to, the throughput will degrade. In this thesis, we investigate and try to solve the issue by implementing a basic load balancer into the Service Core component. Our results show that an basic load balancer, that will balance upon reaching a CPU upper threshold, will increase the throughput of services while decreasing the delay between each service run. dpdk load balancer service core service cores data plane development kit dpdk lastbalanserare data plane development kit Computer Systems Datorsystem
2	Design and Implementation of an Architecture-aware In-memory Key- Value Store Giordano, Omar January 2021 (has links) Key-Value Stores (KVSs) are a type of non-relational databases whose data is represented as a key-value pair and are often used to represent cache and session data storage. Among them, Memcached is one of the most popular ones, as it is widely used in various Internet services such as social networks and streaming platforms. Given the continuous and increasingly rapid growth of networked devices that use these services, the commodity hardware on which the databases are based must process packets faster to meet the needs of the market. However, in recent years, the performance improvements characterising the new hardware has become thinner and thinner. From here, as the purchase of new products is no longer synonymous with significant performance improvements, companies need to exploit the full potential of the hardware already in their possession, consequently postponing the purchase of more recent hardware. One of the latest ideas for increasing the performance of commodity hardware is the use of slice-aware memory management. This technique exploits the Last Level of Cache (LLC) by making sure that the individual cores take data from memory locations that are mapped to their respective cache portions (i.e., LLC slices). This thesis focuses on the realisation of a KVS prototype—based on Intel Haswell micro-architecture—built on top of the Data Plane Development Kit (DPDK), and to which the principles of slice-aware memory management are applied. To test its performance, given the non-existence of a DPDKbased traffic generator that supports the Memcached protocol, an additional prototype of a traffic generator that supports these features has also been developed. The performances were measured using two distinct machines: one for the traffic generator and one for the KVS. First, the “regular” KVS prototype was tested, then, to see the actual benefits, the slice-aware one. Both KVS prototypeswere subjected to two types of traffic: (i) uniformtraffic where the keys are always different from each other, and (ii) skewed traffic, where keys are repeated and some keys are more likely to be repeated than others. The experiments show that, in real-world scenario (i.e., characterised by skewed key distributions), the employment of a slice-aware memory management technique in a KVS can slightly improve the end-to-end latency (i.e.,~2%). Additionally, such technique highly impacts the look-up time required by the CPU to find the key and the corresponding value in the database, decreasing the mean time by ~22.5%, and improving the 99th percentile by ~62.7%. / Key-Value Stores (KVSs) är en typ av icke-relationsdatabaser vars data representeras som ett nyckel-värdepar och används ofta för att representera lagring av cache och session. Bland dem är Memcached en av de mest populära, eftersom den används ofta i olika internettjänster som sociala nätverk och strömmande plattformar. Med tanke på den kontinuerliga och allt snabbare tillväxten av nätverksenheter som använder dessa tjänster måste den råvaruhårdvara som databaserna bygger på bearbeta paket snabbare för att möta marknadens behov. Under de senaste åren har dock prestandaförbättringarna som kännetecknar den nya hårdvaran blivit tunnare och tunnare. Härifrån, eftersom inköp av nya produkter inte längre är synonymt med betydande prestandaförbättringar, måste företagen utnyttja den fulla potentialen för hårdvaran som redan finns i deras besittning, vilket skjuter upp köpet av nyare hårdvara. En av de senaste idéerna för att öka prestanda för råvaruhårdvara är användningen av skivmedveten minneshantering. Denna teknik utnyttjar den Sista Nivån av Cache (SNC) genom att se till att de enskilda kärnorna tar data från minnesplatser som är mappade till deras respektive cachepartier (dvs. SNCskivor). Denna avhandling fokuserar på förverkligandet av en KVS-prototyp— baserad på Intel Haswell mikroarkitektur—byggd ovanpå Data Plane Development Kit (DPDK), och på vilken principerna för skivmedveten minneshantering tillämpas. För att testa dess prestanda, med tanke på att det inte finns en DPDK-baserad trafikgenerator som stöder Memcachedprotokollet, har en ytterligare prototyp av en trafikgenerator som stöder dessa funktioner också utvecklats. Föreställningarna mättes med två olika maskiner: en för trafikgeneratorn och en för KVS. Först testades den “vanliga” KVSprototypen, för att se de faktiska fördelarna, den skivmedvetna. Båda KVSprototyperna utsattes för två typer av trafik: (i) enhetlig trafik där nycklarna alltid skiljer sig från varandra och (ii) sned trafik, där nycklar upprepas och vissa nycklar är mer benägna att upprepas än andra. Experimenten visar att i verkliga scenarier (dvs. kännetecknas av snedställda nyckelfördelningar) kan användningen av en skivmedveten minneshanteringsteknik i en KVS förbättra förbättringen från slut till slut (dvs. ~2%). Dessutom påverkar sådan teknik i hög grad uppslagstiden som krävs av CPU: n för att hitta nyckeln och motsvarande värde i databasen, vilket minskar medeltiden med ~22, 5% och förbättrar 99th percentilen med ~62, 7%. Key-Value Store Data Plane Development Kit Last Level of Cache Sliceaware memory management Memcached Haswell microrchitecture. Key-Value Store Data Plane Development Kit Sista Nivån av Cache Skivmedveten Minneshantering Memcached Haswell mikroarkitektur. Computer and Information Sciences Data- och informationsvetenskap
3	Diffuser: Packet Spraying While Maintaining Order : Distributed Event Scheduler for Maintaining Packet Order while Packet Spraying in DPDK / Diffusor: Packet Spraying While Upprätthålla Ordning : Distribuerad händelseschemaläggare för att upprätthålla paketordning medan Paketsprutning i DPDK Purushotham Srinivas, Vignesh January 2023 (has links) The demand for high-speed networking applications has made Network Processors (NPs) and Central Computing Units (CPUs) increasingly parallel and complex, containing numerous on-chip processing cores. This parallelism can only be exploited fully by the underlying packet scheduler by efficiently utilizing all the available cores. Classically, packets have been directed towards the processing cores at flow granularity, making them susceptible to traffic locality. Ensuring a good load balance among the processors improves the application’s throughput and packet loss characteristics. Hence, packet-level schedulers dispatch flows to the processing core at a packet granularity to improve the load balance. However, packet-level scheduling combined with advanced parallelism introduces out-of-order departure of the processed packets. Simultaneously optimizing both the load balance and packet order is challenging. In this degree project, we micro-benchmark the DPDK’s (Dataplane Development Kit) event scheduler and identify many performance and scalability bottlenecks. We find the event scheduler consumes around 40% of the cycles on each participating core for event scheduling. Additionally, we find that DSW (Distributed Software Scheduler) cannot saturate all the workers with traffic because a single NIC (Network Interface Card) queue is polled for packets in our test setup. Then we propose Diffuser, an event scheduler for DPDK that combines the functional properties of both the flow and packet-level schedulers. The diffuser aims to achieve optimal load balance while minimizing out-of-order packet transmission. Diffuser uses stochastic flow assignments along with a load imbalance feedback mechanism to adaptively control the rate of flow migrations to optimize the scheduler’s load distribution. Diffuser reduces packet reordering by at least 65% with ten flows of 100 bytes at 25 MPPS (Million Packet Per Second) and at least 50% with one flow. While Diffuser improves the reordering performance, it slightly reduces throughput and increases latency due to flow migrations and reduced cache locality / Efterfrågan på höghastighets-nätverksapplikationer har gjort nätverkspro-cessorer (NP) och centrala beräkningsenheter (CPU:er) alltmer parallella, komplexa och innehållande många processorkärnor. Denna parallellitet kan endast utnyttjas fullt ut av den underliggande paketschemaläggaren genom att effektivt utnyttja alla tillgängliga kärnor. Vanligtvis har paketschemaläggaren skickat paket till olika kärnor baserat på flödesgranularitet, vilket medför trafik-lokalitet. En bra belastningsbalans mellan processorerna förbättrar applikationens genomströmning och minskar förlorade paket. Därför skickar schemaläggare på paketnivå istället flöden till kärnan med en paketgranularitet för att förbättra lastbalansen. Schemaläggning på paketnivå kombinerat med avancerad parallellism innebär dock att de behandlade paketen avgår i oordning. Att samtidigt optimera både lastbalans och paketordning är en utmaning. I detta examensprojekt utvärderar vi DPDKs (Dataplane Development Kit) händelseschemaläggare och hittar många flaskhalsar i prestanda och skalbarhet. Vi finner att händelseschemaläggaren konsume-rar cirka 40 % av cyklerna på varje kärna.Dessutom finner vi att DSW (Schemaläggare för distribuerad programvara) inte kan mätta alla arbetande kärnor med trafik eftersom en enda nätverkskorts-kö används i vår testmiljö. Vi introducerar också Diffuser, en händelse-schemaläggare för DPDK som kombinerar egenskaperna hos både flödes-och paketnivåschemaläggare. Diffuser ämnar att uppnå optimal lastbalans samtidigt som den minimerar paketöverföring i oordning. Den använder stokastiska flödestilldelningar tillsammans med en återkopplingsmekanism för lastobalans för att adaptivt kontrollera flödesmigreringar för att optimera lastfördelningen. Diffuser minskar omordning av paket med minst 65 % med tio flöden på 100 byte vid 25 MPPS (Miljoner paket per sekund) och minst 50 % med endast ett flöde. Även om Diffuser förbättrar omordningsprestandan, minskar den genomströmningen något och ökar latensen på grund av flödesmigreringar och minskad cache-lokalitet. Packet scheduling Scheduling Out of order Data plane development kit Parallel processing Network processor Paketschemaläggning Schemaläggning oordning Dataplansutvecklingskit Parallell bearbetning Nätverksprocessor Computer and Information Sciences Data- och informationsvetenskap

1

Page generated in 0.1298 seconds