Global ETD Search

1	System-Level Synthesis of Dataplane Subsystems for MPSoCs January 2013 abstract: In recent years we have witnessed a shift towards multi-processor systems-on-chip (MPSoCs) to address the demands of embedded devices (such as cell phones, GPS devices, and luxury car features). Highly optimized MPSoCs are well suited to the complex applications that end users demand. These MPSoCs incorporate a constellation of heterogeneous processing elements (PEs), both general-purpose PEs and application-specific integrated circuits (ASICs). A typical MPSoC is composed of an application processor, such as an ARM Cortex-A9 with a cache-coherent memory hierarchy, and several application sub-systems. Each of these sub-systems is composed of highly optimized instruction processors, graphics/DSP processors, and custom hardware accelerators. Typically, these sub-systems use scratchpad memories (SPMs) rather than supporting cache coherency. The overall architecture integrates the various sub-systems through a high-bandwidth system-level interconnect, such as a Network-on-Chip (NoC). The shift to MPSoCs has been fueled by three major factors: demand for high performance, the use of component libraries, and short design turnaround time. As customers desire ever more complex applications on their embedded devices, the performance demand on these devices continues to increase, and designers have turned to MPSoCs to address it. By using pre-made IP libraries, designers can quickly piece together an MPSoC that meets the application demands of the end user with minimal time spent designing new hardware. Additionally, the use of MPSoCs allows designers to generate new devices very quickly, thus reducing the time to market. In this work, a complete MPSoC synthesis design flow is presented. 
We first present a technique \cite{leary1_intro} to address the synthesis of the interconnect architecture (particularly the Network-on-Chip (NoC)). We then address the synthesis of the memory architecture of an MPSoC sub-system \cite{leary2_intro}. Lastly, we present a co-synthesis technique to generate the functional and memory architectures simultaneously. The validity and quality of each synthesis technique are demonstrated through extensive experimentation. / Dissertation/Thesis / Ph.D. Computer Science 2013 Computer science Computer Science Dataplane MPSoC Network-on-Chip PhD Synthesis
2	A Dataplane Programmable Traffic Marker using Packet Value Concept / En Paket Värde Markerare För DataPlan Programerbara Enheter Shaker, Maher January 2021 Real-time-sensitive network applications are emerging and require ultra-low latency to reach the desired QoS. A main contributor to latency is excessive buffering at intermediate switches and routers. Existing queuing strategies that aim to reduce buffering-induced latency typically apply a single-queue active queue management (AQM) scheme that does not support service differentiation and treats all packets equally. The recently proposed per-packet value framework uses a packet value marker and a packet-value-aware AQM to solve this issue by supporting service differentiation in a single queue and introducing more advanced policies for resource sharing. However, the per-packet value framework has been implemented and tested only in a software environment, with no possibility of studying its performance on hardware equipment. This thesis uses P4 to design and implement a packet value marker on dataplane-programmable devices. The marker should support multiple resource-sharing policies, follow those policies accurately, and not be the bottleneck in the network. A target-independent packet value marker is designed and then modified with target-dependent P4 constructs to fit the implementation requirements of a Tofino switch and a Netronome smart NIC. An accurate Tofino implementation using this approach is difficult to achieve because of a complicated random number generation process and resource limitations. Evaluation using a testbed with a Netronome marker shows that the marker achieves the desired functionality, with an accurate packet value distribution, for throughputs larger than 5000 Kbps. However, the challenge of concurrent packet processing, combined with a smart NIC that lacks powerful packet processing cores, results in the marker having lower throughput and higher latency than expected. 
The evaluation also shows that resource limitations, in terms of available memory and the number of supported policies, affect the maximum number of supported users. We also ported a version to a switching ASIC with limited functionality due to the restrictions of the hardware platform. Our evaluation also provides insights into how such a marking scheme performs on different hardware targets and the limitations imposed by such target-specific architectures. / Real-time-sensitive network applications are being developed and require ultra-low latency to reach the desired QoS. Existing solutions to this problem apply AQM to a single queue, do not support service differentiation, and treat all packets equally. The recently proposed per-packet value framework solves the problem by supporting service differentiation in a single queue and introducing more advanced policies for resource sharing. The per-packet value framework is implemented and tested in a software environment, with no possibility of studying performance on hardware equipment. This thesis uses P4 to design and implement a packet value marker on dataplane-programmable devices. The marker should be able to support multiple resource-sharing policies, follow them accurately, and not be the bottleneck in the network. A hardware-independent packet value marker is designed and modified with hardware-dependent P4 constructs to fit the implementation requirements of a Tofino switch and a Netronome smart NIC. Random number generation and resource limitations result in an unsuccessful implementation of a marker on Tofino with this approach. Evaluation using a testbed with a Netronome marker shows that a single-user scenario and a random number generator cause lower throughput and higher latency compared with plain forwarding. The results show that this marker approach is inaccurate when applying policies at lower throughputs. 
The evaluation also shows that the maximum number of users is limited by memory and the number of supported policies. This evaluation provides insight into how such a marking algorithm is designed and the difficulties of implementing it on different hardware. P4 Dataplane Tofino Netronome Marker Computer Sciences Datavetenskap (datalogi)
3	Enhancing Quality of Service metrics for high fan-in Node.js applications by optimising the network stack : Leveraging IX: The Dataplane Operating System / Förbättran av Quality of Service för högbelastade Node.js-webbapplikationer genom effektivare operativsystem Lilkaer, Fredrik Peter January 2015 This thesis investigates the feasibility of porting Node.js, a JavaScript web application framework and server, to IX, a dataplane operating system developed specifically to meet the needs of high-performance, microsecond-computing applications in a datacentre setting. We show that porting requires extensions to the IX kernel to support polling of Unix Domain Sockets (UDS), which we implement. We develop a distributed load generator to benchmark the framework. The results show that running Node.js on IX improves throughput by up to 20.6%, latency by up to 5.23×, and tail latency by up to 5.68× compared to a Linux baseline. We show how server-side request-level reordering affects the latency distribution, predominantly in cases where the server is load-saturated. Finally, due to various limitations of IX, we are unable at this time to recommend running Node.js on IX in a production environment, despite improved metrics in all test cases. However, the limitations are not fundamental and could be resolved in future work. / This degree project investigates the possibility of using IX, a specialised dataplane operating system intended for high-performance datacentre applications, to run Node.js, a web application framework for JavaScript applications. Porting Node.js to IX requires extending IX with functionality for simultaneous polling of Unix Domain Sockets and network flows, which we demonstrate and implement. Furthermore, a distributed load generator is developed to evaluate the application framework under IX against a baseline consisting of an unmodified Linux distribution. 
The results show that throughput improves by up to 20.6%, latency by up to 5.23×, and tail latency by up to 5.68×. We then investigate whether latency variance has increased because of server-side request reordering, which appears to be the case under high server load, although other factors appear to have a greater impact under low server load. Finally, although all metrics improved at every observed measurement point, widespread adoption of IX for running Node.js applications cannot yet be recommended, mainly because of problems with horizontal scaling and with serving as a front-end server in a classic tiered datacentre architecture. node.js IX dataplane os QoS Computer Sciences Datavetenskap (datalogi)

