The correlator is a key component of the digital backend of a modern radio telescope array. The 64-antenna MeerKAT telescope has an FX architecture correlator consisting of 64 F-Engines and 256 X-Engines. These F- and X-Engines are all hosted on 128 custom-designed FPGA processing boards, each known as a SKARAB. One SKARAB X-Engine board hosts four logical X-Engines, ingests data at 27.2 Gbps over a 40 GbE connection and correlates this data in real time. GPU technology has improved significantly since SKARAB was designed, and GPUs are now becoming viable alternatives to FPGAs in high-performance streaming applications. The objective of this dissertation is to investigate how to build a GPU drop-in replacement X-Engine for MeerKAT and to compare this implementation to a SKARAB X-Engine. This includes the construction and analysis of a prototype GPU X-Engine. The 40 GbE ingest, the GPU correlation algorithm and the software pipeline framework that links these two together were identified as the three main sub-systems to focus on in this dissertation. A number of different tools implementing these sub-systems were examined, with the most suitable ones being chosen for the prototype. A prototype dual-socket system was built that could process the equivalent of two SKARABs' worth of X-Engine data. This prototype has two 40 GbE Mellanox NICs running the SPEAD2 library and a single Nvidia GeForce 1080Ti GPU running the xGPU library. A custom pipeline framework built on top of the Intel Threading Building Blocks (TBB) library was designed to facilitate the flow of data between these sub-systems. The prototype system was compared to two SKARABs. For an equivalent amount of processing, the GPU X-Engine cost R143 000 while the two SKARABs cost R490 000. The GPU X-Engine consumed more than twice as much power as the SKARABs (400 W compared to 180 W) but required only half as much rack space.
GPUs as X-Engines were found to be more suitable than FPGAs when cost and density are the main priorities. When power consumption is the priority, FPGAs should be used. When running eight logical X-Engines, 85% of the prototype's CPU cores were used while only 75% of the GPU's compute capacity was utilised. The main bottleneck on the GPU X-Engine was on the CPU side of the server. This report suggests that the next iteration of the system should offload some CPU-side processing to the GPU and double the number of 40 GbE ports. This could potentially double the system throughput. When considering methods to improve this system, an FPGA/GPU hybrid X-Engine concept was developed that would combine the power-saving advantage of FPGAs and the low cost-to-compute ratio of GPUs.
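The comparison figures quoted in the abstract can be turned into ratios with a short calculation. The sketch below is illustrative only: the constant and function names are hypothetical, and it assumes that the quoted 180 W and R490 000 both refer to the two SKARABs combined and that the prototype's ingest equals twice the single-SKARAB rate.

```python
# Figures taken from the abstract; all names here are illustrative, not from the thesis.
SKARAB_INGEST_GBPS = 27.2     # ingest rate of one SKARAB X-Engine board
XENGINES_PER_SKARAB = 4       # logical X-Engines hosted on one SKARAB
SKARABS_REPLACED = 2          # the prototype matches two SKARABs

GPU_COST_RAND = 143_000
SKARAB_PAIR_COST_RAND = 490_000   # assumed to cover both SKARABs
GPU_POWER_W = 400
SKARAB_PAIR_POWER_W = 180         # assumed to cover both SKARABs


def comparison():
    """Derive totals and ratios from the quoted figures."""
    logical = XENGINES_PER_SKARAB * SKARABS_REPLACED
    total_ingest = SKARAB_INGEST_GBPS * SKARABS_REPLACED
    return {
        "logical_xengines": logical,
        "total_ingest_gbps": total_ingest,
        "ingest_per_xengine_gbps": total_ingest / logical,
        "cost_ratio_skarab_to_gpu": SKARAB_PAIR_COST_RAND / GPU_COST_RAND,
        "power_ratio_gpu_to_skarab": GPU_POWER_W / SKARAB_PAIR_POWER_W,
    }


if __name__ == "__main__":
    for name, value in comparison().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the two SKARABs cost roughly 3.4 times as much as the GPU X-Engine, while the GPU X-Engine draws roughly 2.2 times as much power, which matches the trade-off described above.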
Identifier | oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:uct/oai:localhost:11427/32531 |
Date | January 2020 |
Creators | Callanan, Gareth Mitchell |
Contributors | Winberg, Simon |
Publisher | University of Cape Town, Faculty of Engineering and the Built Environment, Department of Electrical Engineering |
Source Sets | South African National ETD Portal |
Language | English |
Detected Language | English |
Type | Master Thesis, Masters, MSc (Eng) |
Format | application/pdf |