
Offloading the sampling stage of GNN training to smart storage

Kritharakis, Emmanouil 16 February 2024
Graph Neural Networks (GNNs) have emerged as a robust machine learning model for complex graph-structured data, in contrast to traditional deep learning techniques, which are used primarily for image and text data. However, scaling GNNs to large graphs with billions of nodes and trillions of edges remains a challenge. Existing approaches either partition the graph across distributed systems or use a single machine with GPU caching techniques during the sampling phase. The former incurs maintenance costs and increased latency, while the latter faces data movement bottlenecks, resulting in inefficient resource utilization and suboptimal training. To address the limitations of single-machine techniques, we focus on the sampling stage and introduce a novel approach that uses the Samsung SmartSSD computational storage device. This approach significantly reduces unnecessary data movement overhead and minimizes overall training time. Computational storage devices allow computations to be offloaded to their own computational units. In our method, we compute the required sampling subset on the SmartSSD's field-programmable gate array (FPGA) and transfer it to the host DRAM. Our experiments show that, compared to the baseline MMAP sampling method, the proposed solution achieves improvements of up to 9 times in sampling time and 5 times in host DRAM utilization.
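To make the sampling stage concrete, the following is a minimal sketch of host-side neighbor sampling over a memory-mapped graph, in the spirit of the MMAP baseline mentioned in the abstract. It is not the thesis's implementation: the CSR file layout, the `.indptr`/`.indices` file names, and the functions `write_csr`, `read_ints`, `sample_neighbors`, and `demo` are all assumptions made for illustration.

```python
import mmap
import os
import random
import struct
import tempfile


def write_csr(prefix, num_nodes, edges):
    """Write a directed graph as two raw files in CSR form (hypothetical layout)."""
    order = sorted(edges)
    offsets = [0] * (num_nodes + 1)
    for src, _ in order:
        offsets[src + 1] += 1
    for i in range(num_nodes):
        offsets[i + 1] += offsets[i]  # prefix sum: offsets[v] = start of v's neighbor list
    with open(prefix + ".indptr", "wb") as f:
        f.write(struct.pack(f"<{len(offsets)}q", *offsets))
    with open(prefix + ".indices", "wb") as f:
        f.write(struct.pack(f"<{len(order)}q", *(dst for _, dst in order)))


def read_ints(buf, start, count):
    """Read `count` 8-byte little-endian integers beginning at element `start`."""
    return list(struct.unpack_from(f"<{count}q", buf, start * 8))


def sample_neighbors(prefix, seeds, fanout, rng):
    """For each seed node, sample at most `fanout` neighbors from the mapped files."""
    sampled = {}
    with open(prefix + ".indptr", "rb") as fp, open(prefix + ".indices", "rb") as fi:
        with mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ) as indptr, \
                mmap.mmap(fi.fileno(), 0, access=mmap.ACCESS_READ) as indices:
            for node in seeds:
                start, end = read_ints(indptr, node, 2)
                # Touching this slice is what pulls pages from storage into host memory.
                neighbors = read_ints(indices, start, end - start)
                if len(neighbors) > fanout:
                    neighbors = rng.sample(neighbors, fanout)
                sampled[node] = neighbors
    return sampled


def demo():
    """Build a tiny 4-node graph on disk and sample up to 2 neighbors per seed."""
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 0), (3, 1), (3, 2)]
    with tempfile.TemporaryDirectory() as tmp:
        prefix = os.path.join(tmp, "graph")
        write_csr(prefix, 4, edges)
        return sample_neighbors(prefix, [0, 1], fanout=2, rng=random.Random(0))
```

In this sketch every neighbor list is read on the host before sampling, so more data crosses from storage into host memory than the sampled subset itself. The approach described in the abstract instead runs the sampling step on the SmartSSD's FPGA and sends the resulting subset to host DRAM.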
