
ACTION : Adaptive Cache Block Migration in Distributed Cache Architectures

Mummidi, Chandra Sekhar 20 October 2021
The increasing number of cores in chip multiprocessors (CMPs) drives increasing traffic to the last-level cache (LLC). Without a commensurate increase in LLC bandwidth, such traffic cannot be sustained, resulting in a loss of performance. Further, as the number of cores increases, it is necessary to scale up the LLC size; otherwise, the LLC miss rate will rise, again degrading performance. Unfortunately, for a unified LLC with uniform cache access time, access latency increases with cache size, which also costs performance. Researchers have previously proposed partitioning the cache into multiple smaller caches interconnected by a communication network, which increases aggregate cache bandwidth but causes non-uniform access latency. Such a cache architecture is called a non-uniform cache architecture (NUCA). While NUCA addresses the LLC bandwidth issue, partitioning by itself does not address the access latency problem. Consequently, researchers have considered data placement techniques to improve access latency. However, earlier data placement work did not account for the frequency with which specific memory references are accessed. A major reason is that access frequency is difficult to track for all memory references. In this research, we present a hardware-assisted solution called ACTION (Adaptive Cache Block Migration) that tracks the access frequency of individual memory references and prioritizes their placement closer to the affine core. The ACTION mechanism migrates cache blocks when there is a detectable change in access frequencies due to a change in program phase. To keep the hardware overhead low, ACTION counts access references in the LLC stream using a simple, approximate method and uses simple algorithms for placement and migration. We tested ACTION on a 4-core CMP with a 5x5 mesh LLC network implementing a partitioned D-NUCA against workloads exhibiting distinct asymmetry in cache block access frequency. Our simulation results indicate that ACTION can improve CMP performance by as much as 8% over state-of-the-art (SOTA) solutions.
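
The abstract does not spell out ACTION's counting, placement, or migration algorithms, so the following is only an illustrative sketch of the general idea, not the thesis's design. It assumes a hashed table of small saturating counters as the "simple and approximate" frequency tracker, Manhattan distance on the 5x5 mesh as the latency proxy, and one block per bank for simplicity. All names (`ApproxCounter`, `place`, `migrations`) and parameters (64 entries, a ceiling of 15) are hypothetical.

```python
MESH = 5  # 5x5 mesh of LLC banks, as stated in the abstract


def manhattan(a, b):
    """Hop count between two mesh coordinates (assumed latency proxy)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


class ApproxCounter:
    """Hashed table of saturating counters (hypothetical tracker).

    Blocks that map to the same entry share a count, which is what
    makes the estimate approximate but cheap in hardware terms.
    """

    def __init__(self, entries=64, max_count=15):
        self.entries = entries
        self.max_count = max_count
        self.table = [0] * entries

    def record(self, block_addr):
        i = block_addr % self.entries
        if self.table[i] < self.max_count:  # saturate instead of overflowing
            self.table[i] += 1

    def estimate(self, block_addr):
        return self.table[block_addr % self.entries]


def banks_by_distance(core):
    """All banks, nearest to the given core first (ties broken by coordinate)."""
    banks = [(x, y) for x in range(MESH) for y in range(MESH)]
    return sorted(banks, key=lambda b: (manhattan(b, core), b))


def place(blocks, counter, core):
    """Map the most frequently accessed blocks to the nearest banks."""
    ranked = sorted(blocks, key=lambda b: (-counter.estimate(b), b))
    return dict(zip(ranked, banks_by_distance(core)))


def migrations(old, new):
    """Blocks whose target bank changed, e.g. after a program phase change."""
    return {b: (old[b], new[b]) for b in new if b in old and old[b] != new[b]}


# Phase 1: block 1 is hot, block 2 is warm, block 3 is cold.
phase1 = ApproxCounter()
for addr, n in [(1, 10), (2, 3), (3, 1)]:
    for _ in range(n):
        phase1.record(addr)
before = place([1, 2, 3], phase1, core=(0, 0))

# Phase 2: block 3 becomes the hot block.
phase2 = ApproxCounter()
for addr, n in [(1, 1), (2, 3), (3, 10)]:
    for _ in range(n):
        phase2.record(addr)
after = place([1, 2, 3], phase2, core=(0, 0))

moved = migrations(before, after)
```

In this sketch, blocks 1 and 3 swap banks between the two phases while block 2 stays put, so only the blocks whose relative access frequency changed are migrated.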
