• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 2
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

SoftCache Architecture

Fryman, Joshua Bruce 19 July 2005 (has links)
Multiple trends in computer architecture are beginning to collide as process technology reaches ever smaller feature sizes. Problems with managing power, access times across a die, and increasing complexity to sustain growth are now blocking commercial products like the Pentium 4. These problems also occur in the embedded system space, albeit in a slightly different form. However, as process technology marches on, today's high-performance space is becoming tomorrow's embedded space. New techniques are needed to overcome these problems. In this thesis, we propose a novel architecture called SoftCache to address these emerging issues for embedded systems. We reduce the on-die memory controller infrastructure which reduces both power and space requirements, using the ubiquitous network device arena as a proving ground of viability. In addition, the SoftCache achieves further power and area savings by converting on-die cache structures into directly addressable SRAM and reducing or eliminating the external DRAM. To avoid the burden of programming complexity this approach presents to the application developer, we provide a transparent client-server dynamic binary translation system that runs arbitrary ELF executables on a stripped-down embedded target. One drawback to such a scheme lies in the overhead of additional instructions required to effect cache behavior, particularly with respect to data caching. Another drawback is the power use when fetching from remote memory over the network. The SoftCache comprises a dynamic client-server translation system on simplified hardware, targeted at Intel XScale client devices controlled from servers over the network. Reliance upon a network server as a ``backing store' introduces new levels of complexity, yet also allows for more efficient use of local space. The explicitly software managed aspects create a cache of variable line size, full associativity, and high flexibility. This thesis explores these particular issues, while approaching everything from the perspective of feasibility and actual architectural changes.
2

Proximity coherence for chip-multiprocessors

Barrow-Williams, Nick January 2011 (has links)
Many-core architectures provide an efficient way of harnessing the growing numbers of transistors available in modern fabrication processes; however, the parallel programs run on these platforms are increasingly limited by the energy and latency costs of communication. Existing designs provide a functional communication layer but do not necessarily implement the most efficient solution for chip-multiprocessors, placing limits on the performance of these complex systems. In an era of increasingly power limited silicon design, efficiency is now a primary concern that motivates designers to look again at the challenge of cache coherence. The first step in the design process is to analyse the communication behaviour of parallel benchmark suites such as Parsec and SPLASH-2. This thesis presents work detailing the sharing patterns observed when running the full benchmarks on a simulated 32-core x86 machine. The results reveal considerable locality of shared data accesses between threads with consecutive operating system assigned thread IDs. This pattern, although of little consequence in a multi-node system, corresponds to strong physical locality of shared data between adjacent cores on a chip-multiprocessor platform. Traditional cache coherence protocols, although often used in chip-multiprocessor designs, have been developed in the context of older multi-node systems. By redesign- ing coherence protocols to exploit new patterns such as the physical locality of shared data, improving the efficiency of communication, specifically in chip-multiprocessors, is possible. This thesis explores such a design - Proximity Coherence - a novel scheme in which L1 load misses are optimistically forwarded to nearby caches via new dedicated links rather than always being indirected via a directory structure.

Page generated in 0.0548 seconds