Advances in process technology have made it possible to implement multiple
processors on a single chip. The benefits of having multiple cores on a single
chip come with a new set of constraints on maintaining fast and consistent
memory accesses. Cache coherence protocols are needed to keep shared memory
consistent across the individual caches. Current cache coherence protocols are
either snoop based, which do not scale but provide fast access for a small
number of cores, or directory based, which use a directory as the ordering
point and provide scalability at the cost of relatively slower access. Our focus
is on improving the memory access time of the scalable directory protocol.
We have observed that most memory requests follow a pattern in which one
of the processors, which we dub the Producer, repeatedly writes to a particular
memory location, while a subset of the remaining cores, which we dub the
Consumers, repeatedly read the data from that same memory location. Our
implementation exploits this relationship to provide direct cache-to-cache
transfers and to reduce access time by avoiding the indirection through the
directory. We temporarily move the directory to the Producer node so that a
Consumer can request the cache line directly from the Producer. Our technique
improves memory access time by 13 percent and reduces network traffic by
30 percent over a standard directory coherence protocol, with very little area
overhead.
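To illustrate why avoiding the directory indirection can shorten a read miss, the following is a minimal sketch that counts critical-path messages in a toy model. It is not the thesis implementation: the `DirectoryEntry` class, the `read_miss_hops` function, and the `producer_hint` predictor are hypothetical names introduced here, and the model assumes a Consumer somehow knows which core is the Producer and ignores misprediction costs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DirectoryEntry:
    """Hypothetical home-node directory state for one cache line."""
    owner: Optional[int] = None         # core holding the most recent (dirty) copy
    delegated_to: Optional[int] = None  # Producer temporarily acting as the directory


def read_miss_hops(entry: DirectoryEntry, producer_hint: Optional[int]) -> int:
    """Count critical-path messages for a Consumer read miss.

    producer_hint is the core the Consumer believes is the Producer
    (an assumed predictor; None means no prediction).
    Misprediction penalties are ignored in this toy model.
    """
    if producer_hint is not None and entry.delegated_to == producer_hint:
        # Consumer -> Producer, Producer -> Consumer
        return 2
    if entry.owner is not None:
        # Consumer -> home directory, directory -> owner, owner -> Consumer
        return 3
    # Consumer -> home directory, directory -> Consumer
    return 2


entry = DirectoryEntry(owner=0)         # core 0 is the Producer
baseline = read_miss_hops(entry, None)  # indirection through the directory
entry.delegated_to = 0                  # directory role moved to the Producer
direct = read_miss_hops(entry, 0)       # Consumer asks the Producer directly
print(baseline, direct)
```

In this model the directory-mediated read takes three messages and the delegated read takes two, which is the kind of saving the direct cache-to-cache transfer aims for. The reported 13 percent and 30 percent figures come from the thesis's own evaluation, not from this sketch.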
Identifier | oai:union.ndltd.org:tamu.edu/oai:repository.tamu.edu:1969.1/ETD-TAMU-2010-08-8548 |
Date | 2010 August 1900 |
Creators | Soni, Tarun |
Contributors | Gratz, Paul V. |
Source Sets | Texas A and M University |
Language | en_US |
Detected Language | English |
Type | thesis, text |
Format | application/pdf |