Multi-core architectures are the future for high-performance computing and are omnipresent these days; what was a vision some twenty years back is now a reality with most personal computers/laptops now running on multi-cores making them ubiquitous in today's world. However, as the number of cores continue scaling with time, there will be serious throughput and performance issues with relation to the network topologies used in connecting the cores. Among possible network topologies under consideration in modern multi-core systems, the `Mesh' topology is widely used. In terms of performance, the `Point to Point topology' would outperform all other topologies such as Crossbar, Mesh and Torus. The `Point to Point' topology does include additional expenses with respect to more links needed to connect each core to every other core in the network. Its expensive implementation cost is the reason it is not preferred in the industry for general use systems. But, for research purposes it serves as the best network topology alternative to the `Mesh' for higher speed in computer systems. However, the characteristics of the tasks executing on the cores will also have a significant impact on topology performance. So, with the scaling of multi-cores from 10 to 1000 cores per chip and more, selection of the right network topology is of importance. Another interesting factor to consider is the effect of the cache on these multi-core systems with respect to each of these topologies. Cache coherency is and will be a major cause for throughput decrease as cores scale. In our work, we are using the Modified-Exclusive-Shared-Invalid (MESI) Cache Coherency protocol for all the above mentioned network topologies considered. In this thesis, we investigate the effect of varying cache parameters such as the sizes of L1 Instruction cache, L1 Data cache and L2 cache and their respective associativities on each network topology. Various combinations of all these four parameters were considered as we ran experiments. We use the gem5 Computer Architecture Simulator for running our experiments with 4 core models. For benchmark purposes, we use the SPLASH-2 set of {\it `High Performance Computing'} benchmarks. A benchmark is assigned to each core. We also observe the effects of running benchmarks with similar characteristics on all cores versus comparing them with a set of different benchmarks while keeping all other parameters constant. Through our results, we attempt to give researchers and the industry at large a better view of the advantages and disadvantages along with the relationship between multi-cores, the cache and network topologies for multi-core systems.
Identifer | oai:union.ndltd.org:siu.edu/oai:opensiuc.lib.siu.edu:theses-1998 |
Date | 01 December 2012 |
Creators | Rajkumar, Robin Kingsley |
Publisher | OpenSIUC |
Source Sets | Southern Illinois University Carbondale |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Theses |
Page generated in 0.0021 seconds