1 |
Implementation of coarse-grain coherence tracking support in ring-based multiprocessors
Coté, Edmond A. 25 October 2007
As the number of processors in multiprocessor system-on-chip devices continues to increase, the complexity required for full cache coherence support is often unwarranted for application-specific designs. Bus-based interconnects are no longer suitable for larger-scale systems, and the logic and storage overhead associated with a complex packet-switched network and directory-based cache coherence may be undesirable in single-chip systems. Unidirectional rings are a suitable alternative because they offer many properties favorable both to on-chip implementation and to supporting cache coherence. Reducing the overhead of cache coherence traffic is, however, a concern for these systems.
This thesis adapts two filter structures that are based on principles of coarse-grained coherence tracking, and applies them to a ring-based multiprocessor. The first structure tracks the total number of blocks of remote data cached by all processors in a node for a set of regions, where a region is a large area of memory referenced by the upper bits of an address. The second structure records regions of local data whose contents are not cached by any remote node. When used together to filter incoming or outgoing requests, these structures reduce the extent of coherence traffic and limit the transmission of coherent requests to the necessary parts of the system.
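The two filters described above can be illustrated with a minimal software sketch. This is not the thesis's hardware design: the region size (REGION_BITS), the class and method names, and the use of a dictionary and a set in place of fixed-size hardware tables are all assumptions made for illustration only.

```python
# Hypothetical sketch of the two coarse-grained filters; all names and
# sizes here are illustrative assumptions, not taken from the thesis.

REGION_BITS = 12  # assumed: a region is identified by address bits above bit 11


def region_of(addr: int) -> int:
    """Return the region number, i.e. the upper bits of the address."""
    return addr >> REGION_BITS


class RemoteRegionCounter:
    """Counts blocks of remote data cached in this node, per region."""

    def __init__(self) -> None:
        self.counts: dict[int, int] = {}

    def on_fill(self, addr: int) -> None:
        # A processor in this node cached a block of remote data.
        region = region_of(addr)
        self.counts[region] = self.counts.get(region, 0) + 1

    def on_evict(self, addr: int) -> None:
        # A cached block of remote data left this node's caches.
        region = region_of(addr)
        remaining = self.counts.get(region, 0) - 1
        if remaining > 0:
            self.counts[region] = remaining
        else:
            self.counts.pop(region, None)

    def must_snoop(self, addr: int) -> bool:
        # An incoming request needs a local snoop only if the node
        # holds at least one block from the requested region.
        return self.counts.get(region_of(addr), 0) > 0


class LocalUnsharedRegions:
    """Records regions of local data that no remote node has cached."""

    def __init__(self) -> None:
        self.unshared: set[int] = set()

    def mark_unshared(self, addr: int) -> None:
        self.unshared.add(region_of(addr))

    def on_remote_access(self, addr: int) -> None:
        # Once a remote node touches the region, it may be cached remotely.
        self.unshared.discard(region_of(addr))

    def must_broadcast(self, addr: int) -> bool:
        # An outgoing request for local data can stay inside the node
        # when the region is known not to be cached remotely.
        return region_of(addr) not in self.unshared
```

In this sketch the first filter screens incoming requests and the second screens outgoing ones. Both err on the safe side: a request is forwarded or snooped unless the filter can show that it is unnecessary.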
A complete single-chip multiprocessor system that includes the proposed filters is designed and implemented in programmable logic for this thesis. The system is composed of nodes of bus-based multiprocessors, and each node includes a common memory, two or more pipelined 32-bit processors with coherent data caches, a split-transaction bus with separate lines for requests and responses, and an interface for the system-level ring interconnect. Two coarse-grained filters are attached to each node to reduce the impact of coherence traffic on the system. Cache coherence within the node is enforced through bus snooping, while coherence across the interconnect is supported by a reduced-complexity ring snooping protocol. Main memory is globally shared and is physically distributed among the nodes.
Results are presented to highlight the system's key implementation points. Synthesis results are presented in order to evaluate hardware overhead, and operational results are shown to demonstrate the functionality of the multiprocessor system and of the filter structures. / Thesis (Master, Electrical & Computer Engineering) -- Queen's University, 2007-10-24 10:16:47.81 / Financial support for this work was provided by the Natural Sciences and Engineering Research Council of Canada, Communications and Information Technology Ontario, and Queen's University.
|
2 |
Integration of Production Scheduling and Energy Management: Software Development
Ait-Ali, Abderrahman January 2015
Demand-Side Management concepts have the potential to positively affect both the financial and the environmental aspects of energy-intensive industries. More specifically, they allow industrial plants to reduce their energy costs by adapting to fluctuations in energy availability. In this context, efficient frameworks for energy-aware scheduling have been studied and have shown potential to reduce the overall energy bill of energy-intensive industries such as stainless steel and paper plants. Those frameworks usually combine scheduling and energy optimization into one monolithic system. This work investigates the possibility of integrating the two systems through a specific exchange of signals, while keeping the scheduling model separate from the energy-cost optimization model. Such integration means that pre-existing schedulers and energy optimizers can be easily modified and reused without re-implementing a whole new system. Two industrial problems with different scheduling approaches are studied. The first concerns pulp and paper production, which uses the Resource Task Network (RTN) scheduling approach. The second concerns stainless steel production, which is based on a bi-level heuristic implementation of an improved energy-aware scheduler. This work presents the decomposition methods available in the literature and their application to the two industrial problems. Besides an improvement to the RTN approach for handling storage, this thesis describes a prototype implementation of the energy-aware RTN scheduler for pulp and paper production. Furthermore, this work investigates the performance of different decomposition methods on different problem instances.
The numerical case studies show that, even though decomposition decreases solution quality compared to the monolithic system, it still gives good solutions within an acceptable time, with the advantage of keeping two separate pre-existing systems that simply exchange signals.
|