Return to search

Modeling and Optimization of Delay and Power for Key Components of Modern High-performance Processors

In designing a new processor, computer architects consider a myriad of possible organizations and designs to decide which best meets the constraints on performance, power and cost for each particular processor. To identify practical designs, architects need to have insight into the physical-level characteristics (delay, power and area) of various components of modern processors implemented in recent fabrication technologies. During early stages of design exploration, however, developing physical-level implementations for various design options (often in the order of thousands) is impractical or undesirable due to time and/or cost constraints. In lieu of actual measurements, analytical and/or empirical models can offer reasonable estimates of these physical-level characteristics. However, existing models tend to be out-dated for three reasons: (i) They have been developed based on old circuits in old fabrication technologies; (ii) The high-level designs of the components have evolved and older designs may no longer be representative; and, (iii) The overall architecture of processors has changed significantly, and new components for which no models exist have been introduced or are being considered.
This thesis studies three key components of modern high-performance processors: Counting Bloom Filters (CBFs), Checkpointed Register Alias Tables (RATs), and Compacted Matrix Schedulers (CMSs). CBFs optimize membership tests (e.g., whether a block is cached). RAT and CMS increase the opportunities for exploiting instruction-level parallelism; RAT is the core of the renaming stage, and CMS is an implementation for the instruction scheduler. Physical-level studies or models for these components have been limited or non-existent. In addition to investigating these components at the physical level, this thesis (i) proposes a novel speed- and energy-efficient CBF implementation; (ii) studies how the number of RAT checkpoints affects its latency and energy, and overall processor performance; and, (iii) studies the CMS and its accompanying logic at the physical level. This thesis also develops empirical and analytical latency and energy models that can be adapted for newer fabrication technologies. Additionally, this thesis proposes physical-level latency and energy optimizations for these components motivated by design inefficiencies exposed during the physical-level study phase.

Identiferoai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:OTU.1807/24327
Date13 April 2010
CreatorsSafi, Elham
ContributorsMoshovos, Andreas
Source SetsLibrary and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Languageen_ca
Detected LanguageEnglish
TypeThesis

Page generated in 0.0022 seconds