Global ETD Search

31	Petri Net Model Based Energy Optimization Of Programs Using Dynamic Voltage And Frequency Scaling Arun, R 06 1900 (has links) (PDF) High power dissipation and on-chip temperature limit performance and affect reliability in modern microprocessors. For servers and data centers, they determine the cooling cost, whereas for handheld and mobile systems, they limit the continuous usage of these systems. For mobile systems, energy consumption affects the battery life. It cannot be ignored for desktop and server systems either, as the contribution of energy to organizations’ budgets continues to rise, influencing strategic decisions, and its implications for the environment are increasingly appreciated. Intelligent trade-offs involving these quantities are critical to meet the performance demands of many modern applications. Dynamic Voltage and Frequency Scaling (DVFS) offers a huge potential for designing trade-offs involving energy, power, temperature and performance of computing systems. In our work, we propose and evaluate DVFS schemes that aim at minimizing energy consumption while meeting a performance constraint, for both sequential and parallel applications. We propose a Petri net based program performance model, parameterized by application properties, microarchitectural settings and system resource configuration, and use this model to find energy efficient DVFS settings. We first propose a DVFS scheme using this model for sequential programs running on single core multiple clock domain (MCD) processors, and evaluate this on an MCD processor simulator. We then extend this scheme for data parallel (Single Program Multiple Data style) applications, and then generalize it for stream applications as well, and evaluate these two schemes on a full system CMP simulator. Our experimental evaluation shows that the proposed schemes achieve significant energy savings for a small performance degradation.
Dynamic Voltage Frequency Scaling Petri Net Model Mobile Telephones Cellular Telephones Batteries (Electric) Stream Programs Multiple Clock Domain (MCD) Petri Nets Data Parallel Programs Multithreaded Programs Electronic Engineering
32	Design and Characterization of SRAMs for Ultra Dynamic Voltage Scalable (U-DVS) Systems Viveka, K R January 2016 (has links) (PDF) The ever expanding range of applications for embedded systems continues to offer new challenges (and opportunities) to chip manufacturers. Applications ranging from exciting high resolution gaming to routine tasks like temperature control need to be supported on increasingly small devices with shrinking dimensions and tighter energy budgets. These systems benefit greatly by having the capability to operate over a wide range of supply voltages, known as ultra dynamic voltage scaling (U-DVS). This refers to systems capable of operating from nominal voltages down to sub-threshold voltages. Memories play an important role in these systems with future chips estimated to have over 80% of chip area occupied by memories. This thesis presents the design and characterization of an ultra dynamic voltage scalable memory (SRAM) that functions from nominal voltages down to sub-threshold voltages without the need for external support. The key contributions of the thesis are as follows: 1) A variation tolerant reference generation for single ended sensing: We present a reference generator, for U-DVS memories, that tracks the memory over a wide range of voltages and is tunable to allow functioning down to sub-threshold voltages. Replica columns are used to generate the reference voltage which allows the technique to track slow changes such as temperature and aging. A few configurable cells in the replica column are found to be sufficient to cover the whole range of voltages of interest. The use of a tunable delay line to generate timing is shown to help in overcoming the effects of process variations. 2) Random-sampling based tuning algorithm: Tuning is necessary to overcome the increased effects of variation at lower voltages. 
We present a random-sampling based BIST tuning algorithm that significantly speeds up the tuning, ensuring that the time required to tune is comparable to a single MBIST algorithm. Further, the use of redundancy after delay tuning enables maximum utilization of redundancy infrastructure to reduce power consumption and enhance performance. 3) Testing and Characterization for U-DVS systems: Testing and characterization is an important challenge in U-DVS systems that has remained largely unexplored. We propose an iterative technique that allows realization of an on-chip oscilloscope with minimal area overhead. The all-digital nature of the technique makes it simple to design and implement across technology nodes. Combining the proposed techniques allows the designed 4 Kb SRAM array to function from 1.2 V down to 310 mV with reads functioning down to 190 mV. This would contribute towards moving ultra wide voltage operation a step closer towards implementation in commercial designs. Ultra Dynamic Voltage Scalable System SRAM Array Design U-DVS Systems U-DVS SRAM Design SRAM Array Design Ultra-Low Voltage Systems Ultra Dynamic Voltage Scalable Memory Tuning SRAMs SRAM Sense Amplifiers Communication Engineering
33	Software Controlled Clock Modulation for Energy Efficiency Optimization on Intel Processors Schöne, Robert, Ilsche, Thomas, Bielert, Mario, Molka, Daniel, Hackenberg, Daniel 24 October 2017 (has links) Current Intel processors implement a variety of power saving features like frequency scaling and idle states. These mechanisms limit the power draw and thereby decrease the thermal dissipation of the processors. However, they also have an impact on the achievable performance. The various mechanisms significantly differ regarding the amount of power savings, the latency of mode changes, and the associated overhead. In this paper, we describe and closely examine the so-called software controlled clock modulation mechanism for different processor generations. We present results that imply that the available documentation is not always correct and describe when this feature can be used to improve energy efficiency. We additionally compare it against the more popular feature of dynamic voltage and frequency scaling and develop a model to decide which feature should be used to optimize inter-process synchronizations on Intel Haswell-EP processors. info:eu-repo/classification/ddc/004 ddc:004
34	Selective Core Boosting: The Return of the Turbo Button Wamhoff, Jons-Tobias, Diestelhorst, Stephan, Fetzer, Christof, Marlier, Patrick, Felber, Pascal, Dice, Dave 26 November 2013 (has links) Several modern multi-core architectures support the dynamic control of the CPU's clock rate, allowing processor cores to temporarily operate at speeds exceeding the operational base frequency. Conversely, cores can operate at a lower speed or be disabled altogether to save power. Such facilities are notably provided by Intel's Turbo Boost and AMD's Turbo CORE technologies. Frequency control is typically driven by the operating system which requests changes to the performance state of the processor based on the current load of the system. In this paper, we investigate the use of dynamic frequency scaling from user space to speed up multi-threaded applications that must occasionally execute time-critical tasks or to solve problems that have heterogeneous computing requirements. We propose a general-purpose library that allows selective control of the frequency of the cores - subject to the limitations of the target architecture. We analyze the performance trade-offs and illustrate its benefits using several benchmarks and real-world workloads when temporarily boosting selected cores executing time-critical operations. While our study primarily focuses on AMD's architecture, we also provide a comparative evaluation of the features, limitations, and runtime overheads of both Turbo Boost and Turbo CORE technologies. Our results show that we can successfully exploit these new hardware facilities to accelerate the execution of key sections of code (critical paths), improving overall performance of some multi-threaded applications. Unlike prior research, we focus on performance instead of power conservation. Our results can further provide guidelines for the design of hardware power management facilities and the operating system interfaces to those facilities. 
info:eu-repo/classification/ddc/004 ddc:004
35	Application-Directed DVFS using Multiple Clock Domains on Graphics Hardware Li, Juan 14 January 2009 (has links) As handheld devices have become increasingly popular, powerful programmable graphics hardware for mobile and handheld devices has been deployed. While many resources on mobile devices are limited, the predominant problem for mobile devices is their limited battery power. Several techniques have been proposed to increase the energy efficiency of mobile applications and improve battery life. In this thesis, we propose a new dynamic voltage and frequency scaling (DVFS) scheme for Graphics Processing Units (GPUs). In most cases, cues within the graphics application can be used to predict portions of a GPU that will be used or unused when the application is run. We partition the GPU into six clock domains that can be clocked at different rates. Specifically, each domain has its own voltage and frequency setting based on its predicted workload to save energy without reducing applications' frame rates. In addition, we propose a signature-based algorithm for predicting the workload offered to our six clock domains by a given application to decide voltage and frequency settings. We conduct experiments and compare the results of our new signature-based workload prediction algorithm with some other traditional interval-based workload prediction algorithms. Our results show that our signature-based prediction can save 30-50% energy without affecting application frame rates. Energy Graphics Process Unit (GPU) Multiple Clock Domain (MCD) Pocket computers Computer graphics
36	A Nonlinear Programming Approach for Dynamic Voltage Scaling Ardi, Shanai January 2005 (has links) Embedded computing systems in portable devices need to be energy efficient, yet they have to deliver adequate performance to the often computationally expensive applications. Dynamic voltage scaling is a technique that offers a speed versus power trade-off, allowing the application to achieve considerable energy savings and, at the same time, to meet the imposed time constraints. In this thesis, we explore the possibility of using optimal voltage scaling algorithms based on nonlinear programming at the system level, for a complex multiprocessor scheduling problem. We present an optimization approach to the modeled nonlinear programming formulation of the continuous voltage selection problem excluding the consideration of transition overheads. Our approach achieves the same optimal results as the previous work using the same model, but due to its speed, can be efficiently used for design space exploration. We validate our results using numerous automatically generated benchmarks. Datorsystem Low Power Design Dynamic Voltage Scaling Nonlinear Programming AMPL Application Program Interface. Datorsystem Computer and systems science Data- och systemvetenskap
37	A Study on Wind Turbine Low Voltage Ride Through Capability Enhancement by STATCOM and DVR Lin, Chih-peng 05 February 2010 (has links) When more induction generator based wind farms are integrated into the power system, system voltage dips and stability problems may arise due to the draw of reactive power by induction generators. Wind turbine trips induced by power system short-circuit events could result in power imbalance and lead to power system instability. This thesis studies the influence of two compensation techniques on the wind turbine low voltage ride-through (LVRT) capability. One is a parallel compensation by a static synchronous compensator (STATCOM), and the other is a series compensation by a dynamic voltage restorer (DVR). In this study, Matlab tools and models are used to simulate an active-stall controlled fixed-speed induction generator connected to a power system. Two system configurations are used to simulate three phase faults and compare the improvement of wind turbine LVRT capability due to the two studied compensation techniques. Simulation results indicate that a wind turbine compensated by a DVR would have better LVRT performance than one compensated by a STATCOM in dealing with the low voltage situations due to system faults. Static Synchronous Compensator Wind Farm Low Voltage Ride-Through Capability Induction Wind Generator Dynamic Voltage Restorer Voltage Tolerance Curve
38	A Generalized Framework for Energy Savings in Real-Time Multiprocessor Systems Zeng, Gang, Yokoyama, Tetsuo, Tomiyama, Hiroyuki, Takada, Hiroaki 11 1900 (has links) No description available. embedded real-time systems energy-aware multiprocessor scheduling dynamic hardware resource configuration dynamic voltage frequency scaling dynamic power management
39	E³ : energy-efficient EDGE architectures Govindan, Madhu Sarava 13 December 2010 (has links) Increasing power dissipation is one of the most serious challenges facing designers in the microprocessor industry. Power dissipation, increasing wire delays, and increasing design complexity have forced industry to embrace multi-core architectures or chip multiprocessors (CMPs). While CMPs mitigate wire delays and design complexity, they do not directly address single-threaded performance. Additionally, programs must be parallelized, either manually or automatically, to fully exploit the performance of CMPs. Researchers have recently proposed an architecture called Explicit Data Graph Execution (EDGE) as an alternative to conventional CMPs. EDGE architectures are designed to be technology-scalable and to provide good single-threaded performance as well as exploit other types of parallelism including data-level and thread-level parallelism. In this dissertation, we examine the energy efficiency of a specific EDGE architecture called TRIPS Instruction Set Architecture (ISA) and two microarchitectures called TRIPS and TFlex that implement the TRIPS ISA. TRIPS microarchitecture is a first-generation design that proves the feasibility of the TRIPS ISA and distributed tiled microarchitectures. The second-generation TFlex microarchitecture addresses key inefficiencies of the TRIPS microarchitecture by matching the resource needs of applications to a composable hardware substrate. First, we perform a thorough power analysis of the TRIPS microarchitecture. We describe how we develop architectural power models for TRIPS. We then improve power-modeling accuracy using hardware power measurements on the TRIPS prototype combined with detailed Register Transfer Level (RTL) power models from the TRIPS design. 
Using these refined architectural power models and normalized power modeling methodologies, we perform a detailed performance and power comparison of the TRIPS microarchitecture with two different processors: 1) a low-end processor designed for power efficiency (ARM/XScale) and 2) a high-end superscalar processor designed for high performance (a variant of Power4). This detailed power analysis provides key insights into the advantages and disadvantages of the TRIPS ISA and microarchitecture compared to processors on either end of the performance-power spectrum. Our results indicate that the TRIPS microarchitecture achieves 11.7 times better energy efficiency compared to ARM, and approximately 12% better energy efficiency than Power4, in terms of the Energy-Delay-Squared (ED²) metric. Second, we evaluate the energy efficiency of the TFlex microarchitecture in comparison to TRIPS, ARM, and Power4. TFlex belongs to a class of microarchitectures called Composable Lightweight Processors (CLPs). CLPs are distributed microarchitectures designed with simple cores and are highly configurable at runtime to adapt to resource needs of applications. We develop power models for the TFlex microarchitecture based on the validated TRIPS power models. Our quantitative results indicate that by better matching execution resources to the needs of applications, the composable TFlex system can operate in both regimes of low power (similar to ARM) and high performance (similar to Power4). We also show that the composability feature of TFlex achieves a significant improvement (2 times) in the ED² metric compared to TRIPS. Third, using TFlex as our experimental platform, we examine the efficacy of processor composability as a potential performance-power trade-off mechanism. Most modern processors support a form of dynamic voltage and frequency scaling (DVFS) as a performance-power trade-off mechanism. 
Since the rate of voltage scaling has slowed significantly in recent process technologies, processor designers are in dire need of alternatives to DVFS. In this dissertation, we explore processor composability as an architectural alternative to DVFS. Through experimental results we show that processor composability achieves almost as good performance-power trade-offs as pure frequency scaling (no changes in supply voltages), and a much better performance-power trade-off compared to voltage and frequency scaling (both supply voltage and frequency change). Next, we explore the effects of additional performance-improving techniques for the TFlex system on its energy efficiency. Researchers have proposed a variety of techniques for improving the performance of the TFlex system. These include: (1) block mapping techniques to trade off intra-block concurrency with communication across the operand network; (2) predicate prediction; and (3) operand multi-cast/broadcast mechanisms. We examine each of these mechanisms in terms of its effect on the energy efficiency of TFlex, and our experimental results demonstrate the effects of operand communication and speculation on the energy efficiency of TFlex. Finally, this dissertation evaluates a set of fine-grained power management (FGPM) policies for TFlex: instruction criticality and controlled speculation. These policies rely on a temporally and spatially fine-grained dynamic voltage and frequency scaling (DVFS) mechanism for improving power efficiency. The instruction criticality policy seeks to improve power efficiency by mapping critical computation in a program to higher performance-power levels, and by mapping non-critical computation to lower performance-power levels. The controlled speculation policy, on the other hand, maps blocks that are highly likely to be on the correct execution path in a program to higher performance levels, and the other blocks to lower performance levels. 
Our experimental results indicate that idealized instruction criticality and controlled speculation policies improve the operating range and flexibility of the TFlex system. However, when the actual overheads of fine-grained DVFS, especially energy conversion losses of voltage regulator modules (VRMs), are considered, the power efficiency advantages of these idealized policies quickly diminish. Our results also indicate that the current conversion efficiencies of on-chip VRMs need to improve to as high as 95% for the realistic policies to be feasible. Energy efficiency EDGE architectures Power efficiency Composability DVFS Power management Dynamic voltage and frequency scaling
40	System Level Energy Optimization Techniques for a Digital Load Supplied with a DC-DC Converter Parayandeh, Amir 09 August 2013 (has links) The demand to integrate more features has significantly increased the complexity and power consumption of smart portable devices. Therefore, extending the battery lifetime has become a major challenge and new approaches are required to decrease the power consumed from the source. Traditionally the focus has been on reducing the dynamic power consumption of the digital circuits used in these devices. However, as process technologies scale, reducing the dynamic power has become less effective due to the increased impact of the leakage power. Alternatively, a more effective approach to minimize the power consumption is to continuously optimize the ratio of the dynamic and leakage power while delivering the required performance. This work presents a novel power-aware system for dynamic minimum power point tracking of digital loads in portable applications. The system integrates a dc-dc converter power-stage and the supplied digital circuit. The integrated dc-dc converter IC utilizes a mixed-signal current program mode (CPM) controller to regulate the supply voltage of the digital load IC. This embedded converter inherently measures the power consumption of the load in real-time, eliminating the need for additional power sensing circuitry. Based on the information available in the CPM controller, a minimum power point tracking (MiPPT) controller sets the supply and threshold voltages for the digital load to minimize its power consumption while maintaining a target frequency. The 10 MHz mixed-signal CPM controlled dc-dc converter and the digital load are fabricated in 0.13 µm IBM technology. Experimental results verify that the introduced system results in up to 30% lower power consumption from the battery source. 0544
