1 |
An Experimental Evaluation of Real-Time DVFS Scheduling Algorithms. Saha, Sonal, 12 September 2011.
Dynamic voltage and frequency scaling (DVFS) is an extensively studied energy management technique, which aims to reduce the energy consumption of computing platforms by dynamically scaling the CPU frequency. Real-time DVFS (RT-DVFS) is a branch of DVFS that reduces CPU energy consumption through frequency scaling while ensuring that task time constraints are satisfied by constructing appropriate real-time task schedules. The literature presents numerous RT-DVFS scheduling algorithms, which employ different techniques to utilize the CPU idle time to scale the frequency. Many of these algorithms have been experimentally studied through simulations, but have not been implemented on real hardware platforms. Though simulation-based experimental studies can provide a first-order understanding, implementation-based studies can reveal actual timeliness and energy consumption behaviours. This is particularly important when it is difficult to devise accurate simulation models of hardware, which is increasingly the case with modern systems.
In this thesis, we study the timeliness and energy consumption behaviours of fourteen state-of-the-art RT-DVFS schedulers by implementing and evaluating them on two hardware platforms. The schedulers include CC-EDF, LA-EDF, REUA, DRA, and AGR1, among others, and the hardware platforms include an ASUS laptop with an Intel i5 processor and a motherboard with an AMD Zacate processor. We implemented these schedulers in the ChronOS real-time Linux kernel and measured their actual timeliness and energy behaviours under a range of workloads, including CPU-intensive, memory-intensive, mutual-exclusion-lock-intensive, and processor-underloaded and overloaded workloads.
Our studies reveal that modelling CPU power consumption as the cube of CPU frequency can lead to incorrect conclusions. In particular, it ignores the idle-state CPU power consumption, which is orders of magnitude smaller than the active power consumption. Consequently, power savings obtained by exclusively optimizing active power consumption (i.e., RT-DVFS) may be offset by completing tasks sooner at the highest frequency and transitioning to the idle state earlier (i.e., no DVFS). Thus, the active power savings of the RT-DVFS techniques that we report are orders of magnitude smaller than the simulation-based savings reported in the literature. / Master of Science
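The trade-off described above can be made concrete with a toy model. The sketch below is not from the thesis; it assumes a cubic active-power law, a fixed non-zero idle power, and a task that needs a fixed number of cycles per period, and compares finishing the work at a scaled frequency against racing to idle at full speed. All constants are assumed values chosen only for illustration.

```python
# Toy model (not from the thesis): energy per period for a task needing a fixed
# number of cycles, either finished at a scaled frequency (RT-DVFS style) or run
# at full speed followed by an idle interval (race-to-idle). All constants assumed.

F_MAX = 2.0e9      # maximum CPU frequency, Hz (assumed)
K_DYN = 1.0e-27    # active power coefficient: P_active = K_DYN * f**3 (assumed)
P_IDLE = 0.5       # idle-state power, W (assumed non-zero)
CYCLES = 1.0e9     # CPU cycles the task needs per period
PERIOD = 1.0       # task period, s

def energy_per_period(freq):
    """Energy over one period: active energy while running plus idle energy afterwards."""
    busy_time = CYCLES / freq
    assert busy_time <= PERIOD, "frequency too low to meet the deadline"
    return K_DYN * freq**3 * busy_time + P_IDLE * (PERIOD - busy_time)

if __name__ == "__main__":
    for scale in (1.0, 0.75, 0.5):
        f = F_MAX * scale
        print(f"f = {f / 1e9:.2f} GHz -> {energy_per_period(f):.3f} J per period")
```

Which strategy wins in this toy model hinges entirely on the assumed constants, especially the idle power; the thesis's point is that such analytical assumptions can diverge sharply from what is measured on real hardware.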
|
2 |
MIMO channel modelling for indoor wireless communications. Maharaj, Bodhaswar Tikanath Jugpershad, 29 July 2008.
This thesis investigates multiple-input multiple-output (MIMO) channel modelling for a wideband indoor environment. Initially, the theoretical basis of geometric modelling for a typical indoor environment is examined and a space-time model is formulated. The transmit and receive antenna correlations are then separated and expressed in terms of antenna element spacing, the scattering parameter, the mean angle of arrival and the number of antenna elements employed, and the effect of these parameters on capacity in this environment is analyzed. The wideband indoor channel operating at centre frequencies of 2.4 GHz and 5.2 GHz is then investigated. The concept of MIMO frequency scaling is introduced and applied to the data obtained in the measurement campaign undertaken at the University of Pretoria. Issues of frequency scaling of capacity, spatial correlation and the joint RX/TX double-directional channel response for this indoor environment are investigated. The maximum entropy (ME) approach to MIMO channel modelling is investigated and a new basis is developed for the determination of the covariance matrix when only the RX/TX covariance is known. Finally, results comparing this model with the established Kronecker model and its application to the joint RX/TX spatial power spectra, using a beamformer, are evaluated. Conclusions are then drawn and future research opportunities are highlighted. / Thesis (PhD)--University of Pretoria, 2008. / Electrical, Electronic and Computer Engineering / unrestricted
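As a point of reference for the capacity and correlation quantities mentioned above, the following sketch (not the thesis's model or data) estimates ergodic MIMO capacity under the widely used Kronecker correlation assumption; the array sizes, SNR, and exponential correlation profile are arbitrary illustrative choices.

```python
# Illustrative only (not the thesis's model): Monte-Carlo estimate of ergodic MIMO
# capacity, C = E[log2 det(I + (SNR/Nt) * H * H^H)], with H drawn under the
# Kronecker correlation assumption H = R_rx^(1/2) * H_iid * R_tx^(1/2).
import numpy as np

def kronecker_channel(r_rx, r_tx, rng):
    """One correlated Rayleigh channel realization (Cholesky factors as matrix square roots)."""
    nr, nt = r_rx.shape[0], r_tx.shape[0]
    h_iid = (rng.standard_normal((nr, nt)) + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)
    return np.linalg.cholesky(r_rx) @ h_iid @ np.linalg.cholesky(r_tx).conj().T

def ergodic_capacity(r_rx, r_tx, snr_db=10.0, trials=2000, seed=0):
    """Average capacity in bit/s/Hz over random channel realizations."""
    rng = np.random.default_rng(seed)
    snr, nt = 10 ** (snr_db / 10), r_tx.shape[0]
    caps = []
    for _ in range(trials):
        h = kronecker_channel(r_rx, r_tx, rng)
        m = np.eye(h.shape[0]) + (snr / nt) * h @ h.conj().T
        caps.append(np.log2(np.linalg.det(m).real))
    return float(np.mean(caps))

if __name__ == "__main__":
    n = 4
    uncorr = np.eye(n)
    corr = np.array([[0.7 ** abs(i - j) for j in range(n)] for i in range(n)])  # assumed profile
    print("uncorrelated 4x4:", round(ergodic_capacity(uncorr, uncorr), 2), "bit/s/Hz")
    print("correlated 4x4:  ", round(ergodic_capacity(corr, corr), 2), "bit/s/Hz")
```

Higher spatial correlation lowers the estimated capacity, the kind of effect the thesis quantifies from measured indoor data.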
|
3 |
Power-Performance-Predictability: Managing the Three Cornerstones of Resource Constrained Real-Time System Design. Mukherjee, Anway, 08 November 2019.
This dissertation explores several challenges that plague the hardware-software co-design of popular resource-constrained real-time embedded systems. We tackle existing real-world problems and address them through design solutions that are highly scalable and practically feasible, as verified by implementing them on real-world hardware.
We address the problem of poor battery life in mobile embedded devices caused by the side-by-side execution of multiple applications in split-screen mode. Existing industry solutions either restrict the number of applications that can run simultaneously, limit their functionality, or increase the capacity of the system's battery. We exploit the gap in research on the performance-power trade-off in smartphones to propose an integrated energy management solution that judiciously minimizes system-wide energy consumption with negligible effect on quality of service (QoS).
Another important real-world requirement in today's interconnected world is the need for security. In the domain of real-time computing, it is not only necessary to secure the system but also to maintain its timeliness. Example security mechanisms that may be used in a hard real-time system include, but are not limited to, security keys, protection of the intellectual property (IP) of firmware and application software, one-time passwords (OTP) for on-the-fly software certification, and authenticated computational off-loading. Existing design solutions require expensive, custom-built hardware with a long time-to-market or time-to-deployment cycle. A readily available alternative is the use of a trusted execution environment (TEE) on commercial off-the-shelf (COTS) embedded processors. However, utilizing a TEE creates multiple challenges from a real-time perspective. First, it adds time overhead that can result in deadline misses. Second, trusted execution may adversely affect the deterministic execution of the system, as tasks running inside a TEE may need to communicate with other tasks executing on the native real-time operating system. We propose three different solutions to address the need for a new task model that can capture the complex relationship between performance and predictability for real-time tasks that require secure execution inside a TEE. We also present novel task assignment and scheduling frameworks for real-time trusted execution on COTS processors to improve task set schedulability. We extensively assess the pros and cons of our proposed approaches against state-of-the-art techniques on custom-built real-world hardware for feasibility, and in simulated environments to test our solutions' scalability.
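To make the scheduling side of this concrete, the sketch below is a deliberately minimal, hypothetical example, not the dissertation's task model or analysis. It inflates each task's worst-case execution time by an assumed per-job TEE world-switch overhead and applies the classical EDF utilization bound for implicit-deadline tasks.

```python
# Hypothetical sketch (not the dissertation's task model): a Liu-and-Layland style
# EDF utilization test in which each job's WCET is inflated by an assumed overhead
# for entering and leaving the trusted execution environment.

def edf_feasible_with_tee(tasks, tee_overhead):
    """tasks: list of (wcet, period) pairs; tee_overhead: assumed extra time per job
    spent on TEE world switches. Returns True if total utilization is at most 1."""
    utilization = sum((wcet + tee_overhead) / period for wcet, period in tasks)
    return utilization <= 1.0

if __name__ == "__main__":
    task_set = [(2.0, 10.0), (3.0, 15.0), (5.0, 40.0)]            # (WCET, period) in ms, assumed
    print(edf_feasible_with_tee(task_set, tee_overhead=0.0))      # True:  U = 0.525
    print(edf_feasible_with_tee(task_set, tee_overhead=4.0))      # False: U ~ 1.29
```

Even this crude model shows how quickly TEE overhead can push a comfortably schedulable task set past the bound, which is the kind of effect the proposed task model and scheduling frameworks are designed to capture and mitigate.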
Another technological design problem involves ways to protect confidential proprietary information from being siphoned out of devices by external attackers. Let us consider a surveillance unmanned aerial vehicle (UAV) as an example. The UAV must perform sensitive tasks, such as obtaining coordinates of interest for surveillance, within a given time duration, also known as task deadline. However, an attacker may learn how the UAV communicates with ground control, and take control of the UAV, along with the sensitive information it carries. Therefore, it is crucial to protect such sensitive information from access by an unauthorized party, while maintaining the system's task deadlines.
In this dissertation, we explore these two real-world design problems in depth, observe the challenges associated with them, and present several solutions to tackle the issues. We extensively assess the pros and cons of our proposed approaches in comparison to the state-of-the-art techniques in custom-built real-world hardware, and simulated environments to test our solutions' scalability.
|
4 |
Selective Core Boosting: The Return of the Turbo Button. Wamhoff, Jons-Tobias; Diestelhorst, Stephan; Fetzer, Christof; Marlier, Patrick; Felber, Pascal; Dice, Dave, 26 November 2013.
Several modern multi-core architectures support the dynamic control of the CPU's clock rate, allowing processor cores to temporarily operate at speeds exceeding the operational base frequency. Conversely, cores can operate at a lower speed or be disabled altogether to save power. Such facilities are notably provided by Intel's Turbo Boost and AMD's Turbo CORE technologies. Frequency control is typically driven by the operating system which requests changes to the performance state of the processor based on the current load of the system.
In this paper, we investigate the use of dynamic frequency scaling from user space to speed up multi-threaded applications that must occasionally execute time-critical tasks or to solve problems that have heterogeneous computing requirements. We propose a general-purpose library that allows selective control of the frequency of the cores, subject to the limitations of the target architecture. We analyze the performance trade-offs and illustrate its benefits using several benchmarks and real-world workloads when temporarily boosting selected cores executing time-critical operations. While our study primarily focuses on AMD's architecture, we also provide a comparative evaluation of the features, limitations, and runtime overheads of both Turbo Boost and Turbo CORE technologies. Our results show that we can successfully exploit these new hardware facilities to accelerate the execution of key sections of code (critical paths), improving the overall performance of some multi-threaded applications. Unlike prior research, we focus on performance instead of power conservation. Our results can further provide guidelines for the design of hardware power management facilities and the operating system interfaces to those facilities.
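The paper's library and its interfaces are not reproduced here; purely as an illustration of the general idea, the sketch below raises one core's minimum frequency to its maximum through the standard Linux cpufreq sysfs interface for the duration of a time-critical section. It assumes root privileges and a frequency driver and governor that honour these limits; whether the core actually enters a turbo state remains under hardware control.

```python
# Illustration only (not the paper's library): temporarily pin one core to its
# highest advertised frequency via the Linux cpufreq sysfs interface.
CPUFREQ = "/sys/devices/system/cpu/cpu{cpu}/cpufreq/{attr}"

def read_attr(cpu, attr):
    with open(CPUFREQ.format(cpu=cpu, attr=attr)) as f:
        return f.read().strip()

def write_attr(cpu, attr, value):
    with open(CPUFREQ.format(cpu=cpu, attr=attr), "w") as f:
        f.write(str(value))

class Boost:
    """Context manager: raise scaling_min_freq to scaling_max_freq around a critical section."""
    def __init__(self, cpu):
        self.cpu = cpu
        self.saved_min = None
    def __enter__(self):
        self.saved_min = read_attr(self.cpu, "scaling_min_freq")
        write_attr(self.cpu, "scaling_min_freq", read_attr(self.cpu, "scaling_max_freq"))
        return self
    def __exit__(self, *exc):
        write_attr(self.cpu, "scaling_min_freq", self.saved_min)

# Hypothetical usage: boost core 2 only while the time-critical work runs.
# with Boost(cpu=2):
#     run_time_critical_section()
```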
|
5 |
High efficiency smart voltage regulating module for green mobile computing. Tapou, Monaf Sabri, January 2014.
In this thesis, a smart high-efficiency voltage regulating module is designed that can supply the core of modern microprocessors incorporating dynamic voltage and frequency scaling (DVS) capability. A RISC-based microcontroller provides all the functions required to control and protect the module and to supply the core with the variable operating voltage set by the DVS management system. Voltage regulating modules normally provide maximum power efficiency at the designed peak load, and efficiency falls off as the load moves toward lighter values. A mathematical model was derived for the main converter and small-signal analysis was performed to determine the stability of system operation and to select a control scheme that improves the converter's transient response without requiring intense computational power. A simulation model was built in Matlab/Simulink, and after experimenting with tuned PID and fuzzy logic controllers, a simple fuzzy logic control scheme was selected for the pulse-width-modulated converter; several methods were devised to reduce the computational power required, making the whole system realizable with a low-power RISC-based microcontroller. The same microcontroller also handles circuit adaptation and protects the load against over-voltage and over-current conditions. A novel circuit technique and operation control scheme enables the designed module to selectively change some of the circuit elements in the main pulse-width-modulated buck converter so as to improve efficiency over a wider range of loads. At very light loads, as when the device enters standby, sleep or hibernation mode, a secondary converter starts operating and the main converter stops. The secondary converter adopts a different operating scheme based on a switched-capacitor technique, which provides high efficiency at low load currents. The fuzzy logic control scheme was chosen for the main converter because of its lighter computational requirements, which favour implementation on ultra-low-power embedded controllers. Passive and active components were carefully selected to improve operational efficiency. These measures enable the designed voltage regulating module to achieve efficiency improvements of 3% to 5% in the off-peak load region. At low loads, as when the computer system goes into standby or sleep mode, the efficiency improvement is better than 13%, which makes a noticeable contribution to extending battery run time and thus to lowering the carbon footprint of computing.
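To give a flavour of the kind of lightweight control loop such a module might run, the sketch below is a toy zero-order Sugeno fuzzy rule base that maps the buck converter's output-voltage error to a duty-cycle step. It is not the thesis's controller; the membership bounds, step sizes, and reference voltage are assumed values chosen only for illustration.

```python
# Toy sketch (not the thesis's controller): a tiny fuzzy rule base mapping the
# output-voltage error of a buck converter to a change in PWM duty cycle, the kind
# of low-cost inference a small RISC microcontroller could evaluate each cycle.

def tri(x, a, b, c):
    """Triangular membership function centred at b with support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# error = Vref - Vout (volts); each rule pairs a membership with a duty-cycle step.
RULES = [
    (lambda e: tri(e, -0.30, -0.15, 0.00), -0.02),   # "error negative"  -> shorten on-time
    (lambda e: tri(e, -0.15,  0.00, 0.15),  0.00),   # "error near zero" -> hold
    (lambda e: tri(e,  0.00,  0.15, 0.30), +0.02),   # "error positive"  -> lengthen on-time
]

def duty_step(error):
    """Zero-order Sugeno inference: weighted average of the rule outputs."""
    weights = [mu(error) for mu, _ in RULES]
    total = sum(weights)
    if total == 0.0:
        return -0.02 if error < 0 else 0.02          # saturate outside the universe of discourse
    return sum(w * step for w, (_, step) in zip(weights, RULES)) / total

if __name__ == "__main__":
    duty = 0.50
    for vout in (3.10, 3.32, 3.29):                  # assumed samples, Vref = 3.3 V
        duty = min(max(duty + duty_step(3.3 - vout), 0.0), 1.0)
        print(f"Vout = {vout:.2f} V -> duty = {duty:.3f}")
```

A rule table this small needs only a handful of multiplications and comparisons per control period, which is the property that makes a fuzzy scheme attractive for an ultra-low-power RISC microcontroller.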
|
6 |
Performance prediction for dynamic voltage and frequency scaling. Miftakhutdinov, Rustam Raisovich, 28 October 2014.
This dissertation proves the feasibility of accurate runtime prediction of processor performance under frequency scaling. The performance predictors developed in this dissertation allow processors capable of dynamic voltage and frequency scaling (DVFS) to improve their performance or energy efficiency by dynamically adapting chip or core voltages and frequencies to workload characteristics. The dissertation considers three processor configurations: the uniprocessor capable of chip-level DVFS, the private cache chip multiprocessor capable of per-core DVFS, and the shared cache chip multiprocessor capable of per-core DVFS. Depending on processor configuration, the presented performance predictors help the processor realize 72–85% of average oracle performance or energy efficiency gains. / text
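The dissertation's predictors are considerably more sophisticated than a simple analytical split, but the commonly used linear model below (an assumption, not the dissertation's method) illustrates the basic intuition: only the compute-bound part of execution time stretches when the core slows down, while the memory-bound part does not.

```python
# Common linear frequency-scaling model (not the dissertation's predictor):
#   T(f) = T_mem + T_core * (f_base / f)
# where T_mem is the frequency-insensitive (memory-bound) share of the runtime.

def predict_runtime(t_base, mem_fraction, f_base, f_new):
    """t_base: measured runtime at f_base; mem_fraction: assumed share of t_base
    spent stalled on memory and therefore unaffected by core frequency."""
    t_mem = t_base * mem_fraction
    t_core = t_base - t_mem
    return t_mem + t_core * (f_base / f_new)

if __name__ == "__main__":
    # Assumed example: 10 s at 2.0 GHz, 40% of it memory-bound.
    for f in (1.0e9, 1.5e9, 2.0e9, 3.0e9):
        print(f"{f / 1e9:.1f} GHz -> {predict_runtime(10.0, 0.4, 2.0e9, f):.2f} s")
```

Simple models of this form can misestimate workloads with significant memory-level parallelism or prefetching; accurate runtime prediction under such effects is the kind of problem the dissertation targets.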
|
7 |
Energy and speed exploration in digital CMOS circuits in the near-threshold regime for very-wide voltage-frequency scaling. Stangherlin, Kleber Hugo, January 2013.
This thesis assesses the benefits and drawbacks associated with a very wide range of operating frequencies when near-threshold operation is considered. Scaling down the supply voltage in digital CMOS circuits brings great benefits in terms of power reduction. Such scaling comes with a performance penalty, hence in digital synchronous circuits the reduction in operating frequency follows, for a given circuit layout, the VDD reduction. Minimum-energy operation of digital CMOS circuits is commonly associated with the sub-VT regime, carrying huge performance and variability penalties. This thesis shows that it is possible to achieve 8X higher energy efficiency with a very wide range of dynamic voltage-frequency scaling, from nominal voltage down to the lower boundary of near-VT operation. As part of this study, a CMOS digital cell library for such a wide range of frequencies was developed. The cell library is exercised in a 65nm commercial PDK and targets near-VT operation, mitigating variability effects without compromising the design in terms of area and energy at strong inversion. For near-VT or sub-VT operation the cells have to be designed with few stacked transistors. Our study shows that acceptable performance in terms of static-noise margins is obtained for a constrained set of cells in which a maximum of two stacked transistors is allowed; this set includes master-slave registers. We report results for medium-complexity designs, which include a 25kgate notch filter, a 20kgate 8051-compatible core, and 4 combinational and 4 sequential ISCAS benchmark circuits. In this work, the maximum frequency attainable at each supply voltage is studied over a wide range, from 150mV up to the nominal voltage (1.2V). Sub-VT operation is shown to hold the minimum-energy point at roughly 0.29V, which represents a 2X energy saving compared to the near-VT regime. Although energy efficiency peaks in sub-VT for the circuits studied, we also show that at this ultra-low VDD the circuit timing and power suffer from a substantially increased variability impact and a 30X performance drawback with respect to near-VT operation.
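The existence of a minimum-energy supply voltage can be reproduced with a textbook-style model; the sketch below is not the thesis's measured data, and every constant in it is an arbitrary assumed value. Per-cycle energy is split into a switching term that shrinks quadratically with VDD and a leakage term that grows as the critical-path delay stretches at low voltage.

```python
# Textbook-style toy model (not the thesis's data): per-cycle energy versus supply
# voltage, E(V) = C_sw*V^2 + I_leak*V*T_delay(V), using a smooth (EKV-like) on-current
# expression so the delay model spans sub-, near- and super-threshold operation.
import numpy as np

VTH, N_SS, V_T = 0.35, 1.5, 0.026   # threshold voltage, subthreshold slope factor, thermal voltage (assumed)
C_SW = 2e-12                        # switched capacitance per cycle, F (assumed)
I_LEAK = 2e-6                       # total leakage current, A (assumed)
K_DELAY = 2e-8                      # critical-path delay constant (assumed)

def delay(v):
    on_current = np.log1p(np.exp((v - VTH) / (2 * N_SS * V_T))) ** 2
    return K_DELAY / on_current

def energy_per_cycle(v):
    return C_SW * v ** 2 + I_LEAK * v * delay(v)

if __name__ == "__main__":
    vdd = np.linspace(0.15, 1.2, 211)
    e = energy_per_cycle(vdd)
    i = int(np.argmin(e))
    ratio = energy_per_cycle(np.array([1.2]))[0] / e[i]
    print(f"minimum energy at VDD ~ {vdd[i]:.2f} V, about {ratio:.1f}x below the 1.2 V energy")
```

With these assumed constants the minimum falls just below the threshold voltage, and the critical-path delay there is hundreds of times longer than at 1.2 V, qualitatively mirroring the energy-versus-performance trade-off the thesis quantifies experimentally.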
|
8 |
Performance and Power Optimization of GPU Architectures for General-purpose Computing. Wang, Yue, 18 June 2014.
Power-performance efficiency has become a central and challenging focus in heterogeneous processing platforms, as power constraints have to be enforced without hindering high performance. In this dissertation, a framework for optimizing the power and performance of GPUs in the context of general-purpose computing on GPUs (GPGPU) is proposed. To optimize the leakage power of caches in GPUs, we dynamically switch the L1 and L2 caches into low-power modes during periods of inactivity. The L1 cache can be put into a low-leakage (sleep) state when a processing unit is stalled because no threads are ready to be scheduled, and the L2 can be put into the sleep state during idle periods when there are no memory requests. The sleep mode is state-retentive, which obviates the need to flush the caches after they are woken up, thereby avoiding any performance degradation. Experimental results indicate that this technique can reduce leakage power by 52% on average. Further, to improve performance, we redistribute the GPGPU workload across the computing units of the GPU during application execution. The fundamental idea is to monitor the workload on each multi-processing unit and redistribute it by having a portion of its unfinished threads executed on a neighbouring multi-processing unit. Experimental results show this technique improves the performance of the GPGPU workload by 15.7%. Finally, to improve both the performance and the dynamic power of GPUs, we propose two dynamic frequency scaling (DFS) techniques implemented on CPU host threads, one of which is motivated by the significance of pipeline stalls during GPGPU execution. It applies a feedback control algorithm, proportional-integral-derivative (PID), to regulate the frequency of the parallel processors and memory channels based on the occupancy of the memory buffering queues. The other technique targets maximizing the average throughput of all parallel processors under dynamic power constraints. We formalize this target as a linear programming problem and solve it at runtime. According to the simulation results, the first technique achieves more than 22% power savings with a 4% improvement in performance, and the second technique saves 11% power consumption with a 9% performance improvement. The contributions of this dissertation represent a significant advancement in the quest to improve performance and reduce the energy consumption of GPGPU.
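The first DFS technique described above can be pictured with a very small control loop. The sketch below is not the dissertation's implementation; the gains, setpoint, frequency range, and occupancy samples are all assumed, and it only shows the shape of a PID loop that lowers core frequency when the memory buffering queue fills beyond a target occupancy.

```python
# Hypothetical sketch (not the dissertation's implementation): a discrete PID loop
# run on a CPU host thread that nudges the parallel-processor frequency so the
# memory buffering queue stays near a target occupancy. All gains are assumed.

class PID:
    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, measurement):
        """One control period; returns the controller output."""
        error = self.setpoint - measurement
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

F_MIN, F_MAX, F_STEP = 0.5e9, 1.5e9, 1.0e8   # assumed frequency range and output scaling

def next_frequency(f_current, queue_occupancy, pid):
    # Occupancy above the setpoint means the cores outrun memory, the error is
    # negative, and the frequency is lowered; below the setpoint it is raised.
    return min(max(f_current + pid.update(queue_occupancy) * F_STEP, F_MIN), F_MAX)

if __name__ == "__main__":
    pid = PID(kp=1.0, ki=0.1, kd=0.01, setpoint=0.6)     # target 60% queue occupancy (assumed)
    f = 1.0e9
    for occ in (0.9, 0.8, 0.65, 0.55):                   # sampled occupancies (assumed)
        f = next_frequency(f, occ, pid)
        print(f"occupancy {occ:.2f} -> {f / 1e9:.2f} GHz")
```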
|
9 |
Efficient task scheduling with controlled complexity for real-time systems (Ordonnancement de tâches efficace et à complexité maîtrisée pour des systèmes temps-réel). Muhammad, F., 09 April 2009.
The performance of scheduling algorithms has a direct impact on the performance of the complete system. Real-time scheduling algorithms have optimal theoretical schedulability bounds, but this optimality is often reached at the cost of a high number of scheduling events to handle (task preemptions and migrations) and of significant algorithmic complexity. Our view is that, by exploiting the task parameters more effectively, these algorithms can be made more efficient at a controlled cost, with the aim of improving the applications' Quality of Service (QoS). We first propose uniprocessor scheduling algorithms that increase the quality of service of hybrid applications, that is, under overload the execution of soft real-time tasks is maximized while the deadlines of hard real-time tasks are still guaranteed. The scheduling cost of these algorithms (number of preemptions) is also reduced through a better exploitation of the implicit and explicit task parameters. This reduction benefits not only system performance but also energy consumption. We therefore also propose a technique coupled with DVFS (dynamic voltage and frequency scaling) to minimize the number of operating-point changes, since a frequency change implies processor idle time and energy consumption. Multiprocessor scheduling algorithms based on the fluid scheduling model (the notion of fairness) reach optimal schedulability bounds, but this fairness is only guaranteed at the price of assumptions that are unrealistic in practice, because of the very high numbers of task preemptions and migrations they induce. In this thesis, an algorithm called ASEDZL is proposed that is not based on the fluid scheduling model. It not only reduces task preemptions and migrations but also relaxes the assumptions imposed by that model. Finally, we propose using ASEDZL in a hierarchical scheduling approach, which yields better results than classical techniques.
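As a point of vocabulary for the DVFS part of the abstract, the snippet below (a generic illustration, not the thesis's ASEDZL algorithm or its DVFS technique) shows the classical static-EDF rule for choosing an operating point: pick the lowest available frequency at which the task set's utilization, scaled up by f_max/f, still fits under the EDF bound.

```python
# Generic illustration (not the thesis's algorithm): static EDF frequency selection,
# i.e. the lowest operating point at which the scaled utilization stays within 1.

def pick_frequency(tasks, freqs):
    """tasks: (wcet_at_fmax, period) pairs; freqs: available operating frequencies."""
    f_max = max(freqs)
    u_at_fmax = sum(wcet / period for wcet, period in tasks)
    for f in sorted(freqs):
        if u_at_fmax * f_max / f <= 1.0:
            return f
    return None   # not schedulable even at f_max

if __name__ == "__main__":
    tasks = [(1.0, 8.0), (2.0, 10.0), (1.0, 20.0)]              # WCET at f_max and period, ms (assumed)
    print(pick_frequency(tasks, [0.6e9, 0.8e9, 1.0e9, 1.2e9]))  # -> 600 MHz with these numbers
```

Rules like this pick an operating point once, whereas online schemes keep switching it; the thesis's concern is precisely that each switch costs idle time and energy, hence the interest in minimizing the number of operating-point changes.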
|
10 |
An Adaptive Fuzzy Proportional-Integral Predictor for Power Management of 3D Graphics System-On-Chip. Yeh, Jia-huei, 02 August 2010.
With the rapid development of 3D graphics techniques and the growth of portable consumer electronics, 3D graphics are now widely used on handheld devices such as notebooks, PDAs, and smart cellular phones. Processing 3D graphics applications on mobile devices requires a processor capable of handling large, computation-intensive workloads, and such complex computation consumes a great deal of electric power, while the battery lifetime of a handheld device is limited. Meeting this demand therefore shortens the supply time of the device's battery. Moreover, although Moore's law states that the number of transistors on a chip doubles roughly every eighteen months, advances in battery technology have not kept pace with advances in processors. In addition, shrinking chip geometries have lowered the supply voltage of the processor core in portable devices. Considering system efficiency and battery lifetime simultaneously increases the difficulty of designing a power management scheme, so effective power management has become one of the keys to designing handheld products.
For 3D graphics systems, dynamic voltage and frequency scaling (DVFS) is one of the good solutions for implementing a power management policy. DVFS needs an efficient online prediction method to estimate the workload of upcoming frames and then adjust the voltage and frequency appropriately to save energy. Consequently, many related papers have proposed different policies for predicting the executing workload of a 3D graphics system, including signature-based [1], history-based [3], and proportional-integral-derivative (PID) [14] methods. Most designers, however, implement power management in software on the processor. This approach not only delays the power manager's access to information about the execution time of the graphics processing unit (GPU) but also increases the operating overhead of the CPU in a handheld system.
In this paper, we propose a workload prediction scheme for power management in which a proportional-integral (PI) controller acts as the master controller and a fuzzy controller acts as the slave controller, and we implement it as a hardware circuit. Using fuzzy logic to adjust the proportional parameter of the PI controller avoids the drawback of a traditional PI controller, which requires a laborious trial-and-error search for good proportional and integral parameters, thereby improving adaptability and forecasting accuracy. In addition, a Uniform Window-size Predictor 1 (UW1) is implemented as an auxiliary mechanism; applied appropriately, it helps the predicted trend follow the trend of the real workload. Experimental results show that our predictor improves prediction accuracy by about 3.8% on average and saves about 0.02% more energy than the PI predictor [18], while circuit area and power consumption increase by only 6.8% and 1.4%, respectively. We also apply our predictor to Quake II, a commercial 3D first-person game, and the results show that it is indeed an effective prediction policy: its adaptability copes with the intense workload variation of a real game and adjusts the voltage and frequency precisely to decrease power consumption and meet the goal of energy saving.
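A simplified software rendering of the idea (not the thesis's hardware design) is sketched below: a PI predictor of the next frame's workload whose proportional gain is scaled by a coarse fuzzy rule on the relative size of the last prediction error. Gains, membership bounds, and workloads are assumed values.

```python
# Simplified sketch in the spirit of the scheme above (not the thesis's circuit):
# a PI workload predictor whose proportional gain is adapted by a coarse fuzzy rule
# on the relative prediction error of the previous frame.

class FuzzyPIPredictor:
    def __init__(self, kp=0.5, ki=0.05):
        self.kp_base, self.ki = kp, ki
        self.integral = 0.0
        self.prediction = 0.0

    @staticmethod
    def _gain_scale(rel_error):
        """Two fuzzy terms on |error|/actual: 'small' keeps the nominal gain,
        'large' doubles it so the predictor reacts faster (bounds assumed)."""
        small = max(0.0, 1.0 - rel_error / 0.1)
        large = min(1.0, max(0.0, (rel_error - 0.1) / 0.2))
        return (small * 1.0 + large * 2.0) / max(small + large, 1e-9)

    def update(self, actual_workload):
        """Feed the measured workload of the frame just rendered; returns the
        prediction for the next frame (e.g., GPU cycles), used to pick V/f."""
        error = actual_workload - self.prediction
        kp = self.kp_base * self._gain_scale(abs(error) / max(actual_workload, 1e-9))
        self.integral += error
        self.prediction += kp * error + self.ki * self.integral
        return self.prediction

if __name__ == "__main__":
    predictor = FuzzyPIPredictor()
    for cycles in (1.00e6, 1.20e6, 1.30e6, 0.90e6):     # per-frame workloads (assumed)
        print(f"next-frame prediction: {predictor.update(cycles):,.0f} cycles")
```

The prediction then drives the choice of voltage and frequency level for rendering the next frame; UW1-style window averaging could be layered on top when the workload trend is smooth, as the thesis describes.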
|