11 |
Hybrid Fuzzy Kalman Filter for Workload Prediction of 3D Graphic System
Ke, Bao-chen (28 July 2011)
3D graphics systems are now widely used in portable products such as notebooks, PDAs, and smartphones. Unlike desktop systems, these embedded systems run on batteries of finite capacity. Furthermore, rapid improvement in IC process technology has led to rapid growth in the transistor count of a chip. For these reasons, and because of the complex computation that 3D graphics requires, power consumption is very large. Power management is therefore an indispensable technique for extending battery life.
Dynamic voltage and frequency scaling (DVFS) is a popular power management policy. A DVFS scheme needs an accurate workload predictor to predict the workload of every frame. Based on these predictions, a specific voltage and frequency level is applied to each frame of the 3D graphics system. The number of voltage/frequency levels and the voltage and frequency of each level are fixed; the voltage/frequency table is determined by the power management application. Once the workload predictor completes its prediction for the next frame, the voltage/frequency level for that frame is found by looking it up in the voltage/frequency table.
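The table lookup described in this abstract can be sketched as follows. This is only a minimal illustration and not the thesis's implementation: the name `VF_TABLE`, the level values, and the use of cycles per frame as the workload unit are all assumptions made for the example.

```python
from bisect import bisect_left

# Hypothetical voltage/frequency table, sorted by frequency.
# Each entry is (frequency in MHz, voltage in V); the values are illustrative.
VF_TABLE = [(100, 0.9), (200, 1.0), (300, 1.1), (400, 1.2)]


def select_level(predicted_cycles: float, frame_time_s: float) -> tuple[int, float]:
    """Return the lowest (frequency, voltage) level that can finish the
    predicted workload within one frame period."""
    required_mhz = predicted_cycles / frame_time_s / 1e6
    freqs = [f for f, _ in VF_TABLE]
    idx = bisect_left(freqs, required_mhz)
    # If the prediction exceeds the highest level, clamp to the top entry.
    idx = min(idx, len(VF_TABLE) - 1)
    return VF_TABLE[idx]
```

Because the table is fixed, an under-prediction picks a level that is too slow for the frame, which is why the accuracy of the predictor matters so much in this kind of scheme.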
In this thesis, we propose a power management scheme whose framework consists mainly of a Kalman filter, with an auxiliary fuzzy controller, to predict the workload of the next frame. This scheme remedies a shortcoming of the traditional Kalman filter, which requires the system characteristics to be known beforehand. We also propose a new concept, "delayed display", that greatly reduces the prediction miss rate without changing the predictor framework.
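For readers unfamiliar with Kalman-filter prediction, the sketch below shows a plain one-dimensional Kalman filter used as a next-frame workload predictor. It is not the hybrid design of the thesis: the random-walk workload model, the class name, and the fixed noise variances `q` and `r` are assumptions, and the fuzzy controller is not reproduced.

```python
class ScalarKalmanPredictor:
    """One-dimensional Kalman filter under an assumed random-walk model:
    w[k+1] = w[k] + process noise, z[k] = w[k] + measurement noise."""

    def __init__(self, initial_estimate: float, q: float = 1.0, r: float = 4.0):
        self.x = initial_estimate  # current workload estimate
        self.p = 1.0               # variance of the estimate error
        self.q = q                 # process noise variance (assumed constant)
        self.r = r                 # measurement noise variance (assumed constant)

    def update(self, measured_workload: float) -> float:
        """Fold in the workload measured for the frame just rendered and
        return the prediction for the next frame."""
        # Predict step: the random-walk model keeps x and inflates p.
        p_prior = self.p + self.q
        # Correct step: blend prior and measurement using the Kalman gain.
        gain = p_prior / (p_prior + self.r)
        self.x = self.x + gain * (measured_workload - self.x)
        self.p = (1.0 - gain) * p_prior
        return self.x
```

In this sketch `q` and `r` are constants that have to be chosen in advance, which is the kind of prior knowledge the abstract says the proposed scheme avoids; how the thesis achieves that is not described in the abstract.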
|
12 |
POWER REDUCTION BY DYNAMICALLY VARYING SAMPLING RATE
Datta, Srabosti (01 January 2006)
In modern digital audio applications, a continuous audio signal stream is sampled at a fixed sampling rate, which is always greater than twice the highest frequency of the input signal, to prevent aliasing. A more energy-efficient approach is to change the sampling rate dynamically based on the input signal. In the dynamic sampling rate technique, fewer samples are processed when the samples contain little high-frequency content. The perceived quality of the signal is unchanged. Processing fewer samples involves less computation, so processor speed and voltage can be reduced. This reduction in processor speed and voltage has been shown to reduce power consumption by up to 40% compared with running the audio stream at a fixed sampling rate.
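The rate-selection idea can be illustrated with a short sketch. It assumes that an estimate of the highest significant frequency in the current block is already available (for example from a spectrum analysis, which is not shown); the set of supported rates and the function names are hypothetical and not taken from the thesis.

```python
# Illustrative set of sampling rates the hardware is assumed to support, ascending.
SUPPORTED_RATES_HZ = [8000, 11025, 22050, 44100]


def choose_sampling_rate(highest_freq_hz: float) -> int:
    """Return the lowest supported rate that is strictly greater than twice
    the highest significant frequency, so the block is not aliased."""
    for rate in SUPPORTED_RATES_HZ:
        if rate > 2 * highest_freq_hz:
            return rate
    # Nothing satisfies the criterion: fall back to the maximum rate.
    return SUPPORTED_RATES_HZ[-1]


def relative_sample_count(rate_hz: int) -> float:
    """Fraction of samples processed relative to the maximum fixed rate."""
    return rate_hz / SUPPORTED_RATES_HZ[-1]
```

For a block whose content stays below 4 kHz, this sketch would select 8000 Hz, so the processor handles under a fifth of the samples it would see at 44100 Hz; that reduction in work is what allows the speed and voltage reduction the abstract describes.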
|
13 |
Energy and speed exploration in digital CMOS circuits in the near-threshold regime for very-wide voltage-frequency scaling
Stangherlin, Kleber Hugo (January 2013)
This thesis assesses the benefits and drawbacks of operating over a very wide range of frequencies and supply voltages when near-threshold operation is considered. Scaling down the supply voltage in digital CMOS circuits brings great benefits in terms of power reduction. Such scaling comes with a performance penalty: in synchronous digital circuits, for a given circuit layout, the operating frequency must be reduced as VDD is reduced. Minimum-energy operation of digital CMOS circuits is commonly associated with the sub-VT regime, which carries huge performance and variability penalties. This thesis shows that it is possible to achieve 8X higher energy efficiency with a very wide range of dynamic voltage-frequency scaling, from nominal voltage down to the lower boundary of near-VT operation. As part of this study, a CMOS digital cell library for this wide range of frequencies was developed. The cell library is exercised in a commercial 65nm PDK and targets near-VT operation, mitigating variability effects without compromising the design in terms of area and energy at strong inversion. For near-VT or sub-VT operation, cells have to be designed with few stacked transistors. Our study shows that acceptable performance in terms of static noise margins is obtained for a constrained set of cells in which at most two stacked transistors are allowed. This set includes master-slave registers. We report results for medium-complexity designs, which include a 25-kgate notch filter, a 20-kgate 8051-compatible core, and four combinational and four sequential ISCAS benchmark circuits. We study the maximum frequency attainable at each supply voltage from 150mV up to the nominal voltage (1.2V). Sub-VT operation is shown to hold the minimum-energy point at roughly 0.29V, which represents a 2X energy saving compared with the near-VT regime. Although energy efficiency peaks in sub-VT for the circuits studied, we also show that at this ultra-low VDD circuit timing and power suffer from substantially increased variability, along with a 30X performance penalty with respect to near-VT.
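As a rough intuition for why lowering the supply voltage saves energy, the sketch below evaluates the textbook first-order model in which switching energy per operation is proportional to C * V^2. This model is an assumption introduced here, not the thesis's analysis: it ignores leakage and variability, both of which matter at low voltage, so it should not be expected to reproduce the measured 8X and 2X figures.

```python
def dynamic_energy_ratio(v_high: float, v_low: float) -> float:
    """Ratio of switching energy per operation at v_high to that at v_low,
    under the first-order model E_dyn = C * V^2 with C unchanged."""
    if v_high <= 0 or v_low <= 0:
        raise ValueError("supply voltages must be positive")
    return (v_high / v_low) ** 2


# Nominal supply (1.2 V) versus the reported minimum-energy point (about 0.29 V).
ratio = dynamic_energy_ratio(1.2, 0.29)
print(f"first-order dynamic energy ratio: {ratio:.1f}x")
```

The quadratic dependence on voltage is the reason even moderate voltage reductions pay off, while the leakage energy accumulated over a longer execution time is what eventually produces a minimum-energy point.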
|
15 |
The Development of Hardware Multi-core Test-bed on Field Programmable Gate Array
Shivashanker, Mohan (24 March 2011)
The goal of this project is to develop a flexible multi-core hardware test-bed on a field programmable gate array (FPGA) that can be used to effectively validate theoretical research on multi-core computing, especially power/thermal-aware computing. Based on a commercial FPGA test platform, the Xilinx Virtex5 XUPV5 LX110T, we develop a homogeneous multi-core test-bed with four software cores, each of which can dynamically adjust its performance using software. We also enhance the operating system support for this test platform by developing hardware and software primitives that are useful for inter-process communication, synchronization, and scheduling of processes on multiple cores. An application based on matrix addition and multiplication on multiple cores is implemented to validate the applicability of the test-bed.
|
16 |
Turbulent Boundary Layers over Rough Surfaces: Large Structure Velocity Scaling and Driver Implications for Acoustic Metamaterials
Repasky, Russell James (01 July 2019)
Turbulent boundary layer and metamaterial properties were explored as a first step toward assessing the viability of controlling acoustic waves driven by pressure fluctuations from flow. A turbulent boundary layer scaling analysis was performed on zero-pressure-gradient turbulent boundary layers over rough surfaces for 30,000 ≤ Re_θ ≤ 100,000. Relationships between fluctuating pressures and velocities were explored through the pressure Poisson equation. Several scaling laws were applied in attempts to collapse velocity spectra and turbulence profiles. These analyses were performed to justify a proper scaling of the low-frequency region of the wall-pressure spectrum, whose frequencies are commonly associated with the eddies of largest length scale. This study compared three scaling methods proposed in the literature: the low-frequency classical scaling (velocity scale U_τ, length scale δ), the convection velocity scaling (U_e - Ū_c, δ), and the Zagarola-Smits scaling (U_e - Ū, δ). A default scaling (U_e, δ) was also selected as a baseline for comparison. At some level, the classical scaling best collapsed rough- and smooth-wall Reynolds stress profiles. Low-pass filtering of the scaled turbulence profiles improved the rough-wall scaling of the Zagarola-Smits and convection velocity laws. However, inconsistencies between the scaled pressure and velocity results require a more rigorous pressure Poisson analysis. The selection of a proper scaling law gives insight into turbulent boundary layers as possible sources for acoustic metamaterials. A quiescent (no-flow) experiment was conducted to measure the capability of a metamaterial to retain acoustic surface waves. A point-source speaker provided an acoustic input while the resulting sound waves were measured with a probe microphone. Acoustic surface waves were found via Fourier analysis in time and space. Standing acoustic surface waves were identified. Membrane response properties were measured to obtain source condition characteristics for turbulent boundary layers once the metamaterial is exposed to flow.
/ Master of Science / Aerodynamicists are often concerned with interactions between fluids and solids, such as an aircraft wing gliding through air. Due to frictional effects, the relative velocity of the air at the solid surface is negligible. This results in a layer of slower-moving fluid near the surface, referred to as a boundary layer. Boundary layers regularly occur at the fluid-solid interface and account for a significant amount of noise and drag on aircraft. To compensate for increases in drag, engines are required to produce more power. This leads to higher fuel consumption and increased costs. Additionally, most boundary layers in nature are turbulent, or chaotic. It is therefore difficult to predict the exact paths of air molecules as they travel within a boundary layer. Because of their intriguing physics and their impact on economic costs, turbulent boundary layers have been a popular research topic. This study analyzed air pressure and velocity measurements of turbulent boundary layers. Relationships between the two were drawn, which fostered a discussion of future work in the field. Mainly, simultaneous measurements of surface pressure and boundary layer velocity can be performed with an understanding of the pressure Poisson equation. This equation is a mathematical representation of the boundary layer pressure on the surface. This study also explored the possibility of acoustic metamaterials driven by turbulent boundary layers. Acoustic metamaterials contain hundreds of cavities that can collectively manipulate passing sound waves. A facility was developed at Virginia Tech to measure this effect, with aid from a similar laboratory at Exeter University. Microphone measurements showed a reduction of sound wave speed across the metamaterial, showing promise for acoustic manipulation. Applications of metamaterials in altering the sound caused by turbulent boundary layers were also explored and discussed.
|
17 |
Modeling and Runtime Systems for Coordinated Power-Performance Management
Li, Bo (28 January 2019)
Emergent systems in high-performance computing (HPC) require maximal efficiency to meet the Department of Energy's goal of a power budget under 20-40 megawatts for 1 exaflop. To optimize efficiency, emergent systems provide multiple power-performance control techniques to throttle different system components and the degree of concurrency. In this dissertation, we focus on three throttling techniques: CPU dynamic voltage and frequency scaling (DVFS), dynamic memory throttling (DMT), and dynamic concurrency throttling (DCT). We first conduct an empirical analysis of the performance and energy trade-offs of different architectures under the throttling techniques. We show the impact on performance and energy consumption on Intel x86 systems with Intel Xeon Phi and Nvidia general-purpose graphics processing unit (GPGPU) accelerators. We show the trade-offs and the potential for improving efficiency. Furthermore, we propose a parallel performance model for coordinating DVFS, DMT, and DCT simultaneously. We present a multivariate linear regression-based approach to approximate the impact of DVFS, DMT, and DCT on performance for performance prediction. Validation using 19 HPC applications/kernels on two architectures (Intel x86 and IBM BG/Q) shows up to 7% and 17% prediction error, respectively. Thereafter, we develop metrics for capturing the performance impact of DVFS, DMT, and DCT. We apply an artificial neural network model to approximate the nonlinear effects on performance and present a corresponding runtime control strategy for power capping. Our validation using 37 HPC applications/kernels shows up to a 20% performance improvement under a given power budget compared with the Intel RAPL-based method.
/ Ph. D. / System efficiency on high-performance computing (HPC) systems is the key to meeting the power budget goal for exascale supercomputers. Techniques for adjusting the performance of different system components can help accomplish this goal by dynamically controlling system performance according to application behavior. In this dissertation, we focus on three techniques: adjusting CPU performance, memory performance, and the number of threads for running parallel applications. First, we profile the performance and energy consumption of different HPC applications on both Intel systems with accelerators and IBM BG/Q systems. We explore the trade-offs between performance and energy under these techniques and provide optimization insights. Furthermore, we propose a parallel performance model that can accurately capture the impact of these techniques on performance in terms of job completion time. We present an approximation approach for performance prediction. The approximation has up to 7% and 17% prediction error on Intel x86 and IBM BG/Q systems, respectively, across 19 HPC applications. Thereafter, we apply the performance model in a runtime system design for improving performance under a given power budget. Our runtime strategy achieves up to a 20% performance improvement over the baseline method.
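The idea of coordinating the three knobs under a power cap can be sketched as an exhaustive search over configurations. Everything below is hypothetical: the knob values and the two toy models `predicted_time_s` and `predicted_power_w` merely stand in for the regression and neural-network models of the dissertation, which are not given in the abstract.

```python
from itertools import product

FREQS_GHZ = [1.2, 1.8, 2.4]   # DVFS levels (illustrative)
MEM_LEVELS = [0.5, 1.0]       # DMT: fraction of full memory bandwidth (illustrative)
THREADS = [4, 8, 16]          # DCT: thread counts (illustrative)


def predicted_time_s(freq: float, mem: float, threads: int) -> float:
    """Toy performance model: the compute part scales with frequency and
    thread count, the memory part with available bandwidth."""
    return 100.0 / (freq * threads) + 40.0 / mem


def predicted_power_w(freq: float, mem: float, threads: int) -> float:
    """Toy power model: a fixed base, a per-thread core term that is cubic
    in frequency (voltage assumed to track frequency), and a memory term."""
    return 20.0 + 0.5 * threads * freq ** 3 + 15.0 * mem


def best_config(power_cap_w: float):
    """Return the (freq, mem, threads) tuple with the lowest predicted time
    among configurations within the power cap, or None if none fits."""
    feasible = [c for c in product(FREQS_GHZ, MEM_LEVELS, THREADS)
                if predicted_power_w(*c) <= power_cap_w]
    if not feasible:
        return None
    return min(feasible, key=lambda c: predicted_time_s(*c))
```

With these toy numbers and a 100 W cap, the search settles on 1.8 GHz with 16 threads and full memory bandwidth instead of the top frequency with fewer threads, which illustrates why the knobs are worth tuning jointly and not one at a time.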
|
18 |
Software Controlled Clock Modulation for Energy Efficiency Optimization on Intel Processors
Schöne, Robert; Ilsche, Thomas; Bielert, Mario; Molka, Daniel; Hackenberg, Daniel (24 October 2017)
Current Intel processors implement a variety of power saving features like frequency scaling and idle states. These mechanisms limit the power draw and thereby decrease the thermal dissipation of the processors. However, they also have an impact on the achievable performance. The various mechanisms significantly differ regarding the amount of power savings, the latency of mode changes, and the associated overhead. In this paper, we describe and closely examine the so-called software controlled clock modulation mechanism for different processor generations. We present results that imply that the available documentation is not always correct and describe when this feature can be used to improve energy efficiency. We additionally compare it against the more popular feature of dynamic voltage and frequency scaling and develop a model to decide which feature should be used to optimize inter-process synchronizations on Intel Haswell-EP processors.
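To make the mechanism concrete, the sketch below computes the nominal duty cycle and effective clock rate for a clock modulation level. It assumes the documented scheme of evenly spaced duty-cycle steps (12.5% per step with 8 steps, or 6.25% with 16 in the extended mode); the function names are hypothetical, and the paper itself reports that real processors do not always behave as documented.

```python
def nominal_duty_cycle(level: int, steps: int = 8) -> float:
    """Nominal fraction of time the core clock runs at a modulation level.

    `level` ranges from 1 to steps - 1; steps=8 corresponds to 12.5%
    granularity and steps=16 to 6.25% granularity (assumed values)."""
    if not 1 <= level < steps:
        raise ValueError("level must be between 1 and steps - 1")
    return level / steps


def effective_frequency_ghz(base_ghz: float, level: int, steps: int = 8) -> float:
    """Average clock rate seen by software when the clock is gated for
    part of each modulation period."""
    return base_ghz * nominal_duty_cycle(level, steps)
```

Unlike DVFS, clock modulation leaves the supply voltage unchanged, so a first-order expectation is that it trades performance for power less favorably; the paper's measurements are what determine when it is nonetheless the better choice.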
|
19 |
Petri Net Model Based Energy Optimization Of Programs Using Dynamic Voltage And Frequency Scaling
Arun, R (06 1900)
High power dissipation and on-chip temperature limit performance and affect reliability in modern microprocessors. For servers and data centers, they determine the cooling cost, whereas for handheld and mobile systems, they limit continuous usage. For mobile systems, energy consumption affects battery life. It cannot be ignored for desktop and server systems either, as energy accounts for a growing share of organizations' budgets, influences strategic decisions, and has environmental implications that are increasingly recognized. Intelligent trade-offs involving these quantities are critical to meeting the performance demands of many modern applications.
Dynamic Voltage and Frequency Scaling (DVFS) offers a huge potential for designing trade-offs involving energy, power, temperature and performance of computing systems. In our work, we propose and evaluate DVFS schemes that aim at minimizing energy consumption while meeting a performance constraint, for both sequential and parallel applications.
We propose a Petri net based program performance model, parameterized by application properties, microarchitectural settings and system resource configuration, and use this model to find energy efficient DVFS settings. We first propose a DVFS scheme using this model for sequential programs running on single core multiple clock domain (MCD) processors, and evaluate this on an MCD processor simulator. We then extend this scheme for data parallel (Single Program Multiple Data style) applications, and then generalize it for stream applications as well, and evaluate these two schemes on a full system CMP simulator. Our experimental evaluation shows that the proposed schemes achieve significant energy savings for a small performance degradation.
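The optimization goal stated above, minimum energy subject to a performance constraint, can be sketched with a much simpler stand-in for the Petri net model. The analytic time and energy models, the settings table, and all constants below are assumptions made for illustration; none of them come from the thesis.

```python
# Hypothetical DVFS settings: (frequency in GHz, voltage in V).
SETTINGS = [(1.0, 0.80), (1.5, 0.95), (2.0, 1.10)]
C_EFF = 1.0        # effective switched capacitance in nF (assumed)
P_STATIC_W = 0.2   # static power in watts (assumed)


def exec_time_s(freq_ghz: float, compute_gcycles: float, memory_s: float) -> float:
    """Toy performance model: compute time scales with 1/f, while memory
    stall time is taken to be independent of core frequency."""
    return compute_gcycles / freq_ghz + memory_s


def energy_j(freq_ghz: float, volt: float, compute_gcycles: float, memory_s: float) -> float:
    """Dynamic energy (C * V^2 per cycle) plus static energy over the run."""
    dynamic = C_EFF * volt ** 2 * compute_gcycles
    static = P_STATIC_W * exec_time_s(freq_ghz, compute_gcycles, memory_s)
    return dynamic + static


def pick_setting(compute_gcycles: float, memory_s: float, max_slowdown: float):
    """Return the lowest-energy setting whose run time stays within
    (1 + max_slowdown) times the run time at the highest frequency."""
    fastest_ghz = max(f for f, _ in SETTINGS)
    limit = exec_time_s(fastest_ghz, compute_gcycles, memory_s) * (1.0 + max_slowdown)
    feasible = [(f, v) for f, v in SETTINGS
                if exec_time_s(f, compute_gcycles, memory_s) <= limit]
    return min(feasible, key=lambda s: energy_j(s[0], s[1], compute_gcycles, memory_s))
```

A memory-bound program loses little time at a lower frequency, so a small allowed slowdown already unlocks a lower-energy setting; capturing such program-dependent behavior is the role the performance model plays in the proposed schemes.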
|