31. I/O Aware Power Shifting
Savoie, Lee; Lowenthal, David K.; de Supinski, Bronis R.; Islam, Tanzima; Mohror, Kathryn; Rountree, Barry; Schulz, Martin. 05 1900.
Power limits on future high-performance computing (HPC) systems will constrain applications. However, HPC applications do not consume constant power over their lifetimes. Thus, applications assigned a fixed power bound may be forced to slow down during high-power computation phases, but may not consume their full power allocation during low-power I/O phases. This paper explores algorithms that leverage application semantics (phase frequency, duration, and power needs) to shift unused power from applications in I/O phases to applications in computation phases, thus improving system-wide performance. We design novel techniques that include explicit staggering of applications to improve power shifting. Compared to executing without power shifting, our algorithms can improve average performance by up to 8% or improve performance of a single, high-priority application by up to 32%.
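As a rough illustration of the shifting idea, the sketch below reallocates the headroom of I/O-phase jobs to compute-phase jobs under a fixed system bound; the job records, field names, and the greedy reallocation rule are assumptions for illustration, not the algorithms evaluated in the paper.

```python
# Hypothetical sketch: jobs in I/O phases donate the gap between their
# allocation and their current draw to jobs in compute phases that need
# more than their allocation. Field names and policy are illustrative.
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    allocation: float  # watts assigned under the fixed system power bound
    demand: float      # watts the current phase actually needs
    phase: str         # "compute" or "io"

def shift_power(jobs):
    # Headroom freed by jobs currently in low-power I/O phases.
    spare = sum(max(0.0, j.allocation - j.demand) for j in jobs if j.phase == "io")
    grants = {j.name: float(j.allocation) for j in jobs}
    for j in jobs:
        if j.phase == "compute" and j.demand > j.allocation:
            extra = min(spare, j.demand - j.allocation)
            grants[j.name] += extra
            spare -= extra
    return grants

jobs = [Job("A", 100, 40, "io"), Job("B", 100, 150, "compute")]
print(shift_power(jobs))  # B's grant rises from 100 W to 150 W using A's idle headroom
```

The paper's staggering techniques go further by deliberately offsetting application phases so that such headroom is available more often.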
32. An Efficient Power-Load-Aware Routing Protocol (POLAR) for Wireless Mesh Networks
Wu, Yao-Hsien. Unknown date.
Wireless mesh networks (WMNs) have emerged to reduce the backhaul cost behind wireless access points and to address the coverage limitations of ad hoc networks. Mesh nodes in WMNs and mobile hosts in ad hoc networks have very different power-consumption and load requirements, so routing protocols designed for ad hoc networks cannot be applied directly to WMNs.
Power-aware routing protocols such as MMBCR (Min-Max Battery Cost Routing) and load-aware protocols such as CSLAR (Contention Sensitive Load Aware Routing), both designed for pure ad hoc networks, do not account for the differing characteristics of the components in a WMN. We therefore focus on hybrid WMNs and propose POLAR, a routing algorithm that considers both power and load at mesh clients and mesh routers. Experimental results show that our routing protocol improves overall network performance and extends network lifetime.
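A toy sketch of how a route metric might jointly weigh power and load, in the spirit of combining an MMBCR-style battery cost with a load term; the weighting, field names, and the specific cost function are assumptions for illustration and are not POLAR's published metric.

```python
# Toy joint power/load route cost: penalize paths whose weakest node has
# little residual battery (the min-max battery idea behind MMBCR) and paths
# whose nodes are heavily loaded. Weights and fields are illustrative only.
def route_cost(path, alpha=0.5):
    # path: list of nodes, each {"battery": fraction remaining, "load": queued packets}
    battery_cost = max(1.0 / max(n["battery"], 1e-6) for n in path)
    load_cost = sum(n["load"] for n in path)
    return alpha * battery_cost + (1 - alpha) * load_cost

def best_route(candidate_paths, alpha=0.5):
    return min(candidate_paths, key=lambda p: route_cost(p, alpha))

paths = [
    [{"battery": 0.9, "load": 3}, {"battery": 0.2, "load": 1}],  # one nearly drained node
    [{"battery": 0.6, "load": 2}, {"battery": 0.7, "load": 2}],  # balanced batteries
]
print(best_route(paths))  # picks the balanced path despite a similar total load
```

In a hybrid WMN such a weighting would naturally be biased differently for mains-powered mesh routers than for battery-powered mesh clients, which is the asymmetry the thesis targets.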
33. Designing Energy-Aware Optimization Techniques through Program Behaviour Analysis
Kommaraju, Ananda Varadhan. January 2014.
Green computing techniques aim to reduce the power footprint of modern embedded devices, with particular emphasis on processors, the power hot-spots of these devices. In this thesis we propose compiler-driven and profile-driven optimizations that reduce power consumption in a modern embedded processor. We show that these optimizations reduce power consumption in functional units and memory subsystems with very low performance loss. We present three new techniques to reduce power consumption in processors, namely transition aware scheduling, leakage reduction in data caches using criticality analysis, and dynamic power reduction in data caches using locality analysis of data regions.
A novel instruction scheduling technique is proposed to address leakage power consumption in functional units. This technique, transition aware scheduling, is motivated by the idle periods that arise in the utilization of functional units during program execution. A sufficiently long contiguous idle period in a functional unit can be exploited to place the unit in a low-power state. The scheduling algorithm increases the duration of idle periods without hampering performance and drives power gating during these periods. A power model parameterized by idle cycles shows that this technique saves up to 25% of leakage power with very low performance impact.
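A minimal sketch of why merging idle cycles matters, assuming a hypothetical break-even threshold below which power gating a unit is not worthwhile; the threshold and overhead values are illustrative, not the thesis's power model.

```python
# Illustrative leakage model: a functional unit can only be power-gated
# during idle windows longer than a break-even length, and each gating
# event costs some wake-up overhead. Merging short idle windows into one
# long window (the goal of transition aware scheduling) saves more leakage.
def gated_cycles(idle_windows, break_even=10, wake_overhead=2):
    saved = 0
    for w in idle_windows:
        if w > break_even:
            saved += w - wake_overhead  # cycles actually spent in the low-power state
    return saved

print(gated_cycles([4, 4, 4, 4]))  # 0  -- the same 16 idle cycles, but too fragmented to gate
print(gated_cycles([16]))          # 14 -- one merged window, most of it gated
```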
In modern embedded programs, data regions can be classified as critical or non-critical; critical data regions significantly impact performance. A new technique to identify such data regions through profiling is proposed. This technique, together with a new criticality-based cache policy, is used to control the power state of the data cache. The scheme allocates non-critical data regions to low-power cache regions, thereby reducing leakage power consumption by up to 40% without compromising performance.
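The placement decision itself can be sketched as below, assuming a hypothetical profile that scores each data region by its contribution to miss penalty; the region names, the median cut-off, and the two power states are illustrative assumptions rather than the thesis's actual policy.

```python
# Hypothetical criticality-driven placement: regions whose profiled cost is
# above a cut-off stay in full-power cache regions, the rest go to low-power
# (drowsy) regions. Profile format and threshold choice are illustrative.
def place_regions(profile):
    # profile: {region_name: profiled contribution to miss penalty}
    threshold = sorted(profile.values())[len(profile) // 2]  # median as a simple cut-off
    return {region: ("full_power" if cost >= threshold else "drowsy")
            for region, cost in profile.items()}

print(place_regions({"hot_loop_buffer": 90, "lookup_table": 60, "log_buffer": 5, "scratch": 2}))
```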
This profiling technique is extended to identify data regions with low locality, since some data regions exhibit high data reuse while others do not. A locality-based cache policy that accounts for cache parameters such as size and associativity is proposed. The scheme reduces both dynamic and static power consumption in the cache subsystem, cutting total data-cache power consumption by 25% without increasing execution time.
In this thesis, the problem of a program's power consumption is decoupled from the number of processor cores. The underlying architecture model is simplified to abstract away a variety of processor scenarios, and this simplified model can be scaled up for implementation in various multi-core architectures such as Chip Multi-Processors, Simultaneous Multi-Threaded Processors, and Chip Multi-Threaded Processors.
The three techniques proposed in this thesis leverage underlying hardware features like low power functional units, drowsy caches and split data caches. These techniques reduce power consumption of a wide range of benchmarks with low performance loss.
34. Ensemble Stream Model for Data-Cleaning in Sensor Networks
Iyer, Vasanth. 16 October 2013.
Ensemble stream modeling and data cleaning are sensor information processing systems with different training and testing methods by which their goals are cross-validated. This research examines a mechanism that seeks to extract novel patterns by generating ensembles from data. The main goal of label-less stream processing is to process the sensed events so as to eliminate uncorrelated noise and to choose the most likely model without overfitting, thus obtaining higher model confidence. Higher-quality streams can be realized by combining many short streams into an ensemble with the desired quality. The framework for the investigation is an existing data mining tool.
First, to support feature extraction for events such as bush or natural forest fires, we take the burnt area (BA*), the sensed ground truth obtained from logs, as our target variable. Although this is an obvious model choice, the results are disappointing, for two reasons: first, the histogram of fire activity is highly skewed; second, the measured sensor parameters are highly correlated. Since non-descriptive features do not yield good results, we resort to temporal features. By doing so we carefully eliminate averaging effects; the resulting histogram is more satisfactory, and conceptual knowledge is learned from the sensor streams.
Second is the process of feature induction, cross-validating attributes against single or multi-target variables to minimize training error. We use the F-measure, which combines precision and recall, to assess the false-alarm rate of fire events. The multi-target data-cleaning trees use the information purity of the target leaf nodes to learn higher-order features. A variance-sensitive measure such as the F-test is applied at each node split to select the best attribute. The ensemble stream model approach proved to improve when complicated features were combined with a simpler tree classifier.
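For reference, the F-measure used above is the harmonic mean of precision and recall; the counts in the short sketch below are made-up numbers for illustration.

```python
# F-measure as a single score over detected fire events.
def f_measure(tp, fp, fn, beta=1.0):
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# e.g. 40 true detections, 10 false alarms, 5 missed events (illustrative counts)
print(round(f_measure(tp=40, fp=10, fn=5), 3))  # ~0.842
```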
The ensemble framework for data cleaning, together with enhancements that quantify quality of fitness (30% spatial, 10% temporal, and 90% mobility reduction) of the sensors, led to the formation of streams for sensor-enabled applications. This further motivates the novelty of stream-quality labeling and its importance in handling the vast amount of real-time mobile streams generated today.
35. Power Issues in SoCs: Power Aware DFT Architecture and Power Estimation
Tudu, Jaynarayan Thakurdas. January 2016.
Test power, data volume, and test time have been long-standing problems for sequential scan-based testing of system-on-chip (SoC) designs. Modern SoCs fabricated at lower technology nodes are complex; for some microprocessors the transistor count runs into the billions, and design complexity is projected to increase further in the coming years in accordance with Moore's law. The larger gate count and the integration of multiple functionalities drive up test power dissipation, test time, and data volume. The dynamic power dissipated during scan testing, i.e. during scan shift, launch, and response capture, is a major concern for reliable as well as cost-effective testing. Excessive average power dissipation leads to thermal problems that can burn out the chip during testing, while excessive peak power causes test failures due to power-induced additional delay; such test failures directly impact yield. The test power problem is even more serious in modern 3D stacked ICs. Estimating the worst-case functional power dissipation is yet another major challenge; it is needed because it provides an upper bound on functional power dissipation, which can then be used to determine a safe power zone for testing.
Several solutions have been proposed in the past to address these issues. This thesis makes three major contributions: 1) sequential scan chain reordering, 2) JScan, an alternative joint-scan DFT architecture that primarily addresses test power along with test time and data volume, and 3) an integer linear programming methodology for the power estimation problem. To reduce test power during shift, we propose a graph-theoretic formulation for scan chain reordering and for optimal scan shift operation, together with a set of algorithms for each formulation. Experimental results on ISCAS-89 benchmark circuits show reductions of around 25% in peak power and 15% in scan shift time.
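To give a feel for the reordering idea, the toy sketch below treats each scan flip-flop as a graph node, weights node pairs by the Hamming distance between the bit columns they carry across the test set, and builds a chain with a greedy nearest-neighbour walk; the greedy heuristic, the distance metric, and the data format are assumptions for illustration, not the thesis's actual formulation or algorithms.

```python
# Toy scan chain reordering: adjacent cells that carry similar bit columns
# across the test patterns toggle less during shift, so build the chain by
# repeatedly appending the closest remaining flip-flop.
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def reorder_chain(columns):
    # columns: {flip_flop_name: bit string of its values over all test patterns}
    remaining = dict(columns)
    name, bits = remaining.popitem()  # arbitrary starting cell
    order = [name]
    while remaining:
        # pick the remaining flip-flop closest to the current chain tail
        name, bits = min(remaining.items(), key=lambda kv: hamming(bits, kv[1]))
        order.append(name)
        del remaining[name]
    return order

print(reorder_chain({"f0": "0101", "f1": "0100", "f2": "1010", "f3": "1011"}))
```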
To obtain a holistic DFT architecture that can address the test power, test time, and data volume problems together, a new DFT architecture called Joint-scan (JScan) has been developed. In JScan we integrate the serial and random access scan architectures in a systematic way so that JScan can harness the respective advantages of each. The serial scan architecture suffers from test power, test time, and data volume problems, but it is simple in its functionality and cost-effective in terms of DFT circuitry. The random access scan architecture is the opposite: it is power efficient and requires less test time and data volume than serial scan, but it occupies a larger DFT area and introduces routing congestion. We therefore propose a methodology to realize the JScan architecture as an efficient alternative to standard serial and random access scan. Further, the JScan architecture is optimized, resulting in a 2-Mode 2M-Jscan Joint-scan architecture. The proposed architectures are experimentally verified on larger benchmark circuits and compared with existing state-of-the-art DFT architectures. The results show reductions of 50% to 80% in test power and 30% to 50% in test time and data volume. The proposed architectures are also evaluated for routing-area minimization, where we obtain a saving of around 7% to 15% of chip area.
Since estimating the worst-case functional power is a challenging problem, we propose a binary integer linear programming (BILP) based methodology. Two formulations are developed, one for each of two delay models: zero-delay and unit-delay. The methodology generates a pair of input vectors that toggles the circuit so as to dissipate the worst-case power. The BILP problems are solved with the CPLEX solver for ISCAS-85 combinational benchmark circuits. For some of the circuits, the methodology finds the worst possible power dissipation, i.e. 80% to 100% of nets toggling.
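As a small intuition pump for the zero-delay objective, the brute-force sketch below evaluates a tiny gate-level netlist for every pair of input vectors and keeps the pair that toggles the most nets; the netlist, gate set, and exhaustive search are illustrative assumptions, whereas the thesis encodes the same objective symbolically as a BILP and solves it with CPLEX.

```python
# Brute-force stand-in for the zero-delay formulation's objective: maximize
# the number of nets that change value between two consecutive input vectors.
from itertools import product

def eval_nets(netlist, inputs):
    # netlist: list of (net_name, gate, fan_in_names), in topological order
    vals = dict(inputs)
    for net, gate, fanin in netlist:
        ins = [vals[f] for f in fanin]
        if gate == "AND":
            vals[net] = int(all(ins))
        elif gate == "OR":
            vals[net] = int(any(ins))
        elif gate == "NOT":
            vals[net] = 1 - ins[0]
    return vals

def worst_toggle_pair(netlist, input_names):
    best = (None, None, -1)
    for v1 in product([0, 1], repeat=len(input_names)):
        for v2 in product([0, 1], repeat=len(input_names)):
            a = eval_nets(netlist, dict(zip(input_names, v1)))
            b = eval_nets(netlist, dict(zip(input_names, v2)))
            toggles = sum(a[n] != b[n] for n in a)  # count toggled nets, inputs included
            if toggles > best[2]:
                best = (v1, v2, toggles)
    return best

netlist = [("n1", "AND", ["a", "b"]), ("n2", "NOT", ["n1"]), ("y", "OR", ["n2", "c"])]
print(worst_toggle_pair(netlist, ["a", "b", "c"]))
```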