Global ETD Search

1	Practical Dynamic Thermal Management on Intel Desktop Computer Liu, Guanglei 12 July 2012 Fueled by the increasing human appetite for high computing performance, semiconductor technology has now marched into the deep sub-micron era. As transistor sizes keep shrinking, more and more transistors are integrated into a single chip. This has tremendously increased the power consumption and heat generation of IC chips. The rapidly growing heat dissipation greatly increases packaging and cooling costs and adversely affects the performance and reliability of a computing system. It also reduces the processor's life span and may even crash the entire computing system. Therefore, dynamic thermal management (DTM) is becoming a critical problem in modern computer system design. Extensive theoretical research has been conducted to study the DTM problem. However, most of it is based on theoretically idealized assumptions or simplified models. While these models and assumptions greatly simplify a complex problem and make it theoretically manageable, practical computer systems and applications must deal with many factors and details beyond them. The goal of our research was to develop a test platform that can be used to validate theoretical results on DTM under well-controlled conditions, to identify the limitations of existing theoretical results, and to develop new and practical DTM techniques. This dissertation details the background and our research efforts in this endeavor. Specifically, we first developed a customized test platform based on an Intel desktop. We then tested a number of related theoretical works and examined their limitations in a practical hardware environment. With these limitations in mind, we developed a new reactive thermal management algorithm for single-core computing systems to optimize the throughput under a peak temperature constraint. 
We further extended our research to a multicore platform and developed an effective proactive DTM technique for throughput maximization on multicore processors, based on task migration and dynamic voltage and frequency scaling. The significance of our research lies in the fact that it complements the extensive theoretical research on increasingly critical thermal problems and helps enable the continued evolution of high-performance computing systems. Dynamic thermal management thermal-aware scheduling throughput maximization practical hardware platform temperature
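The abstract above names a reactive thermal management algorithm that optimizes throughput under a peak temperature constraint, but it gives no details. The following is only a minimal sketch of temperature-feedback throttling, not the dissertation's algorithm. The first-order thermal model, the constants (`T_AMBIENT`, `T_MAX`, `ALPHA`), the frequency levels, and all function names are assumptions chosen for illustration.

```python
from typing import List, Tuple

# All values below are illustrative assumptions, not measured data.
T_AMBIENT = 35.0              # ambient temperature (deg C)
T_MAX = 70.0                  # peak temperature constraint (deg C)
ALPHA = 0.1                   # fraction of the gap to steady state closed per step
FREQS_GHZ = [1.0, 2.0, 3.0]   # available frequency levels


def steady_state_temp(freq: float) -> float:
    """Assumed steady-state temperature: power grows with the cube of frequency."""
    return T_AMBIENT + 2.0 * freq ** 3


def next_temp(temp: float, freq: float) -> float:
    """First-order thermal model: move a fraction ALPHA toward steady state."""
    return temp + ALPHA * (steady_state_temp(freq) - temp)


def pick_frequency(temp: float) -> float:
    """Pick the fastest level whose next-step temperature stays within T_MAX."""
    for freq in sorted(FREQS_GHZ, reverse=True):
        if next_temp(temp, freq) <= T_MAX:
            return freq
    return min(FREQS_GHZ)  # fall back to the slowest level


def simulate(steps: int, temp: float = T_AMBIENT) -> List[Tuple[float, float]]:
    """Run the feedback loop; return (frequency, temperature) per step."""
    trace = []
    for _ in range(steps):
        freq = pick_frequency(temp)
        temp = next_temp(temp, freq)
        trace.append((freq, temp))
    return trace


if __name__ == "__main__":
    trace = simulate(200)
    mean_freq = sum(f for f, _ in trace) / len(trace)
    print(f"peak temperature: {max(t for _, t in trace):.2f}")
    print(f"mean frequency (throughput proxy): {mean_freq:.2f} GHz")
```

In this sketch the mean frequency serves as a crude throughput proxy. A real platform would read on-die sensors and would have to cope with sensor noise, workload-dependent power, and transition overheads, which is the kind of practical detail the dissertation says idealized models omit.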
2	Capacity and Throughput Optimization in Multi-cell 3G WCDMA Networks Nguyen, Son 12 1900 User modeling enables the computation of the traffic density in a cellular network, which can be used to optimize the placement of base stations and radio network controllers and to analyze the performance of resource management algorithms toward the final goal: the calculation and maximization of network capacity and throughput for different data rate services. An analytical model is presented for approximating the user distributions in multi-cell third generation wideband code division multiple access (WCDMA) networks using 2-dimensional Gaussian distributions by determining the means and the standard deviations of the distributions for every cell. This model allows for the calculation of the inter-cell interference and the reverse-link capacity of the network. An analytical model for optimizing capacity in multi-cell WCDMA networks is presented. Capacity is optimized for different spreading factors and for perfect and imperfect power control. Numerical results show that the SIR threshold for the received signals is decreased by 0.5 to 1.5 dB due to imperfect power control. The results also show that the determined parameters of the 2-dimensional Gaussian model match well with traditional methods for modeling user distribution. A call admission control algorithm is designed that maximizes the throughput in multi-cell WCDMA networks. Numerical results are presented for different spreading factors and for several mobility scenarios. Our methods of optimizing capacity and throughput are computationally efficient and accurate, and can be implemented in large WCDMA networks. user modeling capacity throughput maximization in WCDMA
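To make the user-modeling idea in this abstract concrete, the sketch below draws user positions for each cell from a 2-dimensional Gaussian and then recovers the per-cell means and standard deviations from the samples. It is an illustration only: the independent x and y axes, the cell layout, and the names `sample_users` and `estimate_params` are assumptions, not the thesis's model or code.

```python
import math
import random
from typing import Dict, List, Tuple

Point = Tuple[float, float]
# Per-cell parameters: (mean_x, mean_y, std_x, std_y); axes assumed independent.
CellParams = Tuple[float, float, float, float]


def sample_users(cells: Dict[str, CellParams], users_per_cell: int,
                 seed: int = 0) -> Dict[str, List[Point]]:
    """Draw user positions for every cell from its 2-D Gaussian."""
    rng = random.Random(seed)
    users = {}
    for name, (mx, my, sx, sy) in cells.items():
        users[name] = [(rng.gauss(mx, sx), rng.gauss(my, sy))
                       for _ in range(users_per_cell)]
    return users


def estimate_params(points: List[Point]) -> CellParams:
    """Estimate the means and standard deviations of one cell's users."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in points) / n)
    sy = math.sqrt(sum((y - my) ** 2 for _, y in points) / n)
    return mx, my, sx, sy


if __name__ == "__main__":
    # Hypothetical two-cell layout, distances in kilometres.
    layout = {"cell_0": (0.0, 0.0, 1.0, 2.0), "cell_1": (5.0, 0.0, 1.5, 1.5)}
    for name, pts in sample_users(layout, 2000, seed=1).items():
        print(name, tuple(round(v, 2) for v in estimate_params(pts)))
```

Fitted parameters of this kind are what the abstract describes feeding into the inter-cell interference and capacity calculations.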
3	Performance Limits of Communication with Energy Harvesting Znaidi, Mohamed Ridha 04 1900 In energy harvesting communications, transmitters have to adapt their transmission to the availability of energy harvested during communication. The performance of the transmission depends on the channel conditions, which vary randomly due to mobility and environmental changes. In this work, we consider the problem of power allocation, taking into account the energy arrivals over time and the quality of the channel state information (CSI) available at the transmitter, in order to maximize the throughput. Unlike previous work, the CSI at the transmitter is not perfect and may include estimation errors. We solve this problem subject to the energy harvesting constraints. Assuming perfect knowledge of the CSI at the receiver, we determine the optimal power policy for different models of the energy arrival process (offline and online). In particular, we obtain the power allocation scheme when the transmitter has either perfect CSI or no CSI. Of particular interest, we also investigate the case of fading channels with imperfect CSI. Moreover, we study the asymptotic behavior of the communication system. Specifically, we analyze the average throughput when the average recharge rate goes asymptotically to zero and when it is very high. Average recharge rate Asymptotic average throughput Channel state information Channel estimation Energy harvesting Throughput maximization
4	Memory-aware algorithms : from multicores to large scale platforms Jacquelin, Mathias 20 July 2011 This thesis focuses on memory-aware algorithms tailored for hierarchical memory architectures, found for instance within multicore processors. We first study the matrix product on multicore architectures. We model such a processor, and derive lower bounds on the communication volume. We introduce three ad hoc algorithms, and experimentally assess their performance. We then target a more complex operation: the QR factorization of tall matrices. We revisit existing algorithms to better exploit the parallelism of multicore processors. We thus study the critical paths of many algorithms, prove some of them to be asymptotically optimal, and assess their performance. In the next study, we focus on scheduling streaming applications onto a heterogeneous multicore platform, the QS 22. We introduce a model of the platform and use steady-state scheduling techniques so as to maximize the throughput. We present a mixed integer programming approach that computes an optimal solution, and propose simpler heuristics. We then focus on minimizing the amount of required memory for tree-shaped workflows, and target a classical two-level memory system. I/O operations represent transfers from one memory to the other. We propose a new exact algorithm, and show that there exist trees where postorder traversals are arbitrarily bad. We then study the problem of minimizing the I/O volume for a given memory, show that it is NP-hard, and provide a set of heuristics. Finally, we compare archival policies for BLUE WATERS. We introduce two archival policies and adapt the well-known RAIT strategy. We provide a model of the tape storage platform, and use it to assess the performance of the three policies through simulation. 
[INFO:INFO_OH] Computer Science/Other Memory hierarchy Scheduling Steady-state Heterogeneous platforms Heuristics Optimization Linear algebra Throughput maximization Memory constraints Multicore
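The memory-minimization problem for tree-shaped workflows mentioned in this abstract can be illustrated with a small sketch that computes the peak memory of a postorder traversal. The memory model is an assumption made here for illustration (each task holds the outputs of all its children plus its own output while it runs), and `Node` and `postorder_peak` are hypothetical names. The code covers postorder traversals only and is not the thesis's exact algorithm.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    out: int                                   # size of the file this task produces
    children: List["Node"] = field(default_factory=list)


def postorder_peak(node: Node, sort_children: bool = False) -> int:
    """Peak memory of a postorder traversal under the assumed model.

    With sort_children=True, subtrees are visited by decreasing
    (subtree peak - subtree output), which an exchange argument shows
    is the best child order for a postorder traversal in this model.
    """
    peaks = [(postorder_peak(c, sort_children), c.out) for c in node.children]
    if sort_children:
        peaks.sort(key=lambda p: p[0] - p[1], reverse=True)
    held = 0   # outputs of already-finished children kept in memory
    peak = 0
    for child_peak, child_out in peaks:
        peak = max(peak, held + child_peak)    # while this subtree runs
        held += child_out                      # its output stays resident
    return max(peak, held + node.out)          # while the node itself runs


if __name__ == "__main__":
    a = Node(1, [Node(4)])         # subtree with peak 5 and output 1
    b = Node(3)                    # leaf with peak 3 and output 3
    root = Node(1, [b, a])
    print(postorder_peak(root))                      # children in given order
    print(postorder_peak(root, sort_children=True))  # children reordered
```

On this toy tree the given child order needs 8 units of memory while the reordered traversal needs 5. Even the best postorder can be far from the best general traversal, which is the gap the abstract refers to when it says postorder traversals can be arbitrarily bad.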
5	Distributed Cooperative Communications and Wireless Power Transfer Wang, Rui 22 February 2018 In telecommunications, distributed cooperative communications refer to techniques which allow different users in a wireless network to share or combine their information in order to increase diversity gain or power gain. Unlike conventional point-to-point communications, which maximize the performance of the individual link, distributed cooperative communications enable multiple users to collaborate with each other to achieve an overall improvement in performance, e.g., improved range and data rates. The first part of this dissertation focuses on the problem of jointly decoding binary messages from a single distant transmitter to a cooperative receive cluster. The outage probability of distributed reception with binary hard decision exchanges is compared with the outage probability of ideal receive beamforming with unquantized observation exchanges. Low-dimensional analysis and numerical results show, via two simple but surprisingly good approximations, that the outage probability performance of distributed reception with hard decision exchanges is well-predicted by the SNR of ideal receive beamforming after subtracting a hard decision penalty of slightly less than 2 dB. These results, developed in non-asymptotic regimes, are consistent with prior asymptotic results (for a large number of nodes and low per-node SNR) on hard decisions in binary communication systems. We next consider the problem of estimating and tracking channels in a distributed transmission system with multiple transmitters and multiple receivers. In order to track and predict the effective channel between each transmit node and each receive node to facilitate coherent transmission, a linear time-invariant state-space model is developed and is shown to be observable but nonstabilizable. 
To quantify the steady-state performance of a Kalman filter channel tracker, two methods are developed to efficiently compute the steady-state prediction covariance. An asymptotic analysis is also presented for the homogeneous oscillator case for systems with a large number of transmit and receive nodes, with closed-form results for all of the elements in the asymptotic prediction covariance as a function of the carrier frequency, oscillator parameters, and channel measurement period. Numerical results confirm the analysis and demonstrate the effect of the oscillator parameters on the ability of the distributed transmission system to achieve coherent transmission. In recent years, the development of efficient radio frequency (RF) radiation wireless power transfer (WPT) systems has become an active research area, motivated by the widespread use of low-power devices that can be charged wirelessly. In this dissertation, we next consider a time division multiple access scenario where a wireless access point transmits to a group of users which harvest the energy and then use this energy to transmit back to the access point. Past approaches have found the optimal time allocation to maximize sum throughput under the assumption that the users must use all of their harvested power in each block of the "harvest-then-transmit" protocol. This dissertation considers optimal time and energy allocation to maximize the sum throughput for the case when the nodes can save energy for later blocks. To maximize the sum throughput over a finite horizon, the initial optimization problem is separated into two sub-problems and can finally be formulated as a standard box-constrained optimization problem, which can be solved efficiently. A tight upper bound is derived by relaxing the energy harvesting causality. A disadvantage of RF-radiation based WPT is that path loss effects can significantly reduce the amount of power received by energy harvesting devices. 
To overcome this problem, recent investigations have considered the use of distributed transmit beamforming (DTB) in wireless communication systems where two or more individual transmit nodes pool their antenna resources to emulate a virtual antenna array. To exploit the advantages of DTB in WPT, this dissertation studies the optimization of the feedback rate to maximize the energy efficiency of the WPT system. Since periodic feedback improves the beamforming gain but requires the receivers to expend energy, there is a fundamental tradeoff between the feedback period and the efficiency of the WPT system. We develop a new model to combine WPT and DTB and explicitly account for independent oscillator dynamics and the cost of feedback energy from the receive nodes. We then formulate a "Normalized Weighted Mean Energy Harvesting Rate" (NWMEHR) maximization problem to select the feedback period that maximizes the weighted average amount of net energy harvested by the receive nodes per unit of time as a function of the oscillator parameters. We develop an explicit method to numerically calculate the globally optimal feedback period. outage probability distributed communication systems channel prediction energy harvesting oscillator dynamics throughput maximization wireless power transfer discrete-time algebraic Riccati equation synchronization channel state feedback
6	Modelling and Analysis of Interconnects for Deep Submicron Systems-on-Chip Pamunuwa, Dinesh January 2003 The last few decades have been a very exciting period in the development of micro-electronics and brought us to the brink of implementing entire systems on a single chip, on a hitherto unimagined scale. However an unforeseen challenge has cropped up in the form of managing wires, which have become the main bottleneck in performance, masking the blinding speed of active devices. A major problem is that increasingly complicated effects need to be modelled, but the computational complexity of any proposed model needs to be low enough to allow many iterations in a design cycle. This thesis addresses the issue of closed form modelling of the response of coupled interconnect systems. Following a strict mathematical approach, second order models for the transfer functions of coupled RC trees based on the first and second moments of the impulse response are developed. The 2-pole-1-zero transfer function that is the best possible from the available information is obtained for the signal path from each driver to the output in multiple aggressor systems. This allows the complete response to be estimated accurately by summing up the individual waveforms. The model represents the minimum complexity for a 2-pole-1-zero estimate, for this class of circuits. Also proposed are new techniques for the optimisation of wires in on-chip buses. Rather than minimising the delay over each individual wire, the configuration that maximises the total bandwidth over a number of parallel wires is investigated. It is shown from simulations that there is a unique optimal solution which does not necessarily translate to the maximum possible number of wires, and in fact deviates considerably from it when the resources available for repeaters are limited. Analytic guidelines dependent only on process parameters are derived for optimal sizing of wires and repeaters. 
Finally regular tiled architectures with a common communication backplane are being proposed as being the most efficient way to implement systems-on-chip in the deep submicron regime. This thesis also considers the feasibility of implementing a regular packet-switched network-on-chip in a typical future deep submicron technology. All major physical issues and challenges are discussed for two different architectures and important limitations are identified. cross-talk interconnect modelling timing analysis transfer function on-chip bus bandwidth maximization throughput maximization repeater insertion wire optimization
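As a rough illustration of the model form named in this abstract (not the thesis's derivation), a 2-pole-1-zero transfer function can be written and matched to the first two moments of the impulse response as follows. The symbols a_1, b_1, b_2 and m_k, the unit DC gain, and the convention that m_k is the coefficient of s^k in the series expansion of H(s) are all assumptions made here.

```latex
H(s) \;=\; \frac{1 + a_1 s}{1 + b_1 s + b_2 s^2}
     \;=\; 1 + m_1 s + m_2 s^2 + \mathcal{O}(s^3),
\qquad
m_1 = a_1 - b_1, \qquad
m_2 = b_1^2 - b_2 - a_1 b_1 .
```

Two moments give only two equations for the three coefficients, so one further condition is needed to fix the model. The abstract does not say which condition the thesis uses, so none is assumed here.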
7	Finite-horizon Online Energy-efficient Transmission Scheduling Schemes for Communication Links Bacinoglu, Tan Baran 01 January 2013 The proliferation of embedded systems, mobile devices, and wireless sensor applications, together with the increasing global demand for energy, has directed research attention toward self-sustainable and environmentally friendly systems. In the field of communications, this new trend has pointed out the need to study energy-constrained communication and networking. In particular, energy-efficient transmission schemes have been well studied in the literature for various cases. However, fundamental results have been obtained mostly for offline problems, which are not applicable to practical implementations. In contrast, this thesis focuses on online counterparts of offline transmission scheduling problems and provides a theoretical background for energy-efficient online transmission schemes. The proposed heuristics, the Expected Threshold and Expected Water Level policies, promise an adequate solution which can adapt to short-time-scale dynamics while being computationally efficient.
9	Allocation de ressource et analyse des critères de performance dans les réseaux cellulaires coopératifs / Resource allocation and performance metrics analysis in cooperative cellular networks Maaz, Mohamad 03 December 2013 In wireless communication systems, transmitting large amounts of information and doing so at low energy cost are the two main issues that have continually attracted the attention of the scientific community over the past decade. It has recently been shown that cooperative communication is an attractive technique, notably because it exploits spatial diversity in the wireless channel. This technique provides robust and reliable communication and better quality of service (QoS), and makes the cooperation concept promising for future generations of cellular systems. Typically, the QoS metrics are the packet error rate, the throughput, and the delay. These metrics are affected by the delay induced by the Hybrid-Automatic Repeat-Request (HARQ) retransmission mechanisms that follow the reception of an erroneous packet, which introduces a delay with respect to the requested QoS. On the other hand, HARQ mechanisms create temporal diversity. Consequently, jointly adopting cooperative communication and HARQ protocols could prove advantageous for the design of cross-layer schemes. We first propose a strategy for maximizing the total throughput in a heterogeneous cellular network. We introduce an algorithm that allocates the optimal power to the base station (BS) and to the relays, and that optimally assigns subcarriers and relays to each user. We compute the maximum achievable throughput as well as the fraction of users left without resources in the network as the number of active users varies. 
We compare the performance of our algorithm with that of algorithms from the existing literature and show that a significant gain in overall capacity is achieved. Second, we theoretically analyze the packet error rate, the delay, and the throughput efficiency of cooperative HARQ networks in the block-fading channel. In slow-fading channels, the average delay of the HARQ mechanism is not relevant because the process is non-ergodic. We therefore focus instead on the delay outage probability in the presence of slow fading. The delay outage probability is of primary importance for delay-sensitive applications. We propose an analytical form of the outage probability that makes long simulations unnecessary. Next, we theoretically analyze the energy efficiency (bits/joule) of cooperative HARQ networks. We then solve an energy minimization problem in downlink cooperative networks. In this problem, each user has an average delay constraint to satisfy, subject to the constraint on the total system power being respected. The minimization algorithm assigns to each user the optimal relay station and its power, as well as the optimal BS power, in order to satisfy the delay constraints. The simulations show that, in terms of energy consumption, relay-assisted techniques clearly outperform direct transmissions in any delay-limited system. In conclusion, the work proposed in this thesis may help establish reliable rules for the engineering and design of future generations of energy-efficient cellular systems. 
/ In wireless systems, transmitting large amounts of information and doing so at low energy cost are two main issues that have never stopped drawing the attention of the scientific community during the past decade. Recently, it has been shown that cooperative communication is an appealing technique that exploits spatial diversity in the wireless channel. This technique therefore promises robust and reliable communication and higher quality of service (QoS), and makes the cooperation concept attractive for future cellular systems. Typically, the QoS requirements are the packet error rate, throughput and delay. These metrics are affected by the delay: each erroneous packet is retransmitted, possibly several times, according to the Hybrid-Automatic Repeat-Request (HARQ) mechanism, which induces a delay on the demanded QoS but also creates temporal diversity. Therefore, jointly adopting cooperative communications and HARQ mechanisms could be beneficial for designing cross-layer schemes. First, a new rate maximization strategy under heterogeneous data rate constraints among users is proposed. We propose an algorithm that allocates the optimal power at the base station (BS) and relays, assigns subcarriers and selects relays. The achievable data rate is investigated, as well as the average starvation rate in the network as the load, i.e. the number of active users in the network, increases. The algorithm shows a significant gain in global capacity compared with the literature. Second, in the block fading channel, theoretical analyses of the packet error rate, delay and throughput efficiency in relay-assisted HARQ networks are provided. In slow fading channels, the average delay of HARQ mechanisms w.r.t. the fading states is not relevant due to the non-ergodic process of the fading channel. The delay outage is hence invoked to deal with the slow fading channel and is defined as the probability that the average delay w.r.t. the AWGN channel exceeds a predefined threshold. 
This criterion has never been studied in the literature, although it is important for delay-sensitive applications in slow fading channels. An analytical form of the delay outage probability is then proposed, which may be useful for avoiding lengthy simulations. These analyses consider a finite packet length and a given modulation and coding scheme (MCS), which makes it possible to study the performance of practical systems. Third, a theoretical analysis of the energy efficiency (bits/joule) in relay-assisted HARQ networks is provided. Based on this analysis, an energy minimization problem in multiuser relay-assisted downlink cellular networks is investigated. Each user has an average delay constraint to be satisfied such that a total power constraint in the system is respected. The BS is assumed to know only the average channel statistics and to have no instantaneous channel state information (CSI). Finally, an algorithm is proposed that jointly allocates the optimal power at the BS and the relay stations and selects the optimal relay in order to satisfy the delay constraints of the users. The simulations show the improvement in energy consumption of relay-assisted techniques compared to non-aided transmission in delay-constrained systems. Hence, the work proposed in this thesis can give useful insights for engineering rules in the design of next-generation energy-efficient cellular systems. Réseaux coopératifs Hybrid-ARQ Resource allocation Energy efficiency Throughput maximization Cooperative communication Green communications Retransmission technique Quality of Service (QoS) PHY/MAC layers 621.382
10	Memory-aware algorithms : from multicores to large scale platforms / Algorithmes orientés mémoire : des processeurs multi-cœurs aux plates-formes à grande échelle Jacquelin, Mathias 20 July 2011 This thesis addresses algorithms suited to hierarchical memory architectures, found notably in multicore processors. We first study the matrix product on multicore processors. We model the processor, bound the communication volume, present three algorithms that reduce this communication volume, and validate their performance. We then study QR factorization for matrices with more rows than columns. We revisit existing algorithms in order to exploit multicore processors, analyze their critical paths, show that some are asymptotically optimal, and analyze their performance. We then study pipelined applications on a heterogeneous platform, the QS 22. We model the platform and apply steady-state scheduling techniques. We introduce a mixed-integer linear program that yields an optimal solution, as well as a set of heuristics. Next, we minimize the memory required by an application modeled as a tree, on a platform with two levels of memory. We present an optimal algorithm and show that there exist trees for which postorder traversals are arbitrarily bad. We then study the minimization of the I/O volume for a given memory size, show that this problem is NP-complete, and present heuristics. Finally, we compare several archival policies for BLUE WATERS. We introduce two archival policies that improve on the performance of the RAIT policy, model the storage platform, and simulate its operation. 
/ This thesis focuses on memory-aware algorithms tailored for hierarchical memory architectures, found for instance within multicore processors. We first study the matrix product on multicore architectures. We model such a processor, and derive lower bounds on the communication volume. We introduce three ad hoc algorithms, and experimentally assess their performance. We then target a more complex operation: the QR factorization of tall matrices. We revisit existing algorithms to better exploit the parallelism of multicore processors. We thus study the critical paths of many algorithms, prove some of them to be asymptotically optimal, and assess their performance. In the next study, we focus on scheduling streaming applications onto a heterogeneous multicore platform, the QS 22. We introduce a model of the platform and use steady-state scheduling techniques so as to maximize the throughput. We present a mixed integer programming approach that computes an optimal solution, and propose simpler heuristics. We then focus on minimizing the amount of required memory for tree-shaped workflows, and target a classical two-level memory system. I/O operations represent transfers from one memory to the other. We propose a new exact algorithm, and show that there exist trees where postorder traversals are arbitrarily bad. We then study the problem of minimizing the I/O volume for a given memory, show that it is NP-hard, and provide a set of heuristics. Finally, we compare archival policies for BLUE WATERS. We introduce two archival policies and adapt the well-known RAIT strategy. We provide a model of the tape storage platform, and use it to assess the performance of the three policies through simulation. 
Hiérarchies mémoire Ordonnancement Régime permanent Plates-formes hétérogènes Méthodes heuristiques Optimisation Programmes linéaires Maximisation du débit Contraintes mémoire Multicoeur Memory hierarchy Scheduling Steady-state Heterogeneous platforms Heuristics Optimization Linear algebra Throughput maximization Memory constraints Multicore
