81 |
Network architectures and energy efficiency for high performance data centers / Architectures réseaux et optimisation d'énergie pour les centres de données massives. Baccour, Emna, 30 June 2017.
The growth of online services and the advent of big data have brought the internet into every aspect of our lives: communication and information exchange (e.g., Gmail and Facebook), web search (e.g., Google), online shopping (e.g., Amazon) and video streaming (e.g., YouTube). All of these services are hosted in physical facilities called data centers, which are responsible for storing, managing and providing fast access to all of this data. All of the equipment making up a company's information system (mainframes, servers, storage arrays, network and telecommunication equipment, etc.) can be consolidated in these data centers. This technological evolution has driven an exponential growth of data centers, which raises problems of equipment installation cost, energy consumption, heat dissipation and the performance of the services offered to customers. Scalability, performance, cost, reliability, energy consumption and maintenance have thus become major challenges for these data centers. Motivated by these challenges, the research community has begun to explore new routing mechanisms and algorithms, as well as new architectures, to improve data center quality of service. In this thesis we developed new algorithms and architectures that combine the advantages of the proposed solutions while avoiding their limitations. The topics addressed in this project are: 1) proposing new topologies and studying their properties, performance and construction cost; 2) designing routing algorithms and models that reduce energy consumption while accounting for complexity and fault tolerance; 3) designing protocols and queue-management systems that provide good quality of service; 4) evaluating the new systems by comparing them to other architectures and models in realistic environments.

The increasing trend to migrate applications, computation and storage into more robust systems leads to the emergence of mega data centers hosting tens of thousands of servers. As a result, designing a data center network that interconnects this massive number of servers and provides efficient, fault-tolerant routing services is becoming an urgent need and a challenge that is addressed in this thesis. Since this is an active research topic, many solutions have been proposed, such as new interconnection technologies and new routing algorithms for data centers. However, many of these solutions suffer from performance problems or can be quite costly. In addition, little effort has been devoted to quality of service and power efficiency in data center networks. To provide a novel solution that avoids the drawbacks of prior work while retaining its advantages, we propose new data center interconnection networks aimed at building a scalable, cost-effective, high-performance and QoS-capable networking infrastructure. In addition, we implement power-aware algorithms to make the network energy efficient.
Hence, we particularly investigate the following issues: 1) establishing the architectural and topological properties of the newly proposed data centers and evaluating their performance and their ability to remain robust in a faulty environment; 2) proposing routing, load-balancing, fault-tolerance and power-efficient algorithms for our architectures, and examining their complexity and how well they satisfy the system requirements; 3) integrating quality of service; 4) comparing our proposed data centers and algorithms to existing solutions in a realistic environment. In this thesis we therefore first study the existing models, then propose improvements and suggest new methodologies and algorithms.
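As an illustration only (not the algorithms developed in this thesis), a power-aware routing heuristic in the spirit of traffic consolidation can be sketched as follows; the link capacity and utilization cap are assumed example values.

```python
# Illustrative sketch only: a greedy, consolidation-style heuristic in the
# spirit of power-aware DCN routing. It is NOT the thesis' algorithms; the
# constants below are assumptions chosen for the example.

LINK_CAPACITY = 10e9      # 10 Gbps per core link (assumed)
MAX_UTILIZATION = 0.8     # headroom kept for fault tolerance (assumed)

def consolidate(flows_bps, num_core_links):
    """Pack flow demands onto as few core links as possible (first-fit
    decreasing), so the remaining links/switches can be put to sleep."""
    budget = LINK_CAPACITY * MAX_UTILIZATION
    links = [0.0] * num_core_links          # current load per link
    assignment = {}
    for flow_id, demand in sorted(flows_bps.items(), key=lambda kv: -kv[1]):
        for i, load in enumerate(links):
            if load + demand <= budget:
                links[i] = load + demand
                assignment[flow_id] = i
                break
        else:
            raise RuntimeError(f"flow {flow_id} does not fit; admission control needed")
    active = sum(1 for load in links if load > 0)
    return assignment, active

flows = {"f1": 4e9, "f2": 3e9, "f3": 2e9, "f4": 1e9}
assignment, active_links = consolidate(flows, num_core_links=4)
print(assignment, f"{active_links} of 4 core links stay powered on")
```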
|
82 |
Software-defined datacenter network debugging. Tammana, Praveen Aravind Babu, January 2018.
Software-defined networking (SDN) enables flexible network management, but as networks evolve toward large numbers of end-points with diverse network policies, higher speeds, and higher utilization, the abstraction SDN introduces makes monitoring and debugging network problems increasingly difficult. Some problems impact packet processing in the data plane (e.g., congestion), while others cause policy deployment failures (e.g., hardware bugs); both create inconsistency between operator intent and actual network behavior. Existing debugging tools are not sufficient to accurately detect, localize, and understand the root cause of problems observed in large-scale networks: either they lack in-network resources (compute, memory, and/or network bandwidth) or they take a long time to debug network problems. This thesis presents three debugging tools, PathDump, SwitchPointer, and Scout, and a technique for tracing packet trajectories called CherryPick. We call for a different approach to network monitoring and debugging: rather than implementing debugging functionality entirely in-network, we should carefully partition the debugging tasks between end-hosts and network elements. Toward this direction, we present CherryPick, PathDump, and SwitchPointer. The core of CherryPick is to cherry-pick the links that are key to representing an end-to-end path of a packet and to embed the picked link IDs into the packet's header on its way to the destination. PathDump is an end-host based network debugger built on tracing packet trajectories; it exploits resources at the end-hosts to implement various monitoring and debugging functionalities. PathDump currently runs over a real network comprising only commodity hardware and yet can support a surprisingly large class of network debugging problems with minimal in-network functionality. The key contribution of SwitchPointer is to efficiently provide network visibility to end-host based network debuggers such as PathDump by using switch memory as a "directory service": each switch, rather than storing the telemetry data necessary for debugging functionalities, stores pointers to the end hosts where the relevant telemetry data is stored. This design choice of treating switch memory as a directory service makes it possible to solve performance problems that were hard or infeasible with existing designs. Finally, we present and solve a network policy fault localization problem that arises in operating policy management frameworks for a production network. We develop Scout, a fully automated system that localizes faults in a large-scale policy deployment and further pinpoints the physical-level failures that are the most likely cause of the observed faults.
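A minimal sketch of the "switch memory as a directory service" idea, under a deliberately simplified model (per-epoch sets of host IDs; the real SwitchPointer design and data structures differ):

```python
# Minimal sketch of the directory-service idea (assumed, simplified model):
# switches keep only pointers (host IDs) per time epoch, while full
# per-packet telemetry stays at the end-hosts.
from collections import defaultdict

class Switch:
    def __init__(self, name):
        self.name = name
        self.directory = defaultdict(set)   # epoch -> {host_id, ...}

    def forward(self, packet, epoch):
        self.directory[epoch].add(packet["src_host"])

class EndHost:
    def __init__(self, host_id):
        self.host_id = host_id
        self.telemetry = []                 # detailed per-packet records

    def record(self, packet, epoch):
        self.telemetry.append({"epoch": epoch, **packet})

def debug_epoch(switch, hosts, epoch):
    """Debugger: ask the switch only *who* to query, then pull the detailed
    telemetry from those end-hosts."""
    return [rec for hid in switch.directory.get(epoch, ())
            for rec in hosts[hid].telemetry if rec["epoch"] == epoch]

# Tiny usage example
s = Switch("tor1")
hosts = {"h1": EndHost("h1"), "h2": EndHost("h2")}
pkt = {"src_host": "h1", "dst_host": "h2", "flow": "f42", "bytes": 1500}
s.forward(pkt, epoch=7)
hosts["h1"].record(pkt, epoch=7)
print(debug_epoch(s, hosts, epoch=7))
```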
|
83 |
Reduced-Order Modeling of Multiscale Turbulent Convection: Application to Data Center Thermal Management. Rambo, Jeffrey D., 27 March 2006.
Data centers are computing infrastructure facilities used by industries with large data processing needs, and the rapid increase in power density of high performance computing equipment has caused many thermal issues in these facilities. Systems-level thermal management requires modeling and analysis of complex fluid flow and heat transfer processes across several decades of length scales. Conventional computational fluid dynamics and heat transfer techniques for such systems are severely limited as design tools because their large model sizes render parameter sensitivity studies and optimization impractically slow.
The traditional proper orthogonal decomposition (POD) methodology has been reformulated to construct physics-based models of turbulent flows and forced convection. Orthogonal-complement POD subspaces were developed to parametrize inhomogeneous boundary conditions and greatly extend the existing POD methodology beyond prototypical flows with fixed parameters. A flux matching procedure was devised to overcome the limitations of Galerkin projection methods for the Reynolds-averaged Navier-Stokes equations and greatly improve the computational efficiency of the approximate solutions. An implicit coupling procedure was developed to link the temperature and velocity fields and further extend the low-dimensional modeling methodology to conjugate forced convection heat transfer. The overall reduced-order modeling framework was able to reduce numerical models containing 10⁵ degrees of freedom (DOF) down to fewer than 20 DOF, while still retaining greater than 90% accuracy over the domain.
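For illustration, the following sketch shows the standard snapshot-POD construction via the singular value decomposition on synthetic data; it demonstrates only the generic basis extraction and projection, not the flux-matching procedure or orthogonal-complement subspaces developed in this work.

```python
# Sketch of snapshot POD via SVD on synthetic data (illustrative only; the
# thesis' flux-matching and boundary-condition treatments are not shown here).
import numpy as np

rng = np.random.default_rng(0)
n_dof, n_snapshots, n_modes = 10_000, 40, 10

# Synthetic snapshot matrix: a few smooth spatial modes plus small noise
x = np.linspace(0.0, 1.0, n_dof)
modes = np.stack([np.sin((k + 1) * np.pi * x) for k in range(5)], axis=1)
coeffs = rng.normal(size=(5, n_snapshots)) * np.array([[5, 3, 2, 1, 0.5]]).T
snapshots = modes @ coeffs + 0.01 * rng.normal(size=(n_dof, n_snapshots))

# POD basis = leading left singular vectors of the snapshot matrix
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
basis = U[:, :n_modes]                       # n_dof x n_modes

# Project a field onto the low-dimensional subspace and reconstruct it
field = snapshots[:, 0]
a = basis.T @ field                          # n_modes reduced coordinates
field_rom = basis @ a
rel_err = np.linalg.norm(field - field_rom) / np.linalg.norm(field)
print(f"{n_dof} DOF -> {n_modes} DOF, relative error {rel_err:.2e}")
```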
Rigorous a posteriori error bounds were formulated by using the POD subspace to partition the error contributions and dual residual methods were used to show that the flux matching procedure is a computationally superior approach for low-dimensional modeling of steady turbulent convection.
To efficiently model large-scale systems, individual reduced-order models were coupled using flow network modeling as the component interconnection procedure. The development of handshaking procedures between low-dimensional component models lays the foundation for quickly analyzing and optimizing the modular systems encountered in electronics thermal management. This modularized approach can also serve as a skeletal structure to allow the efficient integration of highly specialized models across disciplines and significantly advance simulation-based design.
|
84 |
Monitoring and analysis system for performance troubleshooting in data centers. Wang, Chengwei, 13 January 2014.
Not long ago, on Christmas Eve 2012, a troubleshooting war began in Amazon's data centers. It started at 12:24 PM with a mistaken deletion of the state data of the Amazon Elastic Load Balancing service (ELB for short), which was not noticed at the time. The mistake first led to a local issue in which a small number of ELB service APIs were affected. In about six minutes it evolved into a critical one in which EC2 customers were significantly affected. For example, Netflix, which was using hundreds of Amazon ELB instances, experienced an extensive streaming-service outage in which many customers could not watch TV shows or movies on Christmas Eve. It took Amazon engineers 5 hours and 42 minutes to find the root cause, the mistaken deletion, and another 15 hours and 32 minutes to fully recover the ELB service. The war ended at 8:15 AM the next day and brought performance troubleshooting in data centers to the world's attention. As this Amazon ELB case shows, troubleshooting runtime performance issues is crucial in time-sensitive multi-tier cloud services because of their stringent end-to-end timing requirements, but it is also notoriously difficult and time consuming.

To address this troubleshooting challenge, this dissertation proposes VScope, a flexible monitoring and analysis system for online troubleshooting in data centers. VScope provides primitive operations that data center operators can use to troubleshoot various performance issues. Each operation is essentially a series of monitoring and analysis functions executed on an overlay network. We design a novel software architecture for VScope so that the overlay networks can be generated, executed and terminated automatically and on demand. On the troubleshooting side, we design novel anomaly detection algorithms and implement them in VScope; by running them, data center operators are notified when performance anomalies occur. We also design a graph-based guidance approach, called VFocus, which tracks the interactions among hardware and software components in data centers. VFocus provides primitive operations with which operators can analyze these interactions to find out which components are relevant to a performance issue.

VScope's capabilities and performance are evaluated on a testbed with over 1000 virtual machines (VMs). Experimental results show that the VScope runtime negligibly perturbs system and application performance and requires mere seconds to deploy monitoring and analytics functions on over 1000 nodes. This demonstrates VScope's ability to support fast operation and online queries against a comprehensive set of application- to system/platform-level metrics, and a variety of representative analytics functions. When supporting algorithms with high computational complexity, VScope serves as a 'thin layer' that accounts for no more than 5% of their total latency. Further, by using VFocus, VScope can locate problematic VMs that cannot be found via application-level monitoring alone, and in one of the use cases explored in the dissertation it operates with levels of perturbation over 400% lower than those seen for brute-force and most sampling-based approaches. We also validate VFocus with real-world data center traces. The experimental results show that VFocus has a troubleshooting accuracy of 83% on average.
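As a toy illustration of the kind of primitive an operator might deploy through such a system (not VScope's actual algorithms or the VFocus guidance), a per-VM z-score detector over metric streams could look like this:

```python
# Toy anomaly-detection primitive over per-VM metric streams (illustrative
# only; VScope's detection algorithms and VFocus differ from this sketch).
import statistics

def zscore_anomalies(samples_by_vm, threshold=3.0):
    """Flag VMs whose latest metric sample deviates strongly from that VM's
    own recent history (simple z-score test)."""
    flagged = []
    for vm, samples in samples_by_vm.items():
        history, latest = samples[:-1], samples[-1]
        mu = statistics.mean(history)
        sigma = statistics.pstdev(history) or 1e-9   # avoid division by zero
        if abs(latest - mu) / sigma > threshold:
            flagged.append(vm)
    return flagged

metrics = {
    "vm-01": [12, 11, 13, 12, 12, 14, 12, 13, 12, 48],   # latency spike
    "vm-02": [12, 13, 12, 12, 11, 13, 12, 12, 13, 12],
}
print(zscore_anomalies(metrics))   # -> ['vm-01']
```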
|
85 |
Waste heat recovery in data centers: ejector heat pump analysis. Harman, Thomas David, V, 24 November 2008.
The purpose of this thesis is to examine possible waste heat recovery methods in data centers. Predictions indicate that in the next decade data center racks may dissipate 70 kW of heat, up from current levels of 10-15 kW. Due to this increase, solutions must be found to increase the efficiency of data center cooling. This thesis examines possible waste heat recovery technologies that would improve energy efficiency. Possible approaches include phase change materials, thermoelectrics, thermomagnetics, vapor compression cycles, and absorption and adsorption systems. After a thorough evaluation of the possible waste heat engines, the use of an ejector heat pump was evaluated in detail. The principle behind an ejector heat pump is very similar to a vapor compression cycle; however, the compressor is replaced with a pump, a boiler and an ejector. These three components have fewer moving parts and are more cost effective than a comparable compressor, despite a lower efficiency. The system is examined under general data center operating conditions, with a heat load of around 15-20 kW and air temperatures near 85°C. A parametric study is conducted to determine the viability and cost effectiveness of this system in the data center, including various environmentally friendly working fluids that satisfy the low temperature ranges found in a data center. It is determined that ammonia presents the best option as a working fluid for this application. Using this system, a coefficient of performance (COP) of 1.538 at 50°C can be realized, resulting in an estimated 373,000 kWh saved over a year and a $36,425 reduction in annual cost. Finally, recommendations for implementation are considered to allow for future design and testing of this viable waste heat recovery device.
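As a rough sanity check (not part of the thesis), the electricity price implied by the quoted annual savings and the driving input implied by the reported COP can be back-calculated; the 17.5 kW load and the assumption that the dollar savings come purely from avoided electricity are ours, not the author's.

```python
# Back-of-the-envelope check of the quoted figures (assumption: the dollar
# savings come purely from the 373,000 kWh of electricity avoided per year).
cop = 1.538                    # coefficient of performance at 50 C (from abstract)
annual_kwh_saved = 373_000     # kWh/year (from abstract)
annual_dollars_saved = 36_425  # $/year (from abstract)

# Implied average electricity price behind the quoted savings
implied_price = annual_dollars_saved / annual_kwh_saved
print(f"implied electricity price ~ ${implied_price:.3f}/kWh")   # ~ $0.098/kWh

# Generic heat-pump COP relation: heat moved per unit of driving input
heat_load_kw = 17.5            # midpoint of the 15-20 kW rack load (assumption)
driving_input_kw = heat_load_kw / cop
print(f"driving input for a {heat_load_kw} kW load ~ {driving_input_kw:.1f} kW")
```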
|
86 |
Periodic Data Structures for Bandwidth-intensive Applications. Albanese, Ilijc, 12 January 2015.
Current telecommunication infrastructure is undergoing significant changes. These changes involve the type of traffic traveling through the network as well as the requirements imposed by the new traffic mix (e.g., strict delay control and low end-to-end delay). In this new networking scenario, the current infrastructure, which has remained almost unchanged for the last several decades, is struggling to adapt, and its limitations in terms of power consumption, scalability, and economic viability have become more evident.
In this dissertation we explore the potential advantages of using periodic data structures to efficiently handle bandwidth-intensive transactions, which constitute a significant portion of today's network traffic.
We start by implementing an approach that can work as a standalone system, aiming to provide the same advantages promised by all-optical approaches such as optical burst switching (OBS) and optical flow switching (OFS). We show that our approach provides similar advantages (e.g., energy efficiency, link utilization, and low computational load for the network hardware) while avoiding the drawbacks (e.g., use of optical buffers, inefficient resource utilization, and costly deployment), using commercially available hardware.
Aware of the issues of large scale hardware redeployment, we adapt our approach to work within the current transport network architecture, reusing most of the hardware and protocols that are already in place, offering a more gradual evolutionary path, while retaining the advantages of our standalone system.
We then apply our approach to Data Center Networks (DCNs), showing its ability to achieve significant improvements in terms of network performance stability, predictability, performance isolation, agility, and goodput with respect to popular DCN approaches. We also show our approach is able to work in concert with many proposed and deployed DCN architectures, providing DCNs with a simple, efficient, and versatile protocol to handle bandwidth-intensive applications within the DCs.
|
87 |
Design of Modularized Data Center with a Wooden Construction / Design av modulariserade datacenter med en träkonstruktion. Gille, Marika, January 2017.
The purpose of this thesis is to investigate the possibility of building a modular data center in wood. The goal is to investigate how to build data centers using building-system modules, making it easier to build more flexible data centers and to expand the business later on. Investigations have been conducted to find out the advantages and disadvantages of using wood in a modularized data center structure. The investigation also includes analysing the effect of moisture on the material and whether there are advantages beyond the environmental benefits of using wood as a building material. A literature study was conducted to examine where research has already been carried out and how those studies are applicable to this thesis. Although the ICT sector is a rapidly growing industry, little research has been published on how to build a data center; most published information concerns electrical and cooling systems, not the dimensions of the building or how materials are affected by the special climate in a data center. To complement the limited research, interviews were conducted and site visits were made. Interviews were conducted with Hydro66, RISE SICS North, Sunet and Swedish Modules, while site visits were made at Hydro66, RISE SICS North, Sunet and Facebook. As a result of these studies, limitations were identified with regard to maximum and minimum dimensions for the building system and service spaces in a data center. These limitations were used as input when designing a construction proposal using the stated building systems and a design proposal for a data center. During the study, access was granted to measurements of temperature and humidity for the incoming and outgoing air of the Hydro66 data center. These measurements have been analyzed together with facts about HVAC systems and the climate's effect on wood, for example with regard to strength and stability. This analysis has shown that more data needs to be collected during the winter and that further analysis is needed before conclusions can be drawn about whether the indoor climate of a data center affects the wooden structure. A design proposal for a data center has been produced based on the information gathered through the literature and empirical studies; the proposal was designed to show how the information could be implemented. The results have increased the understanding of how to build data center buildings in wood and how this type of building could be made more flexible towards future changes through modularization.
|
88 |
Energy Efficient Cloud Computing: Techniques and Tools. Knauth, Thomas, 22 April 2015.
Data centers hosting internet-scale services consume megawatts of power. Mainly for cost reasons, but also to address environmental concerns, data center operators are interested in reducing their energy use.
This thesis investigates whether and how hardware virtualization helps to improve the energy efficiency of modern cloud data centers. Our main motivation is to power off unused servers to save energy. The work encompasses three major parts: First, a simulation-driven analysis quantifies the benefits of known reservation times in infrastructure clouds; virtual machines with similar expiration times are co-located to increase the probability of powering down unused physical hosts. Second, we propose and prototype a system to deliver truly on-demand cloud services; idle virtual machines are suspended to free resources and as a first step toward powering off the physical server. Third, a novel block-level data synchronization tool enables fast and efficient state replication; frequent state synchronization is necessary to prevent data unavailability, since powering down a server disables access to its locally attached disks and any data stored on them.
The techniques effectively reduce the overall number of required servers either through optimized scheduling or by suspending idle virtual machines. Fewer live servers translate into proportional energy savings, as the unused servers must no longer be powered.
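As an illustration of the reservation-time idea (a minimal sketch, not the thesis' simulation models or prototype), VMs whose leases expire in the same time window can be packed onto the same host so that the host drains completely and can be powered off; the slot count and window size below are assumptions.

```python
# Minimal sketch of expiration-aware placement (illustrative; the thesis'
# policies are more elaborate). VMs whose leases end in the same time window
# are packed onto the same host so it drains completely and can be powered off.
from collections import defaultdict

HOST_SLOTS = 8                 # VMs per physical host (assumption)
WINDOW = 3600                  # group leases by the hour (assumption, seconds)

def place(vms):
    """vms: dict vm_id -> expiration timestamp. Returns host assignments."""
    by_window = defaultdict(list)
    for vm, expires in sorted(vms.items(), key=lambda kv: kv[1]):
        by_window[expires // WINDOW].append(vm)

    hosts, host_id = {}, 0
    for window in sorted(by_window):
        group = by_window[window]
        for i in range(0, len(group), HOST_SLOTS):
            hosts[f"host-{host_id}"] = group[i:i + HOST_SLOTS]
            host_id += 1
    return hosts

leases = {"vm-a": 3_600, "vm-b": 3_900, "vm-c": 7_300, "vm-d": 7_250}
print(place(leases))
# vm-a/vm-b expire in the same hour and share a host; vm-c/vm-d share another.
```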
|
89 |
Applications of Traditional and Concentrated Photovoltaic Technologies for Reducing Electricity Costs at Ontario Data Centers. Tomosk, Steven, January 2016.
Demand for cloud-based applications and remote digital storage is increasing. As such, data center capacities will need to expand to support this shift in computing. Data centers consume substantial amounts of electricity in support of their operations, and larger data centers will mean that more energy is consumed. To reduce electricity bills, data center operators must explore innovative options, and this thesis proposes leveraging solar technology for this purpose. Three different photovoltaic and concentrated photovoltaic costing scenarios, as well as four different Ontario-based electricity tariff scenarios – time-of-use, feed-in tariff, power purchase agreement, and a peak-dependent electricity charge involving the province's global adjustment fee – will be used to determine if there is a business case for using solar technology at data centers in Ontario to reduce energy costs. Discounted net present value, return on investment, internal rate of return, and levelized cost of electricity will be calculated to determine the economic viability of solar for this application, and both deterministic and stochastic results will be provided. Sensitivity of the four metrics to variability in energy yield, operations and maintenance costs, and system prices will also be presented.
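For readers unfamiliar with the economic metrics named above, the following self-contained sketch shows how discounted net present value, a simple return on investment, and the levelized cost of electricity are computed; all input figures are placeholders, not results from the thesis.

```python
# Illustrative calculation of the evaluation metrics named above (NPV, ROI,
# LCOE); every input number below is a placeholder, not a value from the thesis.
def npv(rate, cashflows):
    """Discounted net present value; cashflows[0] is the year-0 outlay."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

def lcoe(capex, annual_opex, annual_kwh, rate, years):
    """Levelized cost of electricity: discounted costs / discounted energy."""
    costs = capex + sum(annual_opex / (1 + rate) ** t for t in range(1, years + 1))
    energy = sum(annual_kwh / (1 + rate) ** t for t in range(1, years + 1))
    return costs / energy

capex, years, rate = 250_000.0, 20, 0.06          # placeholder assumptions
annual_savings, annual_opex, annual_kwh = 28_000.0, 3_000.0, 180_000.0

cashflows = [-capex] + [annual_savings - annual_opex] * years
project_npv = npv(rate, cashflows)
roi = (sum(cashflows[1:]) - capex) / capex        # simple, undiscounted ROI
print(f"NPV ${project_npv:,.0f}, ROI {roi:.0%}, "
      f"LCOE ${lcoe(capex, annual_opex, annual_kwh, rate, years):.3f}/kWh")
```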
|
90 |
Paving the Way for Next Generation Wireless Data Center Networks. AlGhadhban, Amer M., 05 1900.
Data centers (DCs) have become an intrinsic element of emerging technologies such as big data, artificial intelligence, and cloud services, all of which entail interconnected and sophisticated computing and storage resources. Recent studies of conventional data center networks (DCNs) revealed two key challenges: a biased distribution of inter-rack traffic and unidentified flow classes, namely delay-sensitive mice flows (MFs) and throughput-hungry elephant flows (EFs). Unfortunately, existing DCN topologies support only a uniform distribution of capacities, provide limited bandwidth flexibility, and lack an efficient flow classification mechanism.
Fortunately, wireless DCs can leverage emerging wireless communication technologies, such as multi-terabit free-space optics (FSO), to provide flexible and reconfigurable DCN topologies. It is worth noting that indoor FSO links are not subject to the channel impairments that affect outdoor FSO; consequently, indoor FSO links are more robust and can offer high bandwidth with long-term stability, which can be further enhanced with wavelength division multiplexing (WDM) methods. In this thesis, we alleviate bandwidth inefficiency with FSO links that offer the desired agility, allocating transmission power to adapt link capacity to dynamically changing traffic conditions and to reduce maintenance costs and overhead.
Since routing the two flow classes along the same path has undesirable consequences, DC researchers have proposed traffic management solutions that treat them separately. However, these solutions either suffer from packet reordering and high queuing delay, or lack accurate visibility into and estimation of end-to-end path status. Alternatively, we leverage WDM to design elastic network topologies in which part of the wavelengths are assigned to route MFs and the remaining ones to EFs. Since bandwidth demands can be lower than the available capacity of WDM channels, we use traffic grooming to aggregate multiple flows into a larger one and enhance link utilization.
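A minimal sketch of the grooming step described above, assuming fixed-capacity wavelength channels and a simple first-fit policy (the thesis' wavelength assignment and elastic topology design are more sophisticated):

```python
# First-fit sketch of grooming sub-wavelength flow demands onto WDM channels
# (illustrative only). The channel capacity is an assumed example value.
CHANNEL_GBPS = 100.0           # capacity of one wavelength channel (assumed)

def groom(demands_gbps):
    """Aggregate flow demands onto as few wavelength channels as possible."""
    channels = []                                  # list of [load, [flows]]
    for flow, demand in sorted(demands_gbps.items(), key=lambda kv: -kv[1]):
        for ch in channels:
            if ch[0] + demand <= CHANNEL_GBPS:
                ch[0] += demand
                ch[1].append(flow)
                break
        else:
            channels.append([demand, [flow]])
    return channels

elephant_demands = {"ef1": 60, "ef2": 45, "ef3": 30, "ef4": 20, "ef5": 10}
for load, flows in groom(elephant_demands):
    print(f"wavelength carries {flows} at {load} Gbps")
```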
On the other hand, to reap the benefits of the proposed WDM-isolated topology, an accurate and fast EF detection mechanism is necessary. Accordingly, we propose a scheme that exploits TCP communication behavior and collects indicative packets for its flow classification algorithm; it demonstrates perfect flow classification accuracy and is orders of magnitude faster than existing solutions, with low communication and computation overhead.
|