
Hybrid Surrogate Model for Pressure and Temperature Prediction in a Data Center and Its Application

One of the crucial challenges in data center (DC) operation is inefficient thermal management, which leads to excessive energy waste. The information technology (IT) equipment and cooling systems of a DC are major contributors to power consumption. Additionally, failure of a DC cooling system leads to higher operating temperatures, causing critical electronic devices, such as servers, to fail, which in turn leads to significant economic loss. Improvements can be made in two ways: (1) better design of the DC architecture and (2) optimization of the system for better heat transfer from hot servers.
Row-based cooling is a suitable DC configuration that reduces energy costs by improving airflow distribution. Here, the IT equipment is contained within an enclosure that includes a cooling unit, which separates cold and back chambers to eliminate hot air recirculation and cold air bypass, both of which produce undesirable airflow distributions. Owing to their scalability, ease of implementation, and operational cost, row-based systems have also gained popularity for DC computing applications. However, a general thermal model is required to predict spatiotemporal temperature changes inside the DC and to apply appropriate strategies. As yet, only primitive tools have been developed; they are time-consuming and produce unacceptable errors during extrapolative predictions. We address these deficiencies by developing a rapid, adaptive, and accurate hybrid model that combines a data-driven model (DDM) with thermofluid transport relations to predict temperatures in a DC. Our hybrid model has interpolative prediction errors below 0.7 °C and extrapolative errors less than half those of black-box models. Additionally, when the studied DC configuration is changed, for example the locations of the cooling unit fans and servers, a few zones show prediction errors above 2 °C.
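The idea of combining a data-driven model with transport relations can be illustrated with a minimal sketch. The abstract does not give the model equations, so everything below is an assumption made for illustration: the physics part is a steady-state energy balance across a server, the data-driven part is a linear least-squares correction of the physics residual, and the function names and the synthetic data are hypothetical rather than taken from the thesis.

```python
import numpy as np

CP_AIR = 1005.0  # J/(kg K), approximate specific heat of air (assumed constant)


def physics_outlet_temp(t_in, power_w, m_dot):
    """Steady-state energy balance: T_out = T_in + P / (m_dot * cp)."""
    return t_in + power_w / (m_dot * CP_AIR)


def _design_matrix(features):
    """Append a constant column so the correction can include an offset."""
    return np.column_stack([features, np.ones(len(features))])


def fit_residual(features, measured, physics_pred):
    """Fit a linear correction to what the physics model fails to explain."""
    coef, *_ = np.linalg.lstsq(
        _design_matrix(features), measured - physics_pred, rcond=None
    )
    return coef


def hybrid_predict(features, physics_pred, coef):
    """Hybrid prediction = physics baseline + learned residual correction."""
    return physics_pred + _design_matrix(features) @ coef


# Synthetic operating points (illustrative values, not thesis data).
t_in = np.array([18.0, 20.0, 22.0, 19.0, 21.0, 24.0])     # inlet temperature, deg C
power = np.array([200.0, 300.0, 400.0, 500.0, 250.0, 350.0])  # server power, W
m_dot = np.array([0.03, 0.05, 0.04, 0.06, 0.035, 0.045])  # air mass flow, kg/s

baseline = physics_outlet_temp(t_in, power, m_dot)
# Pretend the measurements deviate from the physics by a power-dependent bias.
measured = baseline + 0.002 * power + 0.5

features = np.column_stack([t_in, power, m_dot])
coef = fit_residual(features, measured, baseline)
hybrid = hybrid_predict(features, baseline, coef)
```

In this toy setup the correction term absorbs the systematic bias that the energy balance alone misses, while the physics baseline carries the dependence on power and airflow. A real model would use measured data and a richer data-driven component.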
Existing methods for cooling unit fault detection and diagnosis (FDD) are designed to handle individually occurring faults but have difficulty with simultaneous faults. We apply a gray-box model in a case study to detect and diagnose cooling unit fan and pump failures in a row-based DC cooling system. Fast detection of anomalous behavior saves energy and reduces operational costs by initiating remedial actions. Cooling unit fans and pumps are relatively low-reliability components, and the failure of one or more of them can cause the entire system to overheat. Therefore, appropriate energy-saving strategies depend largely on the accuracy and timeliness of temperature prediction models. We used our gray-box model to produce thermal maps of the DC airspace for single as well as simultaneous failure conditions, which are fed as inputs to two different data-driven classifiers, a convolutional neural network (CNN) and a recurrent neural network (RNN), to rapidly predict multiple simultaneous failures. Our FDD strategy can detect and diagnose multiple faults with accuracy as high as 100% while requiring relatively few simultaneous-fault training samples.
Thesis / Candidate in Philosophy

Identifier oai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/26703
Date January 2021
Creators Sahar Asgari
Contributors Ishwar K. Puri, Rong Zheng, Mechanical Engineering
Source Sets McMaster University
Language English
Detected Language English
Type Thesis
