371

Parallel optimization based operational planning to enhance the resilience of large-scale power systems

Gong, Lin 01 May 2020 (has links)
The resilience of power systems has attracted extensive attention in recent years and needs to be further enhanced, as potential threats from severe events such as extreme weather, geomagnetic storms, and extended fuel disruptions, which are difficult to quantify, predict, or anticipate, continue to challenge the modern power industry. To increase resilience, proper operational planning that considers the potential impacts of severe events can enable power systems to prepare for, operate through, and recover from those events, and mitigate their negative economic, social, and humanitarian consequences by fully deploying existing system resources and operational measures. This dissertation focuses on operational planning problems in the bulk power system under potential threats from severe events, including the co-optimization of security-constrained unit commitment and transmission switching with consideration of transmission line outages likely caused by severe weather events, security-constrained optimal power flow under the potential impacts of geomagnetic storms, and optimal operational planning to protect electricity-natural gas systems from the risk of natural gas supply disruptions. Systematic, comprehensive, and consistent operational strategies must be applied across the entire system to achieve a superior resilience enhancement solution, which, together with the increased size and complexity of modern energy systems, makes the proposed operational planning problems mathematically large-scale, computationally complex, and practically difficult to solve, especially when comprehensive operational measures and resourceful components are incorporated. To tackle this challenge, parallel optimization based approaches are developed that fully decompose an originally large and complex problem into multiple independent small subproblems, solve them simultaneously in a fully parallel manner on scalable multi-core computing platforms, and iteratively coordinate their results using mathematical programming methods to achieve optimal solutions that satisfy the engineering requirements of practical power system operations. As a result, by efficiently solving optimal operational planning problems for large-scale power systems, their secure and economic operation in the presence of severe events such as hurricanes, geomagnetic storms, and natural gas supply disruptions can be ensured, which effectively enhances the resilience of power systems.
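The decompose/solve-in-parallel/coordinate pattern described above can be illustrated with a minimal sketch. The snippet below uses consensus ADMM on toy quadratic subproblems, standing in for regional unit-commitment or OPF subproblems; the dissertation's actual decomposition, models, and coordination method may differ, and the names and parameters here are assumptions for illustration only.

```python
# Minimal sketch of decompose -> parallel solve -> coordinate, via consensus ADMM.
# Toy quadratic subproblems with closed-form updates; a real subproblem would be
# a mixed-integer or nonlinear program handed to a solver.
from multiprocessing import Pool
import numpy as np

RHO = 1.0  # augmented-Lagrangian penalty (assumed value)

def solve_subproblem(args):
    """Closed-form minimiser of f_i(x) + (RHO/2)(x - z + u)^2
    for f_i(x) = 0.5*a*(x - b)^2, standing in for one regional subproblem."""
    a, b, z, u = args
    return (a * b + RHO * (z - u)) / (a + RHO)

def parallel_coordinate(a, b, iters=50):
    n = len(a)
    z, u = 0.0, np.zeros(n)  # consensus value and scaled dual variables
    with Pool() as pool:
        for _ in range(iters):
            # Independent subproblems solved simultaneously on worker processes.
            x = np.array(pool.map(solve_subproblem,
                                  [(a[i], b[i], z, u[i]) for i in range(n)]))
            z = float(np.mean(x + u))  # coordination (consensus) step
            u += x - z                 # dual update
    return z

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, b = rng.uniform(1, 3, 8), rng.uniform(-1, 1, 8)
    # Converges to the weighted mean of b, the global optimum of the toy problem.
    print(parallel_coordinate(a, b))
```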
372

A Framework for Efficient Management of Fault Tolerance in Cloud Data Centres and High-Performance Computing Systems: An Investigation and Performance analysis of a Cloud Based Virtual Machine Success and Failure Rate in a typical Cloud Computing Environment and Prediction Methods

Mohammed, Bashir January 2019 (has links)
Cloud computing is attracting huge attention in both academic research and industry initiatives and has been widely used to solve advanced computation problems. As cloud datacentres continue to grow in scale and complexity, the risk of failure of Virtual Machines (VMs) and hosts running several jobs and processing large amounts of user requests increases, and consequently it becomes even more difficult to predict potential failures within a datacentre. However, even though fault tolerance continues to be an issue of growing concern in cloud and HPC systems, mitigating the impact of failures and providing accurate predictions with enough lead time remains a difficult research problem. Traditional fault-tolerance strategies such as regular checkpoint/restart and replication are not adequate due to emerging complexities in these systems, and do not scale well in the cloud due to resource sharing and distributed system networks. In this thesis, a new reliable fault tolerance scheme using an intelligent optimal strategy is presented to ensure high system availability, reduced task completion time, and an efficient VM allocation process. Specifically: (i) a generic fault tolerance algorithm for cloud data centres and HPC systems in the cloud was developed; (ii) a verification process was developed for the full dimensional VM specification during allocation in the presence of faults; in comparison to existing approaches, the results obtained show an increase in the success rate of the VMs, a reduction in the response time of VM allocation, and improved overall performance; and (iii) a failure prediction model was developed, and the predictive capabilities of machine learning were explored by applying several algorithms to improve the accuracy of prediction. Experimental results indicate that the average prediction accuracy of the proposed model when predicting failure is about 90%, compared to existing algorithms, which implies that the approach can effectively predict potential system and application failures within the system.
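As a rough illustration of point (iii), the sketch below trains a random-forest classifier on synthetic VM telemetry. The feature names, labelling rule, and data are hypothetical; the thesis's actual features, datasets, and algorithms may differ.

```python
# Illustrative failure-prediction sketch (assumptions only, not the thesis's model).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)
n = 5000
# Hypothetical per-VM telemetry: CPU %, memory %, I/O wait, recent restarts.
X = np.column_stack([
    rng.uniform(0, 100, n),   # cpu_util
    rng.uniform(0, 100, n),   # mem_util
    rng.exponential(5, n),    # io_wait
    rng.poisson(0.2, n),      # recent_restarts
])
# Toy labelling rule: heavily stressed or restart-prone VMs are marked as failing.
y = ((X[:, 0] > 85) & (X[:, 1] > 80) | (X[:, 3] > 1)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("prediction accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```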
373

Exploring High Performance SQL Databases with Graphics Processing Units

Hordemann, Glen J. 26 November 2013 (has links)
No description available.
374

Techniques for Characterizing the Data Movement Complexity of Computations

Elango, Venmugil 08 June 2016 (has links)
No description available.
375

Rethinking I/O in High-Performance Computing Environments

Ali, Nawab January 2009 (has links)
No description available.
376

Towards Efficient Data Analysis and Management of Semi-structured Data

Tatikonda, Shirish 08 September 2010 (has links)
No description available.
377

Specification, Configuration and Execution of Data-intensive Scientific Applications

Kumar, Vijay Shiv 14 December 2010 (has links)
No description available.
378

An Application-Attuned Framework for Optimizing HPC Storage Systems

Paul, Arnab Kumar 19 August 2020 (has links)
High performance computing (HPC) is routinely employed in diverse domains, such as life sciences and geology, to simulate and understand the behavior of complex phenomena. Big-data-driven scientific simulations are resource intensive and require both computing and I/O capabilities at scale. There is a crucial need to revisit the HPC I/O subsystem to better optimize for, and manage, the increased pressure on the underlying storage systems from big data processing. Extant HPC storage systems are designed and tuned for a specific set of applications targeting a range of workload characteristics, but they lack the flexibility to adapt to ever-changing application behaviors. The complex nature of modern HPC storage systems, along with ever-changing application behaviors, presents unique opportunities and engineering challenges. In this dissertation, we design and develop a framework for optimizing HPC storage systems by making them application-attuned. We select three different kinds of HPC storage systems: in-memory data analytics frameworks, parallel file systems, and object storage. We first analyze HPC application I/O behavior by studying real-world I/O traces. Next, we optimize parallelism for applications running in memory, then we design data management techniques for HPC storage systems, and finally we focus on low-level I/O load balance for improving the efficiency of modern HPC storage systems. / Doctor of Philosophy / Clusters of multiple computers connected through the internet are often deployed in industry and laboratories for large-scale data processing or computation that cannot be handled by standalone computers. In such a cluster, resources such as CPUs, memory, and disks are integrated to work together. With the increase in popularity of applications that read and write tremendous amounts of data, we need a large number of disks that can interact effectively in such clusters. These disks form part of the high performance computing (HPC) storage system. Such HPC storage systems are used by a diverse set of applications from organizations in a vast range of domains, from earth sciences, financial services, and telecommunications to life sciences. Therefore, the HPC storage system should perform well for the different read and write (I/O) requirements of all these sets of applications, but current HPC storage systems do not cater to such varied I/O requirements. To this end, this dissertation designs and develops a framework for HPC storage systems that is application-attuned and thus provides much better performance than state-of-the-art HPC storage systems without such optimizations.
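As a loose illustration of the low-level I/O load-balancing idea, the sketch below greedily places file stripes on the least-loaded storage target. The target model and stripe sizes are made up, and production parallel file systems use richer placement policies than this.

```python
# Toy stripe placement: always pick the least-loaded storage target.
import heapq

def place_stripes(stripe_sizes, num_targets):
    """Greedily assign each stripe to the target with the least accumulated
    load (a longest-processing-time-first style heuristic)."""
    heap = [(0, t) for t in range(num_targets)]  # (current load, target id)
    heapq.heapify(heap)
    placement = {t: [] for t in range(num_targets)}
    for size in sorted(stripe_sizes, reverse=True):  # largest stripes first
        load, target = heapq.heappop(heap)
        placement[target].append(size)
        heapq.heappush(heap, (load + size, target))
    return placement

if __name__ == "__main__":
    stripes = [64, 8, 32, 16, 64, 4, 128]  # stripe sizes in MiB (made up)
    for target, sizes in sorted(place_stripes(stripes, 3).items()):
        print(f"target {target}: {sizes} -> {sum(sizes)} MiB total")
```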
379

Towards the development of a reliable reconfigurable real-time operating system on FPGAs

Hong, Chuan January 2013 (has links)
In the last two decades, Field Programmable Gate Arrays (FPGAs) have developed rapidly from simple “glue logic” into a powerful platform capable of implementing a System on Chip (SoC). Modern FPGAs achieve not only high performance compared with General Purpose Processors (GPPs), thanks to hardware parallelism and dedicated logic, but also better programming flexibility than Application Specific Integrated Circuits (ASICs). Moreover, the hardware programming flexibility of FPGAs can be further harnessed for both performance and manipulability, which makes Dynamic Partial Reconfiguration (DPR) possible. DPR allows a part or parts of a circuit to be reconfigured at run-time without interrupting the rest of the chip’s operation. As a result, hardware resources can be exploited more efficiently, since chip resources can be reused by swapping hardware tasks onto and off the chip in a time-multiplexed fashion. In addition, DPR improves fault tolerance against transient errors and permanent damage; for example, Single Event Upsets (SEUs) can be mitigated by reconfiguring the FPGA to avoid error accumulation. Furthermore, power and heat can be reduced by removing finished or idle tasks from the chip. For all these reasons, DPR has significantly promoted Reconfigurable Computing (RC) and has become a very active research topic. However, since hardware integration is increasing at an exponential rate, and applications are becoming more complex with the growth of user demands, high-level application design and low-level hardware implementation are increasingly separated and layered. As a consequence, users gain little advantage from DPR without the support of system-level middleware. To bridge the gap between high-level applications and low-level hardware implementation, this thesis presents contributions towards a Reliable, Reconfigurable and Real-Time Operating System (R3TOS), which facilitates user exploitation of DPR from the application level by managing the complex hardware in the background. In R3TOS, hardware tasks behave just like software tasks: they can be created, scheduled, and mapped to different computing resources on the fly. The novel contributions of this work are: 1) a novel implementation of an efficient task scheduler and allocator; 2) the implementation of a novel real-time scheduling algorithm (FAEDF) and two efficacious allocation algorithms (EAC and EVC), which schedule tasks in real time and circumvent emerging faults while maintaining more compact empty areas; 3) the design and implementation of a fault-tolerant microprocessor that harnesses existing FPGA resources such as Error Correction Code (ECC) and configuration primitives; 4) a novel symmetric multiprocessing (SMP)-based architecture that supports a shared-memory programming interface; and 5) two demonstrations of the integrated system: a) the K-Nearest Neighbour classifier, a non-parametric classification algorithm widely used in various fields of data mining, and b) pairwise sequence alignment, namely the Smith-Waterman algorithm, used for identifying similarities between two biological sequences. R3TOS gives considerably higher flexibility to support scalable multi-user, multitasking applications, whereby resources can be dynamically managed with respect to user requirements and hardware availability. Benefiting from this, not only can hardware resources be used more efficiently, but system performance can also be significantly increased.
Results show that scheduling and allocation efficiency improve by up to 2x, and overall system performance is further improved by ~2.5x. Future work includes the development of a Network on Chip (NoC), which is expected to further increase communication throughput, as well as the standardization and automation of our system design, to be carried out in line with the enablement of other high-level synthesis tools, allowing application developers to benefit from the system more efficiently.
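To make the scheduling contribution concrete, here is a toy earliest-deadline-first (EDF) loop of the kind FAEDF builds on. The real algorithm is also finishing-aware (it accounts for when reconfigurable regions free up) and handles faults, both of which this sketch omits; task names and timings are invented.

```python
# Toy EDF scheduler for hardware tasks on a single reconfigurable region.
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class HwTask:
    deadline: int                             # ordering key for the heap
    name: str = field(compare=False)
    exec_time: int = field(compare=False)

def edf_schedule(tasks):
    """Run tasks to completion in earliest-deadline-first order and report
    any deadline misses."""
    ready = list(tasks)
    heapq.heapify(ready)                      # min-heap keyed on deadline
    t = 0
    while ready:
        task = heapq.heappop(ready)
        t += task.exec_time                   # non-preemptive execution
        status = "ok" if t <= task.deadline else "MISSED"
        print(f"{task.name}: finished at t={t}, deadline {task.deadline} [{status}]")

edf_schedule([HwTask(12, "fft", 4), HwTask(6, "filter", 3), HwTask(20, "knn", 7)])
```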
380

A comparative analysis of the performance and deployment overhead of parallelized Finite Difference Time Domain (FDTD) algorithms on a selection of high performance multiprocessor computing systems

Ilgner, Robert Georg 12 1900 (has links)
Thesis (PhD)--Stellenbosch University, 2013. / ENGLISH ABSTRACT: The parallel FDTD method, as used in computational electromagnetics, is implemented on a variety of different high performance computing platforms. These parallel FDTD implementations have regularly been compared in terms of performance or purchase cost, but very little systematic consideration has been given to the effort required to create the parallel FDTD for a specific computing architecture. The deployment effort for these platforms has changed dramatically with time: the time needed to create an FDTD implementation in 1980 ranged over months, whereas in the contemporary scenario parallel FDTD methods can be implemented on a supercomputer in a matter of hours. This thesis compares the effort required to deploy the parallel FDTD on selected computing platforms in terms of the constituents that make up the deployment effort, such as coding complexity and coding time. It uses the deployment and performance of the serial FDTD method on a single personal computer as a benchmark and examines deployments of the parallel FDTD using different parallelisation techniques. These FDTD deployments are then analysed and compared against one another in order to determine the common characteristics of the FDTD implementations on various computing platforms with differing parallelisation techniques. Although subjective in some instances, these characteristics are quantified and compared in tabular form, using the research information created by the parallel FDTD implementations. The deployment effort is of interest to scientists and engineers considering the creation or purchase of an FDTD-like solution on a high performance computing platform. Although the FDTD method has in the past been considered a brute-force approach to solving computational electromagnetic problems, this was very probably a consequence of the relatively weak computing platforms, which took very long periods to process small model sizes. This thesis describes the current implementations of the parallel FDTD method, made up of a combination of several techniques. These techniques can easily be deployed in a relatively short time frame on computing architectures ranging from IBM's BlueGene/P to the amalgamation of multicore processor and graphics processing unit, known as an accelerated processing unit. / AFRIKAANS SUMMARY: The parallel Finite Difference Time Domain (FDTD) method is used in numerical electromagnetics and can be implemented on a variety of high performance computers. These parallel FDTD implementations are regularly compared in terms of performance or purchase cost, but very little systematic consideration is given to the effort required to create the parallel FDTD for a specific computer architecture. Over time, the effort to deploy these platforms has changed dramatically: in the 1980s deployment typically took months, whereas today it can be done within a matter of hours. This thesis compares the effort required to deploy the parallel FDTD on selected computing platforms by considering factors such as coding complexity and the time it takes to implement a code. The performance of the serial FDTD method, implemented on a single personal computer, is used as a benchmark to evaluate the deployment of the parallel FDTD with various parallelisation techniques.
By analysing and comparing these FDTD deployments with different parallelisation techniques, the common characteristics are determined for various computer platforms. Although some cases are subjective, these characteristics are quantified and compared in tabular form using the research information generated by the parallel FDTD implementations. The deployment effort is important for scientists and engineers who must decide between developing or purchasing an FDTD-type solution on a high performance computer. Although the FDTD method was regarded in the past as a brute-force approach to solving electromagnetic problems, this was probably due to the relatively weak computing platforms, which took a long time to process small models. This thesis describes the modern implementation of the parallel FDTD method, consisting of a combination of several techniques. These techniques can easily be deployed in a relatively short time frame on computer architectures ranging from IBM's BlueGene/P to the amalgamation of multicore processors and graphics processing units, better known as an accelerated processing unit.
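For orientation, the serial kernel that such parallelisation efforts start from is small. Below is a minimal 1-D Yee-scheme sketch in normalised units with toy parameters (all assumptions ours, not taken from the thesis); parallel versions typically split the grid across processes and exchange halo cells each time step.

```python
# Minimal serial 1-D FDTD kernel (Yee scheme), normalised units, Courant number 1,
# hard Gaussian source, simple reflecting boundaries. Toy parameters throughout.
import numpy as np

def fdtd_1d(steps=300, nx=200, src=100):
    ez = np.zeros(nx)  # electric field
    hy = np.zeros(nx)  # magnetic field
    for n in range(steps):
        hy[:-1] += ez[1:] - ez[:-1]                     # magnetic-field half-step
        ez[1:] += hy[1:] - hy[:-1]                      # electric-field half-step
        ez[src] += np.exp(-0.5 * ((n - 30) / 10) ** 2)  # Gaussian pulse source
    return ez

# Print a window of the field around the source after the run.
print(fdtd_1d()[90:110].round(3))
```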
