71
Enabling Development of OpenCL Applications on FPGA platforms. Shagrithaya, Kavya Subraya, 17 September 2012.
FPGAs can potentially deliver tremendous acceleration in high-performance server and embedded computing applications. Whether used to augment a processor or as stand-alone devices, these reconfigurable architectures are being deployed in a large number of implementations owing to the massive amounts of parallelism they offer. At the same time, a significant obstacle to their widespread acceptance is the laborious effort required to program these devices. The increased development time, the level of experience needed from developers, fewer design turns per day, and the difficulty of iterating quickly over designs all affect time-to-market for many solutions. High-level synthesis aims to increase the productivity of FPGAs and bring them within the reach of software developers and domain experts. OpenCL is a specification for parallel programming across heterogeneous platforms. Applications written in OpenCL consist of two parts: a host program for initialization and management, and kernels that define the compute-intensive tasks. In this thesis, a compilation flow that generates customized, application-specific hardware descriptions from OpenCL computation kernels is presented. The flow uses the Xilinx AutoESL tool to obtain the design specification for the compute cores, and a provided architecture integrates these cores with memory and host interfaces. The host program of the application is compiled and executed to demonstrate a proof-of-concept implementation towards an end-to-end flow that abstracts the hardware at the front end. / Master of Science
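To make the host/kernel split concrete, the sketch below shows a minimal OpenCL C kernel of the kind such a compilation flow would turn into application-specific compute cores; the kernel name and arguments are illustrative and not taken from the thesis. The host program would build this source and enqueue it over an index space, with one work-item per output element.

```c
/* Minimal OpenCL C kernel (illustrative, not from the thesis): each
 * work-item computes one element of an element-wise vector addition.
 * A flow such as the one described would map this data-parallel kernel
 * onto application-specific compute cores on the FPGA. */
__kernel void vector_add(__global const float *a,
                         __global const float *b,
                         __global float *c,
                         const unsigned int n)
{
    size_t i = get_global_id(0);   /* index of this work-item in the NDRange */
    if (i < n)                     /* guard against padding of the global size */
        c[i] = a[i] + b[i];
}
```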
72
Optimizations for Deep Learning-Based CT Image Enhancement. Chaturvedi, Ayush, 04 March 2024.
Computed tomography (CT) combined with deep learning (DL) has recently shown great potential in biomedical imaging. Complex DL models with varying architectures inspired by the human brain are improving imaging software and aiding diagnosis. However, the accuracy of these DL models heavily relies on the datasets used for training, which often contain low-quality CT images from low-dose CT (LDCT) scans. Moreover, in contrast to the neural architecture of the human brain, today's DL models are dense and complex, resulting in a significant computational footprint. Therefore, in this work, we propose sparse optimizations to minimize the complexity of the DL models and leverage architecture-aware optimizations to reduce their total training time. To that end, we use a DL model called DenseNet and Deconvolution Network (DDNet). The model transforms LDCT chest images into high-quality (HQ) ones but requires many hours to train. To further improve the quality of the final HQ images, we first modify DDNet's architecture with a more robust multi-level VGG (ML-VGG) loss function to achieve state-of-the-art CT image enhancement. However, improving the loss function increases the computational cost. Hence, we introduce sparse optimizations to reduce the complexity of the improved DL model and then propose architecture-aware optimizations to efficiently utilize the underlying computing hardware and reduce the overall training time. Finally, we evaluate our techniques for performance and accuracy using state-of-the-art hardware resources. / Master of Science / Deep learning-based (DL) techniques that leverage computed tomography (CT) are becoming omnipresent in diagnosing diseases and abnormalities associated with different parts of the human body. However, their diagnostic accuracy is directly proportional to the quality of the CT images used to train the DL models, which is largely governed by the X-ray radiation dose of the CT scanner. DL-based techniques have shown promising improvements in the quality of low-dose CT (LDCT) images, but they require substantial computational resources and time to train the models. Therefore, in this work, we incorporate algorithmic techniques inspired by the sparse neural architecture of the human brain to reduce the complexity of such DL models. To that end, we leverage a DL model called DenseNet and Deconvolution Network (DDNet) that enhances CT images generated with a low X-ray dose into high-quality CT images. However, due to its architecture, it takes hours to train DDNet on state-of-the-art hardware resources.
Hence, in this work, we propose techniques that efficiently utilize the hardware resources and reduce the time required to train DDNet. We evaluate the efficacy of our techniques on modern supercomputers in terms of speed and accuracy.
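As a rough illustration of the kind of sparse optimization mentioned above, the C sketch below applies simple magnitude-based pruning to a weight array, zeroing values below a threshold; the thesis's actual sparsification scheme for DDNet is not described here, so this should be read as a generic example rather than the proposed method.

```c
#include <math.h>
#include <stddef.h>

/* Generic magnitude-based pruning sketch (not the thesis's exact scheme):
 * weights whose absolute value falls below `threshold` are set to zero,
 * and the achieved sparsity ratio is returned. Sparse weights reduce the
 * effective model complexity and enable sparse compute kernels. */
double prune_by_magnitude(float *weights, size_t n, float threshold)
{
    size_t zeroed = 0;
    for (size_t i = 0; i < n; ++i) {
        if (fabsf(weights[i]) < threshold) {
            weights[i] = 0.0f;
            ++zeroed;
        }
    }
    return n ? (double)zeroed / (double)n : 0.0;  /* fraction of pruned weights */
}
```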
73
Implementierung des Genom-Alignments auf modernen hochparallelen Plattformen / Implementing Genome Alignment Algorithms on Highly Parallel Platforms. Knodel, Oliver, 26 March 2014.
Owing to the growing importance of DNA sequencing, sequencing devices have been further developed and their throughput increased to the point that they deliver millions of short nucleotide sequences within a few days. However, modern algorithms and programs that can process the resulting large volumes of data in acceptable time find only a fraction of the positions of these sequences in known databases. Such a search is one of the most important tasks in modern molecular biology. This thesis investigates how modern genome alignment programs can be ported to highly parallel platforms such as FPGA and GPU.
The programs and algorithms currently adapted to this problem are examined and analyzed with respect to their parallelizability on the two platforms FPGA and GPU. After an evaluation of the alternatives, one algorithm is selected, and its port to both platforms is designed and implemented. The focus lies on search speed, the number of positions found, and usability.
The reduced Smith & Waterman algorithm implemented on the GPU is efficiently adapted to the problem and, for short sequences, reaches higher speeds than previous GPU implementations. A comparable implementation on the FPGA requires a considerably shorter runtime, likewise finds every position in the database, and achieves speeds similar to those of modern high-performance programs that, however, work heuristically. The number of positions found on FPGA and GPU is thus more than twice as high as with all comparable programs. / Further developments of DNA sequencing devices produce millions of short nucleotide sequences. Finding the positions of these sequences in databases of known sequences is an important problem in modern molecular biology. Current heuristic algorithms and programs only find a small fraction of these positions. In this thesis, genome alignment algorithms are implemented on massively parallel platforms such as FPGA and GPU.
The next-generation sequencing technologies currently in use are reviewed with regard to their possible parallelization on FPGA and GPU. After this evaluation, one algorithm is chosen for parallelization, and its implementation on both platforms is designed and realized. Runtime, accuracy, and usability are important features of the implementation.
The reduced Smith & Waterman algorithm realized on the GPU outperforms similar GPU programs in speed and efficiency for short sequences. The runtime of the FPGA approach is similar to that of widely used heuristic software mappers and much lower than on the GPU. Furthermore, the FPGA is guaranteed to find all alignment positions of a sequence in the database, more than twice the number found by comparable software tools.
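For reference, the dynamic-programming recurrence at the heart of the Smith & Waterman algorithm that both implementations build on can be sketched in plain C as follows; the scoring parameters (match +2, mismatch -1, linear gap -1) are illustrative, and the GPU and FPGA versions parallelize this recurrence, for example along the anti-diagonals of the score matrix.

```c
#include <stdlib.h>
#include <string.h>

/* Plain Smith & Waterman local alignment score with a linear gap penalty
 * (illustrative scoring: match +2, mismatch -1, gap -1). Only two rows of
 * the score matrix are kept in memory. */
static int max4(int a, int b, int c, int d)
{
    int m = a > b ? a : b;
    m = m > c ? m : c;
    return m > d ? m : d;
}

int smith_waterman_score(const char *query, const char *db)
{
    size_t qlen = strlen(query), dlen = strlen(db);
    int *prev = calloc(dlen + 1, sizeof(int));
    int *curr = calloc(dlen + 1, sizeof(int));
    int best = 0;

    if (!prev || !curr) { free(prev); free(curr); return -1; }

    for (size_t i = 1; i <= qlen; ++i) {
        for (size_t j = 1; j <= dlen; ++j) {
            int sub = (query[i - 1] == db[j - 1]) ? 2 : -1;
            curr[j] = max4(0,                      /* local alignment floor   */
                           prev[j - 1] + sub,      /* match / mismatch        */
                           prev[j] - 1,            /* gap in the database     */
                           curr[j - 1] - 1);       /* gap in the query        */
            if (curr[j] > best)
                best = curr[j];
        }
        int *tmp = prev; prev = curr; curr = tmp;  /* reuse the two rows */
        memset(curr, 0, (dlen + 1) * sizeof(int));
    }
    free(prev);
    free(curr);
    return best;
}
```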
74
Techniques de simulation rapide quasi cycle-précise pour l'exploration d'architectures multicoeur / Fast Cycle-approximate Simulation Techniques for Manycore Architecture Exploration. Butko, Anastasiia, 11 December 2015.
High-performance computing is a major driving force for many scientific fields. The peak performance of supercomputers has grown from teraflops to petaflops within a decade. However, the extremely high associated energy consumption and cost have motivated research into more energy-efficient technologies, such as the use of processors from the low-power embedded systems domain. Emerging manycore systems are forecast to feature hundreds of cores by the end of the decade, an evolution that requires efficient solutions for design space exploration and debugging. The industrial and academic simulators available today differ in their trade-offs between simulation speed and accuracy, and their adoption is generally determined by the desired level of exploration. Cycle-approximate simulators are popular and attractive for architectural exploration. While their simulation speed is easily observed, their level of accuracy often remains unclear. Moreover, although they enable flexible and detailed architecture evaluation, cycle-approximate simulators suffer from slow simulation speeds, which limits their applicability to systems with hundreds of cores. This calls for alternative approaches able to provide fast simulation while preserving the high accuracy that is crucial for architectural exploration. In this thesis, models of complex multicore architectures were developed and evaluated using cycle-approximate simulation systems for performance and power exploration. On this basis, a hybrid trace-oriented approach was proposed to enable fast, flexible, and accurate exploration of large-scale manycore architectures. Using the proposed simulation environment, several manycore system configurations were built and evaluated with respect to performance scalability. Finally, alternative heterogeneous manycore configurations were proposed and showed significant improvements in energy efficiency. / Since computational needs grow precipitously each year, HPC technology has become a driving force for numerous scientific and consumer areas. The most powerful supercomputers have progressed from TFLOPS to PFLOPS over the last ten years. However, the extremely high power consumption, and therefore the high cost, has pushed researchers to explore more energy-efficient technologies, such as the use of low-power embedded SoCs. The evolution of emerging manycore systems, forecast to feature hundreds of cores by the end of the decade, calls for efficient solutions for design space exploration and debugging. Available industrial and academic simulators differ in their simulation speed/accuracy trade-offs. Cycle-approximate simulators are popular and attractive for architectural exploration. Even though they enable flexible and detailed architecture evaluation, cycle-approximate simulators entail slow simulation speeds, thereby limiting their applicability to systems with hundreds of cores.
This calls for alternative approaches capable of providing high simulation speed while preserving the accuracy that is crucial to architectural exploration. In this thesis, we evaluate cycle-approximate simulation techniques for fast and accurate exploration of multi- and manycore architectures. Aiming to significantly reduce simulation time while preserving cycle-approximate accuracy, we propose a hybrid trace-oriented approach to enable flexible manycore architecture simulation. We design a set of simulation techniques to overcome the main weaknesses of the trace-oriented approach. The trace synchronization technique manages control and data dependencies arising from the abstraction of processor cores. The trace replication technique simulates manycore architectures from a finite set of pre-collected traces. The computation phase scaling technique enables flexible switching between multiple processor models without modeling microarchitectural differences, instead taking the computation speed ratio into account. Based on the proposed simulation environment, we explore several manycore architectures in terms of performance and energy-efficiency trade-offs.
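A very small sketch of the trace-oriented idea is given below, under the assumption that a trace is a sequence of computation and communication phases: the simulated duration of each computation phase is rescaled by a core-to-core speed ratio, which is the essence of the computation phase scaling technique; the data layout is hypothetical and not the thesis's trace format.

```c
/* Hypothetical trace record: either a computation phase measured on a
 * reference core, or a communication phase whose cost comes from a
 * network model. Rescaling computation phases by `speed_ratio` mimics
 * replaying the trace on a faster or slower target core. */
enum phase_kind { PHASE_COMPUTE, PHASE_COMMUNICATE };

struct trace_phase {
    enum phase_kind kind;
    double duration;          /* seconds, as recorded on the reference core */
};

double replay_trace(const struct trace_phase *trace, int n, double speed_ratio)
{
    double simulated_time = 0.0;
    for (int i = 0; i < n; ++i) {
        if (trace[i].kind == PHASE_COMPUTE)
            simulated_time += trace[i].duration / speed_ratio;  /* scaled compute */
        else
            simulated_time += trace[i].duration;  /* communication: keep model cost */
    }
    return simulated_time;
}
```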
75
Preparação e caracterização de eletrólitos sólidos poliméricos a partir dos derivados de celulose - hidroxietilcelulose e hidroxipropilcelulose / Preparation and characterization of solid polymeric electrolytes based on hydroxypropylcellulose and hydroxyethylcellulose. Machado, Gilmara de Oliveira, 19 April 2004.
This thesis presents the results of preparing solid polymeric electrolytes from the cellulose derivatives hydroxypropylcellulose (HPC) and hydroxyethylcellulose (HEC), both commercial products. To reach the goals of the project, the two derivatives underwent different processes: HEC was physically modified by plasticization with glycerol, and HPC was chemically modified. The chemical transformation consisted of oxidizing hydroxyl groups of HPC into ketone groups, which were then subjected to grafting reactions with poly(propylene oxide) diamine [Jeffamine], resulting in networks formed through imine bonds. The addition of lithium perchlorate salt, in different concentrations, to the plasticized or cross-linked matrix yielded solid polymeric electrolytes, all in the form of films. These electrolytes were characterized with standard materials characterization techniques: thermal analysis (DSC, TGA), dynamic mechanical thermal analysis (DMTA), structural analysis (X-ray), spectroscopic measurements (IR, UV/Vis/NIR), elemental analysis, scanning electron microscopy (SEM) and, most importantly, ionic conductivity measurements using electrochemical impedance spectroscopy (EIS). / The present thesis reports the preparation and characterization of new types of solid polymeric electrolytes (SPE) based on cellulose derivatives, namely hydroxypropylcellulose (HPC) and hydroxyethylcellulose (HEC), both commercial products. To this end, both derivatives were subjected to modification processes: HEC was physically modified by plasticization with glycerol, and HPC was submitted to chemical reactions. The latter consisted of oxidizing HPC hydroxyl groups into ketone groups, which were then subjected to grafting with poly(propylene oxide) diamine (Jeffamine), resulting in the formation of an imine-bonded network. Different concentrations of lithium salt were added to the plasticized and grafted samples, resulting in solid polymeric electrolytes, all in film form. The samples were characterized by thermal analysis (DSC, TGA and DMTA), X-ray diffraction (XRD), scanning electron microscopy (SEM), ultraviolet/visible/near-infrared spectroscopy (UV/Vis/NIR) and, most importantly, measurements of ionic conductivity using electrochemical impedance spectroscopy (EIS).
76
Workload Traces Analysis and Replay in Large Scale Distributed Systems / Analyse de rejeu de traces de charge dans les grands systèmes de calcul distribués. Emeras, Joseph, 01 October 2013.
The author did not provide an abstract in French. / High Performance Computing is preparing the transition from petascale to exascale. Distributed computing systems already face new scalability problems due to the increasing number of computing resources to manage. It is now necessary to study these systems in depth and to understand their behaviors, strengths, and weaknesses in order to better build the next generation. The complexity of managing user applications on these resources has led to the analysis of the workload the platform has to support, so as to provide users an efficient service. The need for workload comprehension has led to the collection of traces from production systems and to the proposal of a standard workload format. These contributions enabled the study of many such traces and led to the construction of several models based on the statistical analysis of the different workloads in the collection. Until recently, existing workload traces did not enable researchers to study the resource consumption of jobs over time. This is now changing with the need to characterize job consumption patterns. In the first part of this thesis we present a study of existing workload traces. We then contribute an observation of cluster workloads that considers the resource consumption of jobs over time, which highlights specific and unexpected patterns in how users consume resources. Finally, we propose an extension of the former standard workload format that makes it possible to add such temporal consumption without losing the benefit of existing work. Experimental approaches based on workload models have also served the goal of evaluating distributed systems. Existing models describe the average behavior of the observed systems. However, although the study of average behaviors is essential for understanding distributed systems, the study of critical cases and particular scenarios is also necessary; it would give a more complete view and understanding of the performance of resource and job management. In the second part of this thesis we propose an experimental method for the performance evaluation of distributed systems based on the replay of extracts of production workload traces. These extracts, replayed in their original context, make it possible to experiment with system configuration changes under a live workload and to observe the results of the different configurations. Our technical contribution to this experimental approach is twofold: a first tool constructs the environment in which the experiment takes place, and a second set of tools automates the experiment setup and replays the trace extract within its original context. Taken together, these contributions provide a better knowledge of HPC platforms. As future work, the approach proposed in this thesis will serve as a basis for studying larger infrastructures.
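For illustration, the C sketch below reads jobs from a trace in the Standard Workload Format used by the Parallel Workloads Archive, where each non-comment line describes one job as whitespace-separated numbers; only the first five fields (job id, submit time, wait time, run time, allocated processors) are parsed, which is an assumption to verify against the format description, and the temporal-consumption extension proposed in the thesis is not reflected here.

```c
#include <stdio.h>

/* Accumulates the core-hours consumed by jobs in a Standard Workload Format
 * (SWF) trace. Only the first five fields are parsed (job id, submit time,
 * wait time, run time, allocated processors); comment lines, which start
 * with ';', are skipped. Adjust the field handling if your trace deviates
 * from this layout. */
int main(int argc, char **argv)
{
    if (argc != 2) { fprintf(stderr, "usage: %s trace.swf\n", argv[0]); return 1; }
    FILE *f = fopen(argv[1], "r");
    if (!f) { perror("fopen"); return 1; }

    char line[1024];
    double core_seconds = 0.0;
    long jobs = 0;

    while (fgets(line, sizeof line, f)) {
        long id, submit, wait, procs;
        double runtime;
        if (line[0] == ';')                       /* SWF header / comment line */
            continue;
        if (sscanf(line, "%ld %ld %ld %lf %ld",
                   &id, &submit, &wait, &runtime, &procs) == 5 &&
            runtime > 0 && procs > 0) {
            core_seconds += runtime * (double)procs;
            ++jobs;
        }
    }
    fclose(f);
    printf("%ld jobs, %.1f core-hours consumed\n", jobs, core_seconds / 3600.0);
    return 0;
}
```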
77
A Reproducible Research Methodology for Designing and Conducting Faithful Simulations of Dynamic HPC Applications / Méthodologie de recherche reproductible adaptée à la conception et à la conduite de simulations d'applications scientifique multitâche dynamiques. Stanisic, Luka, 30 October 2015.
The evolution of high-performance computing has changed direction over the last decade. The substantial energy consumption of modern platforms severely limits the miniaturization and the increase of processor frequencies. This energy constraint has pushed hardware manufacturers to develop numerous alternative architectures in order to meet the growing need for performance imposed by the scientific community. However, programming efficiently across such a diversity of platforms and exploiting all the resources they offer is very difficult. The traditional way of designing high-performance applications, based on a large monolithic code offering many optimization opportunities, has thus become harder and harder to apply because implementing and maintaining such complex codes is very difficult. Consequently, the developers of such applications now consider more modular approaches and dynamic execution. A popular approach is to implement these applications at a higher level, independently of the hardware architecture, as a graph of tasks in which each task corresponds to a computation kernel carefully optimized for each architecture. A runtime system can then be used to schedule these tasks dynamically on the computing resources. Developing such solutions and ensuring their good performance on a wide range of configurations remains a major challenge. Because of the high complexity of the hardware, the variability of computation times, and the dynamic scheduling of the tasks, application executions are non-deterministic and the performance evaluation of these systems is very difficult. Consequently, there is a need for systematic and reproducible methods for conducting such research, as well as for reliable performance evaluation techniques for studying these complex systems. In this thesis, we show that it is possible to carry out a clean, coherent, and reproducible simulation-based study of dynamic applications. We propose a unique workflow based on two well-known tools, Git and Org-mode, for conducting reproducible experimental research. This simple method allows a pragmatic resolution of problems such as provenance tracking and replication of data analysis. Our contribution to the performance evaluation of dynamic applications consists in the design and validation of a coarse-grain hybrid simulation/emulation of StarPU, a dynamic task-based runtime for hybrid architectures, on top of SimGrid, a versatile simulator for distributed systems. We show how our solution achieves faithful performance predictions of real executions on a wide range of heterogeneous machines and for two different classes of programs, dense and sparse linear algebra applications, which are representative of real scientific applications. / The evolution of High-Performance Computing systems has taken a sharp turn in the last decade. Due to the enormous energy consumption of modern platforms, miniaturization and frequency scaling of processors have reached a limit.
The energy constraint has forced hardware manufacturers to develop alternative computer architecture solutions in order to answer the ever-growing need for performance imposed by scientists and society. However, efficiently programming such a diversity of platforms and fully exploiting the potential of the numerous different resources they offer is extremely challenging. The previously dominant trend for designing high-performance applications, which was based on large monolithic codes offering many optimization opportunities, has thus become more and more difficult to apply since implementing and maintaining such complex codes is very difficult. Therefore, application developers increasingly consider modular approaches and dynamic application executions. A popular approach is to implement the application at a high level, independently of the hardware architecture, as Directed Acyclic Graphs of tasks, each task corresponding to carefully optimized computation kernels for each architecture. A runtime system can then be used to dynamically schedule those tasks on the different computing resources. Developing such solutions and ensuring their good performance on a wide range of setups is however very challenging. Due to the high complexity of the hardware, the duration variability of the operations performed on a machine, and the dynamic scheduling of the tasks, the application executions are non-deterministic and the performance evaluation of such systems is extremely difficult. Therefore, there is a definite need for systematic and reproducible methods for conducting such research as well as reliable performance evaluation techniques for studying these complex systems. In this thesis, we show that it is possible to perform a clean, coherent, reproducible study, using simulation, of dynamic HPC applications. We propose a unique workflow based on two well-known and widely used tools, Git and Org-mode, for conducting reproducible experimental research. This simple workflow allows for pragmatically addressing issues such as provenance tracking and data analysis replication. Our contribution to the performance evaluation of dynamic HPC applications consists in the design and validation of a coarse-grain hybrid simulation/emulation of StarPU, a dynamic task-based runtime for hybrid architectures, over SimGrid, a versatile simulator for distributed systems. We present how this tool can achieve faithful performance predictions of native executions on a wide range of heterogeneous machines and for two different classes of programs, dense and sparse linear algebra applications, which are good representatives of real scientific applications.
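The task-graph execution model mentioned above can be pictured with the following C sketch, in which each task carries a count of unresolved dependencies and becomes ready once that count reaches zero; this is a generic illustration of the principle and not StarPU's actual API.

```c
#include <stdio.h>

/* Generic DAG-of-tasks sketch (not the StarPU API): each task stores how many
 * of its predecessors are still pending and which tasks it unlocks. A runtime
 * would dispatch ready tasks to CPU or GPU workers; here they are simply
 * executed in dependency order. */
#define MAX_SUCC 4

struct task {
    const char *name;
    int pending;                    /* unresolved predecessors           */
    int nsucc;
    struct task *succ[MAX_SUCC];    /* tasks unlocked when this completes */
};

static void complete(struct task *t, struct task **ready, int *nready)
{
    for (int i = 0; i < t->nsucc; ++i)
        if (--t->succ[i]->pending == 0)
            ready[(*nready)++] = t->succ[i];   /* successor becomes ready */
}

int main(void)
{
    /* Tiny diamond DAG: a -> {b, c} -> d */
    struct task d = { "d", 2, 0, {0} };
    struct task b = { "b", 1, 1, { &d } };
    struct task c = { "c", 1, 1, { &d } };
    struct task a = { "a", 0, 2, { &b, &c } };

    struct task *ready[8] = { &a };
    int nready = 1;

    while (nready > 0) {
        struct task *t = ready[--nready];      /* a real runtime schedules here */
        printf("executing task %s\n", t->name);
        complete(t, ready, &nready);
    }
    return 0;
}
```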
78
On Scalable Reconfigurable Component Models for High-Performance Computing / Modèles à composants reconfigurables et passant à l'échelle pour le calcul haute performance. Lanore, Vincent, 10 December 2015.
Component-based programming is a programming paradigm that eases code reuse and the separation of concerns. So-called "reconfigurable" component models allow the structure of an application to be modified during execution. However, these models are not suited to high-performance computing (HPC) because they rely on mechanisms that do not scale. The goal of this thesis is to provide models, algorithms, and tools that ease the development of reconfigurable component-based HPC applications. The main contribution of the thesis is the formal component model DirectMOD, which eases the writing and reuse of distributed transformation code. To make this first model easier to use, we have also proposed: • the formal model SpecMOD, which allows automatic specialization of component assemblies in order to provide high-level software engineering features; • efficient fine-grain reconfiguration mechanisms for AMR applications, an important application class in HPC. An implementation of DirectMOD, called DirectL2C, was developed and used to implement a series of AMR-based benchmarks to evaluate our approach. Experiments on compute clusters and a supercomputer show that our approach scales. Moreover, a quantitative analysis of the resulting code shows that our approach is compact and eases reuse. / Component-based programming is a programming paradigm which eases code reuse and separation of concerns. Some component models, which are said to be "reconfigurable", allow the modification at runtime of an application's structure. However, these models are not suited to High-Performance Computing (HPC) as they rely on non-scalable mechanisms. The goal of this thesis is to provide models, algorithms and tools to ease the development of component-based reconfigurable HPC applications. The main contribution of the thesis is the DirectMOD component model which eases development and reuse of distributed transformations. In order to improve on this core model in other directions, we have also proposed: • the SpecMOD formal component model which allows automatic specialization of hierarchical component assemblies and provides high-level software engineering features; • mechanisms for efficient fine-grain reconfiguration for AMR applications, an important application class in HPC. An implementation of DirectMOD, called DirectL2C, has been developed so as to implement a series of benchmarks to evaluate our approach. Experiments on HPC architectures show that our approach scales. Moreover, a quantitative analysis of the benchmarks' code shows that our approach is compact and eases reuse.
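To give an intuition of what runtime reconfiguration of a component assembly means, the C sketch below wires components together through function-pointer ports that can be rebound while the application runs; it is a deliberately simplified illustration and unrelated to the actual DirectMOD or DirectL2C interfaces.

```c
#include <stdio.h>

/* Simplified component sketch (not the DirectMOD/DirectL2C API): a component
 * exposes a "use" port as a function pointer, and reconfiguration amounts to
 * rebinding that pointer to another provider while the application runs. */
typedef double (*compute_port)(double);

struct component {
    const char *name;
    compute_port port;     /* currently bound provider of the compute service */
};

static double cpu_kernel(double x)   { return x * x; }        /* provider A */
static double accel_kernel(double x) { return x * x + 1.0; }  /* provider B (stand-in) */

static void reconfigure(struct component *c, compute_port new_provider)
{
    c->port = new_provider;   /* a real model would also check compatibility */
}

int main(void)
{
    struct component solver = { "solver", cpu_kernel };
    printf("%s -> %.1f\n", solver.name, solver.port(3.0));

    reconfigure(&solver, accel_kernel);    /* runtime reconfiguration */
    printf("%s -> %.1f\n", solver.name, solver.port(3.0));
    return 0;
}
```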
80
A Unified Infrastructure for Monitoring and Tuning the Energy Efficiency of HPC Applications. Schöne, Robert, 07 November 2017.
High Performance Computing (HPC) has become an indispensable tool for the scientific community to perform simulations on models whose complexity would exceed the limits of a standard computer. An unfortunate trend concerning HPC systems is that their power consumption under highly demanding workloads increases. To counter this trend, hardware vendors have implemented power saving mechanisms in recent years, which has increased the variability in the power demands of single nodes. These capabilities provide an opportunity to increase the energy efficiency of HPC applications. To utilize these hardware power saving mechanisms efficiently, their overhead must be analyzed. Furthermore, applications have to be examined for performance and energy efficiency issues, which can give hints for optimizations. This requires an infrastructure that is able to capture both performance and power consumption information concurrently. The mechanisms that such an infrastructure would inherently support could further be used to implement a tool that is able to do both: measuring and tuning energy efficiency.
This thesis targets all steps in this process by making the following contributions: First, I provide a broad overview of the related fields, listing common performance measurement tools, power measurement infrastructures, hardware power saving capabilities, and tuning tools. Second, I lay out a model that can be used to define and describe energy efficiency tuning at program-region scale. This model includes hardware- and software-dependent parameters. Hardware parameters include the runtime overhead and delay for switching power saving mechanisms as well as a consideration of their scopes and their possible influence on application performance. Thus, in a third step, I present methods to evaluate common power saving mechanisms and list findings for different x86 processors. Software parameters include the performance and power consumption characteristics of program regions as well as the influence of power-saving mechanisms on these. To capture software parameters, an infrastructure for measuring performance and power consumption is necessary. With minor additions, the same infrastructure can later be used to tune software and hardware parameters. Thus, I lay out the structure for such an infrastructure and describe common components that are required for measuring and tuning. Based on that, I implement adequate interfaces that extend the functionality of contemporary performance measurement tools. Furthermore, I use these interfaces to combine performance and power measurements and to further process the gathered information for tuning. I conclude this work by demonstrating that the infrastructure can be used to manipulate power-saving mechanisms of contemporary x86 processors and increase the energy efficiency of HPC applications.
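As a small example of the combined measurement such an infrastructure relies on, the C sketch below reads the package energy counter that recent Linux kernels expose through the powercap (RAPL) sysfs interface around a region of interest; the sysfs path is an assumption that holds on many Intel x86 systems but should be verified, reading it may require elevated permissions, and counter wrap-around is ignored for brevity.

```c
#include <stdio.h>
#include <time.h>

/* Reads the package energy counter (in microjoules) from the Linux powercap
 * interface. The path below is typical for Intel RAPL but may differ per
 * system; counter overflow is ignored to keep the sketch short. */
static long long read_energy_uj(void)
{
    FILE *f = fopen("/sys/class/powercap/intel-rapl:0/energy_uj", "r");
    long long uj = -1;
    if (f) {
        if (fscanf(f, "%lld", &uj) != 1)
            uj = -1;
        fclose(f);
    }
    return uj;
}

int main(void)
{
    struct timespec t0, t1;
    long long e0 = read_energy_uj();
    clock_gettime(CLOCK_MONOTONIC, &t0);

    volatile double x = 0.0;                 /* region of interest (placeholder) */
    for (long i = 0; i < 100000000L; ++i)
        x += (double)i * 1e-9;

    clock_gettime(CLOCK_MONOTONIC, &t1);
    long long e1 = read_energy_uj();

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
    if (e0 >= 0 && e1 >= e0)
        printf("region: %.3f s, %.3f J, avg %.1f W\n",
               secs, (e1 - e0) * 1e-6, (e1 - e0) * 1e-6 / secs);
    else
        printf("region: %.3f s (energy counter unavailable or wrapped)\n", secs);
    return 0;
}
```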