441 |
An Object Oriented and High Performance Platform for Aerothermodynamics Simulation
Lani, Andrea 04 December 2008
This thesis presents the author's contribution to the design and implementation of COOLFluiD, an object oriented software platform for the high performance simulation of multi-physics phenomena on unstructured grids. In this context, the final goal has been to provide a reliable tool for handling high speed aerothermodynamic applications. To this end, we introduce a number of design techniques that have been developed in order to provide the framework with flexibility and reusability, allowing developers to easily integrate new functionalities such as arbitrary mesh-based data structures, numerical algorithms (space discretizations, time stepping schemes, linear system solvers, ...), and physical models.
Furthermore, we describe the parallel algorithms that we have implemented in order to efficiently read/write generic computational meshes involving millions of degrees of freedom and partition them in a scalable way: benchmarks on HPC clusters with up to 512 processors show their suitability for large scale computing.
Several systems of partial differential equations, characterizing flows in conditions of thermal and chemical equilibrium (with fixed and variable elemental fractions) and, particularly, nonequilibrium (multi-temperature models), have been integrated in the framework.
In order to simulate such flows, we have developed two state-of-the-art flow solvers:
1- a parallel implicit 2D/3D steady and unsteady cell-centered Finite Volume (FV) solver for arbitrary systems of PDEs on hybrid unstructured meshes;
2- a parallel implicit 2D/3D steady vertex-centered Residual Distribution (RD) solver for arbitrary systems of PDEs on meshes with simplex elements (triangles and tetrahedra).
The FV code has been extended to handle all the available physical models, in regimes ranging from incompressible to hypersonic.
As far as the RD code is concerned, the strictly conservative variant of the RD method, denominated CRD, has been applied for the first time in the literature to solve high speed viscous flows in thermochemical nonequilibrium, yielding outstanding preliminary results on a challenging double cone flow simulation.
All the developments have been validated on real-life testcases of current interest in the aerospace community. A quantitative comparison with experimental measurements and/or literature has been performed whenever possible.
|
442 |
Betydelsen av prestationsbaserad självkänsla för utbränning bland prestationssträvande högpresterare [The significance of performance-based self-esteem for burnout among achievement-striving high performers]
Andersson, Åsa January 2008
Burnout is a highly topical subject in today's society, where the focus is on the individual's performance. The purpose of this study was to examine whether high-performing individuals base their self-esteem on performance and whether this in turn matters for burnout. A further purpose was to examine whether burnout among high performers with performance-based self-esteem was equally prevalent in both sexes. Sixty-six white-collar employees from a manufacturing company in central Sweden took part in the study. A questionnaire was compiled from three existing, validated scales: the Karolinska burnout questionnaire (Karolinskas utbrändhetsformulär), The Performance Based Self-esteem Scale, and part of The Jenkins Activity Survey. The results showed that performance-based self-esteem predicted being a high performer and that being a high performer predicted burnout. Being a woman also predicted burnout. This shows the importance of taking individual differences among staff into account.
|
443 |
Foundations for Automatic, Adaptable Compilation
January 2011
Computational science demands extreme performance because the running time of an application often determines the size of the experiment that a scientist can reasonably compute. Unfortunately, traditional compiler technology is ill-equipped to harness the full potential of today's computing platforms, forcing scientists to spend time manually tuning their application's performance. Although improving compiler technology should alleviate this problem, two challenges obstruct this goal: hardware platforms are rapidly changing and application software is difficult to statically model and predict. To address these problems, this thesis presents two techniques that aim to improve a compiler's adaptability: automatic resource characterization and selective, dynamic optimization. Resource characterization empirically measures a system's performance-critical characteristics, which can be provided to a parameterized compiler that specializes programs accordingly. Measuring these characteristics is important, because a system's physical characteristics do not always match its observed characteristics. Consequently, resource characterization provides an empirical performance model of a system's actual behavior, which is better suited for guiding compiler optimizations than a purely theoretical model. This thesis presents techniques for determining a system's data cache and TLB capacity, line size, and associativity, as well as instruction-cache capacity. Even with a perfect architectural-model, compilers will still often generate suboptimal code because of the difficulty in statically analyzing and predicting a program's behavior. This thesis presents two techniques that enable selective, dynamic-optimization for cases in which static compilation fails to deliver adequate performance. First, intermediate-representation (IR) annotation generates a fully-optimized native binary tagged with a higher-level compiler representation of itself. 
The native binary benefits from static optimization and code generation, but the IR annotation allows targeted and aggressive dynamic-optimization. Second, adaptive code-selection allows a program to empirically tune its performance throughout execution by automatically identifying and favoring the best performing variant of a routine. This technique can be used for dynamically choosing between different static-compilation strategies; or, it can be used with IR annotation for performing dynamic, feedback-directed optimization.
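The adaptive code-selection idea described above (measure several variants of a routine during execution, then favor the fastest) can be illustrated with a small sketch. This is not the thesis's implementation: the class name `AdaptiveSelector`, the `trials` parameter, and the injectable `clock` are all hypothetical, and the policy shown (a fixed number of timed trial calls per variant, then lowest mean time) is just one simple choice.

```python
import time
from typing import Callable, Dict, List


class AdaptiveSelector:
    """Hypothetical helper: time equivalent variants, then favor the fastest."""

    def __init__(self, variants: Dict[str, Callable], trials: int = 3,
                 clock: Callable[[], float] = time.perf_counter):
        if not variants:
            raise ValueError("at least one variant is required")
        self.variants = variants
        self.trials = trials          # timed calls per variant before committing
        self.clock = clock            # injectable so the policy can be tested
        self.samples: Dict[str, List[float]] = {name: [] for name in variants}

    def _mean(self, name: str) -> float:
        times = self.samples[name]
        return sum(times) / len(times)

    def _pick(self) -> str:
        # Exploration: a variant with too few samples is tried first.
        for name, times in self.samples.items():
            if len(times) < self.trials:
                return name
        # Exploitation: favor the variant with the lowest mean time so far.
        return min(self.samples, key=self._mean)

    def __call__(self, *args, **kwargs):
        name = self._pick()
        start = self.clock()
        result = self.variants[name](*args, **kwargs)
        self.samples[name].append(self.clock() - start)
        return result

    def best(self) -> str:
        """Name of the fastest variant among those measured at least once."""
        measured = [name for name, times in self.samples.items() if times]
        return min(measured, key=self._mean)
```

A caller would wrap the variants once, for example `sort = AdaptiveSelector({"builtin": sorted, "custom": my_sort})` where `my_sort` is a placeholder for an alternative implementation, and then call `sort(data)` as usual. Because the sketch keeps timing the chosen variant, its mean continues to update, but it never revisits the others after the trial phase; a fuller design would re-sample periodically.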
|
444 |
Parallel design optimization of multi-trailer articulated heavy vehicles with active safety systems
Islam, Md. Manjurul 01 April 2013
Multi-trailer articulated heavy vehicles (MTAHVs) exhibit unstable motion modes
at high speeds, including jack-knifing, trailer swing, and roll-over. These unstable
motion modes may lead to fatal accidents. On the other hand, these vehicle
combinations have poor maneuverability at low speeds. Of all contradictory design
criteria of MTAHVs, the trade-off relationship between the maneuverability
at low speeds and the lateral stability at high speeds is the most important and
fundamental. This trade-off relationship has not been adequately addressed. The
goal of this research is to address this trade-off relationship through the design optimization
of MTAHVs with active safety systems. A parallel design optimization
(PDO) method is developed and applied to the design of MTAHVs with integrated
active safety systems, which involve active trailer steering (ATS) control, anti-roll
(AR) control, differential braking (BD) control, and a variety of combinations of
these three control strategies. To derive model-based controllers, a single-trailer
articulated heavy vehicle (STAHV) model with 5 degrees of freedom (DOF) and an
MTAHV model with 7 DOF are generated. The vehicle models are validated with
those derived using a commercial software package, TruckSim, in order to examine
their applicability for the design optimization of MTAHVs with active safety
systems. The PDO method is implemented to perform the concurrent design of
the plant (vehicle model) and controllers. To simulate the closed-loop testing maneuvers,
a driver model is developed and it is used to drive the virtual vehicle
following the prescribed path. Case studies indicate that the PDO method is effective
for identifying desired design variables and predicting performance envelopes
in the early design stages of MTAHVs with active safety systems. / UOIT
|
445 |
Dynamic Load Balancing Schemes for Large-scale HLA-based Simulations
De Grande, Robson E. 26 July 2012
Dynamic balancing of computation and communication load is vital for the execution stability and performance of distributed, parallel simulations deployed on shared, unreliable resources of large-scale environments. High Level Architecture (HLA) based simulations can experience a decrease in performance due to imbalances that are produced initially and/or during run-time. These imbalances are generated by the dynamic load changes of distributed simulations or by unknown, non-managed background processes resulting from the non-dedication of shared resources. Due to the dynamic execution characteristics of elements that compose distributed simulation applications, the computational load and interaction dependencies of each simulation entity change during run-time. These dynamic changes lead to an irregular load and communication distribution, which increases overhead of resources and execution delays. A static partitioning of load is limited to deterministic applications and is incapable of predicting the dynamic changes caused by distributed applications or by external background processes. Due to the relevance in dynamically balancing load for distributed simulations, many balancing approaches have been proposed in order to offer a sub-optimal balancing solution, but they are limited to certain simulation aspects, specific to determined applications, or unaware of HLA-based simulation characteristics. Therefore, schemes for balancing the communication and computational load during the execution of distributed simulations are devised, adopting a hierarchical architecture. First, in order to enable the development of such balancing schemes, a migration technique is also employed to perform reliable and low-latency simulation load transfers. 
Then, a centralized balancing scheme is designed; this scheme employs local and cluster monitoring mechanisms in order to observe the distributed load changes and identify imbalances, and it uses load reallocation policies to determine a distribution of load and minimize imbalances. As a measure to overcome the drawbacks of this scheme, such as bottlenecks, overheads, global synchronization, and single point of failure, a distributed redistribution algorithm is designed. Extensions of the distributed balancing scheme are also developed to improve the detection of and the reaction to load imbalances. These extensions introduce communication delay detection, migration latency awareness, self-adaptation, and load oscillation prediction in the load redistribution algorithm. Such developed balancing systems successfully improved the use of shared resources and increased distributed simulations' performance.
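As a rough illustration of what a centralized load reallocation policy of the kind described above might compute, the sketch below takes per-node loads reported by monitoring and plans migrations from the most loaded node to the least loaded one. It is a simplified assumption, not the thesis's algorithm: the function name `plan_migrations`, the `tolerance` threshold, and the greedy choice of entity are hypothetical, and communication load, migration latency, and HLA specifics are ignored.

```python
from typing import Dict, List, Tuple


def plan_migrations(loads: Dict[str, Dict[str, float]],
                    tolerance: float = 0.1) -> List[Tuple[str, str, str]]:
    """Greedy rebalancing plan (illustrative only).

    loads maps node -> {entity: computational load}.
    Returns (entity, source_node, target_node) moves; the input is not modified.
    """
    state = {node: dict(entities) for node, entities in loads.items()}
    if not state:
        return []
    moves: List[Tuple[str, str, str]] = []
    while True:
        totals = {node: sum(ents.values()) for node, ents in state.items()}
        mean = sum(totals.values()) / len(totals)
        src = max(totals, key=totals.get)   # most loaded node
        dst = min(totals, key=totals.get)   # least loaded node
        gap = totals[src] - totals[dst]
        if mean == 0 or gap <= tolerance * mean:
            break                            # balanced enough
        # Only moves smaller than the gap reduce the imbalance; among those,
        # prefer the entity whose load is closest to half the gap.
        candidates = [(abs(load - gap / 2), name)
                      for name, load in state[src].items() if 0 < load < gap]
        if not candidates:
            break                            # no single move would help
        _, entity = min(candidates)
        state[dst][entity] = state[src].pop(entity)
        moves.append((entity, src, dst))
    return moves
```

Each accepted move strictly lowers the sum of squared node loads, so the loop terminates. A single coordinator running this loop is exactly the kind of bottleneck and single point of failure that the abstract cites as motivation for the distributed redistribution algorithm.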
|
446 |
Structural Characterization of Freshwater Dissolved Organic Matter from Arctic and Temperate Climates Using Novel Analytical Approaches
Woods, Gwen 19 March 2013
Dissolved organic matter (DOM) is comprised of a complex array of molecular constituents that are linked to many globally-relevant processes and yet this material is still largely molecularly uncharacterized. Research presented here attempted to probe the molecular complexity of this material from both Arctic and temperate climates via multifaceted and novel approaches. DOM collected from remote Arctic watersheds provided evidence to suggest that permafrost-disturbed systems contain more photochemically- and biologically-labile material than undisturbed systems. These results have large implications for predicted increasing temperatures where widespread permafrost melt would significantly impact stores of organic carbon in polar environments. In attempting to address the complexities and reactivity of DOM within global environments, more information at the molecular-level is necessary. Further research sought to unravel the molecularly uncharacterized fraction via use of nuclear magnetic resonance (NMR) spectroscopy in conjunction with hyphenated and varied analytical techniques. Directly hyphenated high performance size exclusion chromatography (HPSEC) with NMR was explored. This hyphenation was found to separate DOM into structurally distinct fractions but proved limited at reducing DOM heterogeneity. Of the many high performance liquid chromatography (HPLC) techniques tested, hydrophilic interaction chromatography (HILIC) was found the most effective at simplifying DOM. HILIC separations utilizing a sample from Florida resulted in fractions with highly resolved NMR signals and substantial reduction in heterogeneity. Further development with a 2D-HILIC/HILIC system to achieve additional fractionation was employed. This method produced fractions of DOM that were homogenous enough to produce excellent resolution and spectral dispersion, permitting 2D and 3D NMR experiments to be performed. 
Extensive NMR analyses of these fractions demonstrated strong evidence for the presence of highly oxidized sterols. All fractions, however, provided 2D NMR spectra consistent with oxidized polycyclic structures and support emerging data and hypotheses suggesting that cyclic structures, likely derived from terpenoids, are an abundant, refractory and major component of DOM. Research presented within this thesis demonstrates that HILIC and NMR are excellent co-techniques for the analysis of DOM as well as that oxidized sterols and other cyclic components with significant hydroxyl and carboxyl substituents are major constituents in DOM.
|
447 |
Determination of triterpenoids in Psidium guajava
Chen, Ying January 2012
University of Macau / Institute of Chinese Medical Sciences
|
448 |
Software caching techniques and hardware optimizations for on-chip local memories
Vujic, Nikola 05 June 2012
Despite the fact that the most viable L1 memories in processors are caches,
on-chip local memories have been a great topic of consideration lately. Local
memories are an interesting design option due to their many benefits: less
area occupancy, reduced energy consumption and fast and constant access time.
These benefits are especially interesting for the design of modern multicore processors
since power and latency are important assets in computer architecture
today. Also, local memories do not generate coherency traffic which is important
for the scalability of the multicore systems.
Unfortunately, local memories have not been well accepted in modern processors
yet, mainly due to their poor programmability. Systems with on-chip local
memories do not have hardware support for transparent data transfers between
local and global memories, and thus ease of programming is one of the main
impediments for the broad acceptance of those systems. This thesis addresses
software and hardware optimizations regarding the programmability, and the
usage of the on-chip local memories in the context of both single-core and multicore
systems.
Software optimizations are related to the software caching techniques. Software
cache is a robust approach to provide the user with a transparent view
of the memory architecture; but this software approach can suffer from poor
performance. In this thesis, we start by optimizing the traditional software cache,
proposing a hierarchical, hybrid software-cache architecture. Afterwards, we develop
a few optimizations in order to speed up our hybrid software cache as much
as possible. As a result of these software optimizations, our hybrid
software cache runs 4 to 10 times faster than a traditional software
cache on a set of NAS parallel benchmarks.
We do not stop with software caching. We cover some other aspects of the
architectures with on-chip local memories, such as the quality of the generated
code and its correspondence with the quality of the buffer management in local
memories, in order to improve performance of these architectures. We pursue
this research until we reach the limit of what software can achieve, and then propose optimizations
at the hardware level. Two hardware proposals are presented in this
thesis. One is about relaxing alignment constraints imposed in the architectures
with on-chip local memories and the other proposal is about accelerating the
management of local memories by providing hardware support for the majority
of actions performed in our software cache. / Although caches are still the basic component in the design of the memory subsystem, local memories have become an alternative owing to their characteristics in terms of area occupancy, energy consumption, and performance, with a fast and constant access time. These characteristics are of particular interest when upcoming multicore architectures are limited by power consumption and by the latency of the memory subsystem. Local memories suffer from limitations regarding the complexity of programming them, which hinders their introduction into multicore architectures despite the advantages mentioned above. This thesis presents a set of solutions based on software and hardware specifically designed to overcome these limitations. The software optimizations are based on software caching techniques supported by specific libraries. A software cache is a robust method for giving the user a transparent view of the architecture, but this approach can suffer from poor performance. In this thesis, a hierarchical, hybrid structure is proposed. We then develop optimizations to accelerate the execution of the software that supports the cache design. As a result of these optimizations, our hybrid design runs 4 to 10 times faster than a traditional software cache implementation on a set of reference applications, namely the NAS parallel benchmarks. The thesis work covers other aspects of architectures with local memories, such as the quality of the generated code and its correspondence with the quality of buffer management in local memories, in order to improve the performance of these architectures.
The thesis develops proposals based strictly on the design of new hardware to improve the performance of local memories once no further software optimizations are possible. In particular, the thesis presents two hardware proposals: one relaxes the restrictions that local memories impose on data alignment, and the other introduces specific hardware to accelerate the most common operations on local memories.
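For readers unfamiliar with the baseline being optimized, a traditional software cache can be sketched as a direct-mapped, write-back cache managed entirely in code, with every access going through a lookup routine. The sketch below is an assumption-laden illustration, not the hybrid design proposed in the thesis: the class `SoftwareCache`, its parameters, and the use of a Python list as "global memory" are hypothetical, and the list slices stand in for the explicit transfers a real local-memory system would issue.

```python
from typing import List, Optional, Tuple


class SoftwareCache:
    """Direct-mapped, write-back software cache over a backing list (illustrative)."""

    def __init__(self, backing: List[int], line_size: int = 4, num_lines: int = 8):
        self.backing = backing                      # stands in for global memory
        self.line_size = line_size
        self.num_lines = num_lines
        self.tags: List[Optional[int]] = [None] * num_lines
        self.lines: List[List[int]] = [[] for _ in range(num_lines)]
        self.dirty = [False] * num_lines
        self.hits = 0
        self.misses = 0

    def _write_back(self, slot: int) -> None:
        # Copy a modified line back to global memory before it is replaced.
        if self.dirty[slot]:
            base = self.tags[slot] * self.line_size
            self.backing[base:base + self.line_size] = self.lines[slot]
            self.dirty[slot] = False

    def _lookup(self, addr: int) -> Tuple[int, int]:
        # This check runs in software on every access, which is the overhead
        # that makes a plain software cache slow.
        line_no, offset = divmod(addr, self.line_size)
        slot = line_no % self.num_lines
        if self.tags[slot] == line_no:
            self.hits += 1
        else:
            self.misses += 1
            self._write_back(slot)
            base = line_no * self.line_size
            self.lines[slot] = self.backing[base:base + self.line_size]
            self.tags[slot] = line_no
        return slot, offset

    def read(self, addr: int) -> int:
        slot, offset = self._lookup(addr)
        return self.lines[slot][offset]

    def write(self, addr: int, value: int) -> None:
        slot, offset = self._lookup(addr)
        self.lines[slot][offset] = value
        self.dirty[slot] = True

    def flush(self) -> None:
        for slot in range(self.num_lines):
            if self.tags[slot] is not None:
                self._write_back(slot)
```

The per-access tag check in `_lookup` is the cost that a software-only scheme pays on every memory reference, including hits.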
|
449 |
DVFS power management in HPC systems
Etinski, Maja 01 June 2012
The recent increase in the performance of High Performance Computing (HPC) systems has been accompanied by
an even higher increase in power consumption. The power draw of modern supercomputers leads to very high
operating costs and reliability concerns. Furthermore, it has negative consequences on the environment.
Accordingly, over the last decade there have been many works dealing with power/energy management
in HPC systems.
Since CPUs account for a high portion of the total system power consumption, our work aims at CPU
power reduction. Dynamic Voltage Frequency Scaling (DVFS) is a widely used technique for CPU
power management. Running an application at lower frequency/voltage reduces its power
consumption. However, frequency scaling should be used carefully since it has negative effects on the
application performance.
We argue that the job scheduler level presents a good place for power management in an HPC center
having in mind that a parallel job scheduler has a global overview of the entire system. In this thesis we
propose power-aware parallel job scheduling policies where the scheduler determines the job CPU
frequency, besides the job execution order. Based on the goal, the proposed policies can be classified
into two groups: energy saving and power budgeting policies. The energy saving policies aim to reduce
CPU energy consumption with a minimal job performance penalty. The first of the energy saving
policies assigns the job frequency based on system utilization while the other makes job performance
predictions. While for less loaded workloads these policies achieve energy savings, highly loaded
workloads suffer from a substantial performance degradation because of higher job wait times due to
an increase in load caused by longer job run times. Our results show higher potential of the DVFS
technique when applied for power budgeting.
The second group of policies are policies for power constrained systems. In contrast to the systems
without a power limitation, in the case of a given power budget the DVFS technique even improves
overall job performance reducing the average job wait time. This comes from a lower job power
consumption that allows more jobs to run simultaneously. The first proposed policy from this group
assigns CPU frequency using the job predicted performance and current power draw of already running
jobs. The other power budgeting policy is based on an optimization problem whose solution determines
the job execution order, as well as power distribution among jobs selected for execution. This policy
fully exploits available power and leads to further performance improvements.
The last contribution of the thesis is an analysis of the potential of the DVFS technique for the energy-performance
trade-off in current and future HPC systems. Ongoing changes in technology decrease the
DVFS applicability for energy savings but the technique still reduces power consumption making it
useful for power constrained systems. In order to analyze DVFS potential, a model of frequency
scaling impact on MPI application execution time has been proposed and validated against
measurements on a large-scale system. This parametric analysis showed for which
application/platform characteristics frequency scaling leads to energy savings. / The increase in performance experienced by high-performance systems has been accompanied by an even larger increase in energy consumption. The consumption of today's supercomputers entails very high operating costs. These costs have not only economic implications but also implications for the environment. Given the importance of the problem, significant research efforts have recently been made to tackle the efficient management of the energy consumed by supercomputing systems.
Since the CPU accounts for a high percentage of a system's total consumption, our work focuses on the reduction and efficient management of the energy consumed by the CPU. Specifically, this thesis examines the feasibility of performing this management with Dynamic Voltage Frequency Scaling (DVFS), a technique widely used to reduce CPU energy consumption. However, this technique can reduce the performance of the running applications, since it involves lowering the frequency. Given that the context of this thesis is high-performance systems, minimizing the performance loss is one of our objectives. In our context, moreover, the performance of a job is determined by two factors, execution time and wait time, so both components must be considered.
Supercomputing systems are usually managed by queueing systems. Depending on the policy applied and the state of the system, jobs must wait more or less time before being executed. Given the characteristics of the target system of this thesis, we consider the job scheduler to be the best system component in which to include energy management, since it is the only point with a global view of the whole system.
In this thesis we propose a set of scheduling policies that treat energy consumption as one more resource. These policies decide not only which job to run, the number of CPUs assigned, and the list of CPUs (and nodes), but also the frequency at which these CPUs will run. The policies target two objectives: reducing the total energy consumed by a set of jobs, and controlling the instantaneous consumption of the running jobs to avoid saturating the system in centers that may have limited capacity (permanently or temporarily).
The first group of policies tries to reduce total consumption while minimizing the impact on performance. This group contains a first policy that assigns CPU frequency according to system utilization and a second that estimates the penalty that the job about to start will suffer in order to decide whether or not to reduce the frequency. These policies have shown acceptable results on lightly loaded systems but significant performance losses when the system is heavily loaded. The losses did not take the form of a significant increase in job execution time; they appeared in the performance metrics that include job wait time (common in this context).
The second group of policies, aimed at systems with limits on the power they can draw, has shown great potential using DVFS as a management mechanism. In this case, compared with a system without such management, the policies improved performance because they allow more jobs to run simultaneously, significantly reducing job wait time. In this second group we propose one policy based on the performance of the job about to run and a second that treats the allocation of all resources as a linear optimization problem. This last policy is the most important contribution of the thesis, since it behaves well in all the cases evaluated.
The final contribution of the thesis is a study of the potential of DVFS as an energy management technique in the near future, based on a study of application characteristics, of the reduction in CPU consumption achieved by DVFS, and of the weight of the CPU within the whole system. This study indicates that the capacity of DVFS to save energy will be limited, but that it still shows great potential for controlling energy consumption.
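The power budgeting idea (lower a job's frequency so that it fits under the remaining power budget instead of waiting) can be sketched as follows. Everything specific here is an assumption made for illustration: the `GEARS` operating points are invented, per-CPU power uses the standard CMOS proportionality P ∝ f·V² in arbitrary units, and the run-time model T(f) = T(f_max)·(β·(f_max/f − 1) + 1) is a commonly used simplification with a hypothetical sensitivity parameter β, not necessarily the model validated in the thesis.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical (frequency in GHz, voltage in V) operating points.
GEARS: Sequence[Tuple[float, float]] = (
    (2.3, 1.30), (2.0, 1.20), (1.7, 1.10), (1.4, 1.00),
)


def dynamic_power(freq: float, volt: float, activity: float = 1.0) -> float:
    """Per-CPU dynamic power in arbitrary units, using P ~ activity * f * V^2."""
    return activity * freq * volt ** 2


def scaled_runtime(t_max: float, freq: float, f_max: float, beta: float) -> float:
    """Assumed run-time model: beta=1 is fully CPU-bound, beta=0 is insensitive to f."""
    return t_max * (beta * (f_max / freq - 1.0) + 1.0)


def pick_gear(running_power: float, budget: float, cpus: int,
              gears: Sequence[Tuple[float, float]] = GEARS
              ) -> Optional[Tuple[float, float]]:
    """Highest-frequency gear at which a new job on `cpus` CPUs fits the budget.

    Returns None when even the lowest gear would exceed the budget, in which
    case the job has to wait.
    """
    for freq, volt in sorted(gears, reverse=True):
        if running_power + cpus * dynamic_power(freq, volt) <= budget:
            return (freq, volt)
    return None
```

Under these assumptions, a job that would not fit at top frequency can start immediately at a lower gear, trading a longer run time (as estimated by `scaled_runtime`) for a shorter wait, which is the effect the abstract reports for power constrained systems.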
|
450 |
Evaluation of High Performance Residential Housing Technology
Grin, Aaron January 2008
The energy consumption of residential buildings in Canada accounts for 17% of national energy use (Trudeau, 2005). Production homes represent a considerable portion of new housing. In an effort to reduce the national energy demand, the energy consumption of these homes must be addressed. Techniques, methods and materials to achieve reductions in residential energy use are readily available.
The goal of this thesis is to show that it is possible to build a low-energy home for less total carrying cost than a home built to the 2006 Ontario Building Code. To show how this is possible, a range of cost-effective and practical-to-implement upgrades are identified, and quantitative projections of cost-savings and benefits gained by the homeowner are generated.
The interest in, and demand for, greener, less energy-intensive homes is increasing. As oil prices rise, the climate changes, landfills become overburdened, and water restrictions become more frequent, the public pushes harder for change. The residential housing sector has seen increased demand for energy-efficient homes that incorporate green features, high-efficiency appliances and mechanical systems. Increased environmental concern has put ‘Green’ in demand.
This thesis reviews a variety of North American green rating systems and contrasts their energy performance requirements with those of the Ontario Building Code. The Ontario Building Code was considered the baseline. Although the R2000 program was originally developed nearly 30 years ago it has managed to maintain a standard of performance that has always exceeded the OBC. It has a wider range of requirements than either the building code or ENERGY STAR, but falls short of the LEED for homes program in terms of breadth of environmental concerns.
The literature review shows that homes that use 75% less heating energy than a standard house could be built in the 1980s for a mere 5% construction cost premium. When care is taken to produce quality designs and specifications, and to ensure that details are properly finished, these types of homes can be built almost anywhere. Some of the most successful technology and strategies of the 80’s have found their way into mainstream Canadian houses. As a result, the average new Canadian home consumes less energy than its predecessors. The Ontario building code has some of the most stringent thermal insulation and energy performance requirements of all provincial codes in Canada. However, significantly more can be done to economically reduce house energy consumption.
A parametric analysis of a representative urban house was performed. This analysis suggests that there is significant room for improvement in the minimum Ontario Building Code requirements, especially with regard to the insulation and air tightness specifications. In 2006 the OBC requirements for above grade wall insulation were increased from R17 to R19 whereas this investigation found that R34 could be justified financially. The fenestration requirements in the 2006 OBC require windows to attain at least R2.8, while this investigation shows that a further 25% increase to R3.5 will soon be financially sensible.
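To give a feel for the R-values quoted above, the following sketch computes how much steady-state conductive heat flow changes between two insulation levels. It assumes simple one-dimensional conduction, where heat flow per unit area is q = ΔT / R, and treats the nominal R-values as effective values (no thermal bridging or air leakage). It covers only the physics; the thesis's financial justification also depends on construction and energy costs, which are not modeled here. The function names are hypothetical.

```python
def heat_flux(delta_t: float, r_value: float) -> float:
    """Steady-state conductive heat flow per unit area: q = delta_T / R."""
    if r_value <= 0:
        raise ValueError("R-value must be positive")
    return delta_t / r_value


def flux_reduction(r_old: float, r_new: float) -> float:
    """Fractional drop in conductive heat flow when R rises from r_old to r_new."""
    return 1.0 - r_old / r_new


if __name__ == "__main__":
    cases = [
        ("Walls, R17 to R19 (2006 OBC change)", 17.0, 19.0),
        ("Walls, R19 to R34 (level found justifiable)", 19.0, 34.0),
        ("Windows, R2.8 to R3.5", 2.8, 3.5),
    ]
    for label, r_old, r_new in cases:
        print(f"{label}: {flux_reduction(r_old, r_new):.1%} less conductive heat flow")
```

Under these assumptions the code change from R17 to R19 cuts conductive wall losses by roughly a tenth, while R34 would cut them by about 44% relative to R19, and the window upgrade by 20%.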
|