111 |
Parallel design optimization of multi-trailer articulated heavy vehicles with active safety systems / Islam, Md. Manjurul 01 April 2013
Multi-trailer articulated heavy vehicles (MTAHVs) exhibit unstable motion modes
at high speeds, including jack-knifing, trailer swing, and roll-over. These unstable
motion modes may lead to fatal accidents. On the other hand, these vehicle
combinations have poor maneuverability at low speeds. Of all contradictory design
criteria of MTAHVs, the trade-off relationship between the maneuverability
at low speeds and the lateral stability at high speeds is the most important and
fundamental. This trade-off relationship has not been adequately addressed. The
goal of this research is to address this trade-off relationship through the design optimization
of MTAHVs with active safety systems. A parallel design optimization
(PDO) method is developed and applied to the design of MTAHVs with integrated
active safety systems, which involve active trailer steering (ATS) control, anti-roll
(AR) control, differential braking (DB) control, and a variety of combinations of
these three control strategies. To derive model-based controllers, a single-trailer
articulated heavy vehicle (STAHV) model with 5 degrees of freedom (DOF) and an
MTAHV model with 7 DOF are generated. The vehicle models are validated against
those derived using a commercial software package, TruckSim, in order to examine
their applicability for the design optimization of MTAHVs with active safety
systems. The PDO method is implemented to perform the concurrent design of
the plant (vehicle model) and controllers. To simulate closed-loop test maneuvers,
a driver model is developed to steer the virtual vehicle
along the prescribed path. Case studies indicate that the PDO method is effective
for identifying desired design variables and predicting performance envelopes
in the early design stages of MTAHVs with active safety systems. / UOIT
|
112 |
Dynamic Load Balancing Schemes for Large-scale HLA-based SimulationsDe Grande, Robson E. 26 July 2012 (has links)
Dynamic balancing of computation and communication load is vital for the execution stability and performance of distributed, parallel simulations deployed on shared, unreliable resources of large-scale environments. High Level Architecture (HLA) based simulations can experience a decrease in performance due to imbalances that arise initially and/or during run-time. These imbalances are generated by the dynamic load changes of distributed simulations or by unknown, unmanaged background processes that result from sharing non-dedicated resources. Due to the dynamic execution characteristics of the elements that compose distributed simulation applications, the computational load and interaction dependencies of each simulation entity change during run-time. These dynamic changes lead to an irregular load and communication distribution, which increases resource overhead and execution delays. A static partitioning of load is limited to deterministic applications and cannot anticipate the dynamic changes caused by distributed applications or by external background processes. Given the importance of dynamically balancing load for distributed simulations, many balancing approaches have been proposed to offer a sub-optimal balancing solution, but they are limited to certain simulation aspects, specific to particular applications, or unaware of HLA-based simulation characteristics. Therefore, schemes for balancing the communication and computational load during the execution of distributed simulations are devised, adopting a hierarchical architecture. First, to enable the development of such balancing schemes, a migration technique is employed to perform reliable and low-latency simulation load transfers.
Then, a centralized balancing scheme is designed; this scheme employs local and cluster monitoring mechanisms to observe the distributed load changes and identify imbalances, and it uses load reallocation policies to determine a distribution of load and minimize imbalances. To overcome the drawbacks of this scheme, such as bottlenecks, overheads, global synchronization, and a single point of failure, a distributed redistribution algorithm is designed. Extensions of the distributed balancing scheme are also developed to improve the detection of and the reaction to load imbalances. These extensions introduce communication delay detection, migration latency awareness, self-adaptation, and load oscillation prediction in the load redistribution algorithm. The resulting balancing systems improved the use of shared resources and increased the performance of distributed simulations.
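The monitor, detect, and reallocate cycle of a centralized balancing scheme can be illustrated with a minimal sketch. It is not the algorithm from the thesis: the greedy pairing policy, the `tolerance` threshold, the `max_moves` cap, and the representation of load as a single number per node are all assumptions made for illustration, and migration cost and communication load are ignored.

```python
from statistics import mean


def detect_imbalance(loads, tolerance=0.25):
    """Return True if any node deviates from the mean load by more than
    `tolerance` (expressed as a fraction of the mean)."""
    avg = mean(loads.values())
    if avg == 0:
        return False
    return any(abs(load - avg) / avg > tolerance for load in loads.values())


def plan_migrations(loads, tolerance=0.25, max_moves=100):
    """Hypothetical greedy reallocation policy: while an imbalance is
    detected, move half of the load difference from the most loaded node
    to the least loaded node. Returns the planned moves and the
    resulting load map."""
    loads = dict(loads)  # work on a copy; the caller's map is untouched
    moves = []
    while detect_imbalance(loads, tolerance) and len(moves) < max_moves:
        src = max(loads, key=loads.get)
        dst = min(loads, key=loads.get)
        amount = (loads[src] - loads[dst]) / 2
        loads[src] -= amount
        loads[dst] += amount
        moves.append((src, dst, amount))
    return moves, loads


if __name__ == "__main__":
    observed = {"node_a": 90, "node_b": 10, "node_c": 50}  # illustrative loads
    moves, balanced = plan_migrations(observed)
    print(moves)
    print(balanced)
```

A single coordinator running this loop is exactly the kind of bottleneck and single point of failure that the abstract cites as motivation for the distributed redistribution algorithm.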
|
113 |
Software caching techniques and hardware optimizations for on-chip local memoriesVujic, Nikola 05 June 2012 (has links)
Although caches remain the most viable L1 memories in processors,
on-chip local memories have received considerable attention lately. Local
memories are an interesting design option due to their many benefits: smaller
area, reduced energy consumption, and fast, constant access time.
These benefits are especially interesting for the design of modern multicore processors
since power and latency are major constraints in computer architecture
today. Also, local memories do not generate coherency traffic, which is important
for the scalability of multicore systems.
Unfortunately, local memories have not been well accepted in modern processors
yet, mainly due to their poor programmability. Systems with on-chip local
memories do not have hardware support for transparent data transfers between
local and global memories, and thus the difficulty of programming is one of the main
impediments to the broad acceptance of those systems. This thesis addresses
software and hardware optimizations regarding the programmability and the
usage of on-chip local memories in the context of both single-core and multicore
systems.
Software optimizations are related to software caching techniques. A software
cache is a robust approach to provide the user with a transparent view
of the memory architecture, but this software approach can suffer from poor
performance. In this thesis, we start by optimizing the traditional software cache,
proposing a hierarchical, hybrid software-cache architecture. Afterwards, we develop
a few optimizations in order to speed up our hybrid software cache as much
as possible. As a result of the software optimizations, our hybrid
software cache runs 4 to 10 times faster than a traditional software
cache on a set of NAS parallel benchmarks.
We do not stop with software caching. We cover some other aspects of
architectures with on-chip local memories, such as the quality of the generated
code and its correspondence with the quality of the buffer management in local
memories, in order to improve the performance of these architectures. We then
push the software optimizations to their limit and start proposing optimizations
at the hardware level. Two hardware proposals are presented in this
thesis. One relaxes the alignment constraints imposed in architectures
with on-chip local memories, and the other accelerates the
management of local memories by providing hardware support for the majority
of actions performed in our software cache. / Although caches are still the basic component in the design of the memory subsystem, local memories have become an alternative owing to their characteristics in terms of area occupancy, energy consumption, and performance, with fast and constant access time. These characteristics are of particular interest as upcoming multicore architectures are limited by power consumption and the latency of the memory subsystem. Local memories suffer from limitations regarding the complexity of programming them, which hinders their introduction in multicore architectures despite the advantages mentioned above. This thesis presents a set of software- and hardware-based solutions specifically designed to overcome these limitations. The software optimizations are based on caching techniques supported by specific libraries. A software cache is a robust method for giving the user a transparent view of the architecture, but this approach can suffer from poor performance. In this thesis, a hierarchical, hybrid structure is proposed. Afterwards, we develop optimizations to accelerate the execution of the software that supports the cache design. As a result of these optimizations, our hybrid design runs 4 to 10 times faster than a traditional software cache implementation on a set of reference applications, the NAS parallel benchmarks. The thesis also covers other aspects of architectures with local memories, such as the quality of the generated code and its correspondence with the quality of the buffer management in local memories, in order to improve the performance of these architectures.
The thesis develops proposals based strictly on the design of new hardware to improve the performance of local memories once no further optimizations are possible in software. In particular, the thesis presents two hardware proposals: one relaxes the restrictions imposed by local memories on data alignment, and the other introduces specific hardware to accelerate the most common operations on local memories.
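To make the idea of a traditional software cache concrete, the following is a minimal sketch of a direct-mapped, write-back cache that sits between a program and a slower "global" memory. It does not reproduce the hierarchical, hybrid design proposed in the thesis. The class name `SoftwareCache`, its parameters, and the use of a Python list as global memory are assumptions for illustration, and the sketch assumes the global memory length is a multiple of `line_size`.

```python
class SoftwareCache:
    """Direct-mapped, write-back software cache in front of a global
    memory modeled as a Python list (an illustrative assumption)."""

    def __init__(self, global_mem, num_lines=4, line_size=4):
        self.global_mem = global_mem
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines       # block number held by each line
        self.dirty = [False] * num_lines     # line modified since it was fetched?
        self.lines = [[0] * line_size for _ in range(num_lines)]
        self.hits = 0
        self.misses = 0

    def _evict(self, index):
        """Write a dirty line back to global memory and mark it clean."""
        if self.tags[index] is not None and self.dirty[index]:
            start = self.tags[index] * self.line_size
            self.global_mem[start:start + self.line_size] = self.lines[index]
        self.dirty[index] = False

    def _lookup(self, addr):
        """Software tag check performed on every access; fetch on a miss."""
        block = addr // self.line_size
        index = block % self.num_lines
        if self.tags[index] == block:
            self.hits += 1
        else:
            self.misses += 1
            self._evict(index)
            start = block * self.line_size
            self.lines[index] = self.global_mem[start:start + self.line_size]
            self.tags[index] = block
        return index, addr % self.line_size

    def read(self, addr):
        index, offset = self._lookup(addr)
        return self.lines[index][offset]

    def write(self, addr, value):
        index, offset = self._lookup(addr)
        self.lines[index][offset] = value
        self.dirty[index] = True

    def flush(self):
        """Write back every dirty line, e.g. at the end of a computation."""
        for index in range(self.num_lines):
            self._evict(index)
```

Every `read` and `write` pays for the tag check in `_lookup`, which is one reason a purely software cache can perform poorly and why hardware support for these actions is attractive.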
|
114 |
Micro-scheduling and its interaction with cache partitioning / Choudhary, Dhruv 05 July 2011
The thesis explores the sources of energy inefficiency in asymmetric multi-core architectures where energy efficiency is measured by the energy-delay squared product. The insights gathered from this study drive the development of optimized thread scheduling and coordinated cache management strategies in an important class of asymmetric shared memory architectures. The proposed techniques are founded on well-known mathematical optimization techniques yet are lightweight enough to be implemented in practical systems.
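The energy-delay squared product mentioned above is commonly defined as energy multiplied by the square of the execution time, so lower values are better and delay is weighted more heavily than energy. The short sketch below shows how the metric can rank two configurations; the function name and the numbers are illustrative assumptions, not measurements from the thesis.

```python
def energy_delay_squared(energy_joules, delay_seconds):
    """Energy-delay squared product (ED^2P): E * D^2. Lower is better."""
    return energy_joules * delay_seconds ** 2


if __name__ == "__main__":
    # Hypothetical configurations: (energy in joules, delay in seconds).
    low_energy = (10.0, 2.0)   # uses less energy but runs longer
    fast = (16.0, 1.5)         # uses more energy but finishes sooner
    print(energy_delay_squared(*low_energy))
    print(energy_delay_squared(*fast))
```

With these made-up numbers, the faster configuration scores better under this metric even though it consumes more energy, which is the kind of trade-off a scheduler optimizing for it has to weigh.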
|
115 |
An empirical approach to automated performance management for elastic n-tier applications in computing clouds / Malkowski, Simon J. 03 April 2012
Achieving a high degree of efficiency is non-trivial when managing the performance of large web-facing applications such as e-commerce websites and social networks. While computing clouds have been touted as a good solution for elastic applications, many significant technological challenges still have to be addressed in order to leverage the full potential of this new computing paradigm. In this dissertation I argue that the automation of elastic n-tier application performance management in computing clouds presents novel challenges to classical system performance management methodology that can be successfully addressed through a systematic empirical approach. I present strong evidence in support of my thesis in a framework of three incremental building blocks: Experimental Analysis of Elastic System Scalability and Consolidation, Modeling and Detection of Non-trivial Performance Phenomena in Elastic Systems, and Automated Control and Configuration Planning of Elastic Systems. More concretely, I first provide a proof of concept for the feasibility of large-scale experimental database system performance analyses, and illustrate several complex performance phenomena based on the gathered scalability and consolidation data. Second, I extend these initial results to a proof of concept for automating bottleneck detection based on statistical analysis and an abstract definition of multi-bottlenecks. Third, I build a performance control system that manages elastic n-tier applications efficiently with respect to complex performance phenomena such as multi-bottlenecks. This control system provides a proof of concept for automated online performance management based on empirical data.
|
116 |
Global synchronization of asynchronous computing systems / Barnes, Richard Neil. January 2001
Thesis (M.S.)--Mississippi State University. Department of Electrical and Computer Engineering. / Title from title screen. Includes bibliographical references.
|
117 |
A distributed memory implementation of Loci / George, Thomas. January 2001
Thesis (M.S.)--Mississippi State University. Department of Computational Engineering. / Title from title screen. Includes bibliographical references.
|
118 |
A faster technique for rendering meshes in multiple display systems / Hand, Randall Eugene. January 2002
Thesis (M.S.)--Mississippi State University. Department of Electrical and Computer Engineering. / Title from title screen. Includes bibliographical references.
|
119 |
ProLAS: a novel dynamic load balancing library for advanced scientific computing / Krishnan, Manoj Kumar. January 2003
Thesis (M.S.)--Mississippi State University. Department of Computer Science and Engineering. / Title from title screen. Includes bibliographical references.
|
120 |
Overlapping of communication and computation and early binding: fundamental mechanisms for improving parallel performance on clusters of workstations / Dimitrov, Rossen Petkov. January 2001
Thesis (Ph. D.)--Mississippi State University. Department of Computer Science. / Title from title screen. Includes bibliographical references.
|