  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
251

Interactions Study of Self Optimizing Schemes in LTE Femtocell Networks

El-murtadi Suleiman, Kais 06 December 2012 (has links)
One of the enabling technologies for Long Term Evolution (LTE) deployments is the femtocell technology. By having femtocells deployed indoors and closer to the user, high data rate services can be provided efficiently. These femtocells are expected to be deployed in large numbers, which raises many technical challenges, including handover management. In fact, conventional manual adjustment techniques cannot keep pace with such a rapidly growing femtocell environment. Therefore, automating handover management by implementing Self Organizing Network (SON) use cases becomes a necessity rather than an option. However, having multiple SON use cases operating simultaneously with a shared objective could cause them to interact either negatively or positively. In both cases, designing a suitable coordination policy is critical for resolving negative conflicts and building upon positive benefits. In this work, we focus on studying the interactions between three self optimization use cases aiming at improving the overall handover procedure in LTE femtocell networks. These self optimization use cases are handover, Call Admission Control (CAC) and load balancing. We develop a comprehensive, unified LTE compliant evaluation environment. This environment is extendable to other radio access technologies, including LTE-Advanced (LTE-A), and can also be used to study other SON use cases. Various recommendations made by the main bodies in the area of femtocells are considered, including the Small Cell Forum, the Next Generation Mobile Networks (NGMN) alliance and the 3rd Generation Partnership Project (3GPP). Additionally, traffic sources are simulated in compliance with these recommendations and evaluation methodologies. We study the interaction between three representative handover related self optimization schemes.
We start by testing these schemes separately, in order to make sure that they meet their individual goals, and then their mutual interactions when operating simultaneously. Based on these experiments, we recommend several guidelines that can help mobile network operators and researchers in designing better coordination policies. / Thesis (Master, Electrical & Computer Engineering) -- Queen's University, 2012-12-05 22:35:27.538
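Handover self-optimization schemes of the kind studied here tune event-triggered handover parameters. As a minimal illustrative sketch (not the thesis's actual scheme), an A3-style triggering rule with hysteresis and a time-to-trigger window might look like the following; all parameter values are made up:

```python
def handover_decision(serving_rsrp, neighbor_rsrp, hysteresis_db=3.0, ttt_samples=4):
    """Return True if a handover should be triggered.

    serving_rsrp / neighbor_rsrp: per-sample RSRP measurements (dBm).
    The neighbor must exceed the serving cell by `hysteresis_db` for
    `ttt_samples` consecutive samples (a time-to-trigger window).
    Parameter values are illustrative, not from the thesis.
    """
    streak = 0
    for s, n in zip(serving_rsrp, neighbor_rsrp):
        if n > s + hysteresis_db:
            streak += 1
            if streak >= ttt_samples:
                return True
        else:
            streak = 0  # the entering condition was broken; restart the window
    return False
```

A self-optimizing handover scheme would adapt `hysteresis_db` and `ttt_samples` at run time, which is exactly where it can conflict with CAC and load balancing acting on the same cells.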
252

Automating Geospatial RDF Dataset Integration and Enrichment / Automatische geografische RDF Datensatzintegration und Anreicherung

Sherif, Mohamed Ahmed Mohamed 12 December 2016 (has links) (PDF)
Over the last years, the Linked Open Data (LOD) cloud has evolved from a mere 12 to more than 10,000 knowledge bases. These knowledge bases come from diverse domains including (but not limited to) publications, life sciences, social networking, government, media and linguistics. Moreover, the LOD cloud also contains a large number of cross-domain knowledge bases such as DBpedia and Yago2. These knowledge bases are commonly managed in a decentralized fashion and contain partly overlapping information. This architectural choice has led to knowledge pertaining to the same domain being published by independent entities in the LOD cloud. For example, information on drugs can be found in Diseasome as well as DBpedia and Drugbank. Furthermore, certain knowledge bases such as DBLP have been published by several bodies, which in turn has led to duplicated content in the LOD cloud. In addition, large amounts of geo-spatial information have been made available with the growth of the heterogeneous Web of Data. The concurrent publication of knowledge bases containing related information promises to become a phenomenon of increasing importance with the growth of the number of independent data providers. Enabling the joint use of the knowledge bases published by these providers for tasks such as federated queries, cross-ontology question answering and data integration is most commonly tackled by creating links between the resources described within these knowledge bases. Within this thesis, we spur the transition from isolated knowledge bases to enriched Linked Data sets where information can be easily integrated and processed. To achieve this goal, we provide concepts, approaches and use cases that facilitate the integration and enrichment of information with other data types that are already present on the Linked Data Web, with a focus on geo-spatial data. The first challenge that motivates our work is the lack of measures that use the geographic data for linking geo-spatial knowledge bases.
This is partly due to the geo-spatial resources being described by means of vector geometry. In particular, discrepancies in granularity and error measurements across knowledge bases render the selection of appropriate distance measures for geo-spatial resources difficult. We address this challenge by evaluating existing literature for point set measures that can be used to measure the similarity of vector geometries. Then, we present and evaluate the ten measures that we derived from the literature on samples of three real knowledge bases. The second challenge we address in this thesis is the lack of automatic Link Discovery (LD) approaches capable of dealing with geo-spatial knowledge bases with missing and erroneous data. To this end, we present Colibri, an unsupervised approach that allows discovering links between knowledge bases while improving the quality of the instance data in these knowledge bases. A Colibri iteration begins by generating links between knowledge bases. Then, the approach makes use of these links to detect resources with potentially erroneous or missing information. This erroneous or missing information detected by the approach is finally corrected or added. The third challenge we address is the lack of scalable LD approaches for tackling big geo-spatial knowledge bases. Thus, we present Deterministic Particle-Swarm Optimization (DPSO), a novel load balancing technique for LD on parallel hardware based on particle-swarm optimization. We combine this approach with the Orchid algorithm for geo-spatial linking and evaluate it on real and artificial data sets. The lack of approaches for automatic updating of links in an evolving knowledge base is our fourth challenge. This challenge is addressed in this thesis by the Wombat algorithm. Wombat is a novel approach for the discovery of links between knowledge bases that relies exclusively on positive examples.
Wombat is based on generalisation via an upward refinement operator to traverse the space of Link Specifications (LS). We study the theoretical characteristics of Wombat and evaluate it on different benchmark data sets. The last challenge addressed herein is the lack of automatic approaches for geo-spatial knowledge base enrichment. Thus, we propose Deer, a supervised learning approach based on a refinement operator for enriching Resource Description Framework (RDF) data sets. We show how we can use exemplary descriptions of enriched resources to generate accurate enrichment pipelines. We evaluate our approach against manually defined enrichment pipelines and show that our approach can learn accurate pipelines even when provided with a small number of training examples. Each of the proposed approaches is implemented and evaluated against state-of-the-art approaches on real and/or artificial data sets. Moreover, all approaches are peer-reviewed and published in a conference or a journal paper. Throughout this thesis, we detail the ideas, implementation and the evaluation of each of the approaches. Moreover, we discuss each approach and present lessons learned. Finally, we conclude this thesis by presenting a set of possible future extensions and use cases for each of the proposed approaches.
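A classical point-set measure of the kind this thesis evaluates for comparing vector geometries is the Hausdorff distance. A minimal sketch, assuming geometries are sampled as 2-D point sets:

```python
import math

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets.

    a, b: iterables of (x, y) points sampled from vector geometries.
    The directed distance from `src` to `dst` is the largest distance
    from any point of `src` to its nearest point of `dst`; the
    symmetric distance takes the maximum of both directions.
    """
    def euclid(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def directed(src, dst):
        return max(min(euclid(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))
```

This naive formulation is quadratic in the number of points; scalable link discovery frameworks rely on bounding approximations to avoid computing it exhaustively across whole knowledge bases.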
253

Development and validation of the Euler-Lagrange formulation on a parallel and unstructured solver for large-eddy simulation / Développement et validation du formalisme Euler-Lagrange dans un solveur parallèle et non-structuré pour la simulation aux grandes échelles

García Martinez, Marta 19 January 2009 (has links)
Particle-laden flows occur in industrial applications ranging from droplets in gas turbines to fluidized beds in the chemical industry. Prediction of the dispersed phase properties, such as concentration and dynamics, is crucial for the design of more efficient devices that meet the new pollutant regulations of the European community. The objective of this thesis is to develop an Euler-Lagrange formulation on a parallel and unstructured solver for large-eddy simulation. This work is motivated by the rapid increase in computing power, which opens a new way for simulations that were prohibitive one decade ago. Special attention is paid to keeping the data structures simple and the code portable across different architectures. Developments are validated in two configurations: an academic test of decaying homogeneous isotropic turbulence and a polydisperse two-phase flow in a confined bluff body. Load balancing of particles is highlighted as a promising solution in Lagrangian two-phase flow simulations to improve performance when a strong imbalance of the dispersed phase is present.
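To make the notion of "strong imbalance of the dispersed phase" concrete, a common diagnostic (a generic sketch, not necessarily the criterion used in this thesis) is the ratio of the maximum to the mean per-domain particle count:

```python
def imbalance(particle_counts):
    """Load-imbalance factor across domains: max load / mean load.

    particle_counts: number of Lagrangian particles owned by each
    MPI domain. A value of 1.0 means perfectly balanced; values well
    above 1.0 indicate that particle load balancing would pay off,
    since the slowest domain dictates the time step duration.
    """
    mean = sum(particle_counts) / len(particle_counts)
    return max(particle_counts) / mean
```

For example, `imbalance([400, 0, 0, 0])` is 4.0: one domain does all the particle work while three idle, so the particle phase runs four times slower than a balanced decomposition would.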
254

Vérification des contraintes temporelles de bout-en-bout dans le contexte AutoSar / Verification of end-to-end real-time constraints in the context of AutoSar

Monot, Aurélien 26 October 2012 (has links)
The complexity of electronic embedded systems in cars is continuously growing. Hence, mastering the temporal behavior of such systems is paramount in order to ensure the safety and comfort of the passengers. As a consequence, the verification of end-to-end real-time constraints is a major challenge during the design phase of a car. The AUTOSAR software architecture drives us to address the verification of end-to-end real-time constraints as two independent scheduling problems, respectively for electronic control units and communication buses. First, we introduce an approach which optimizes the utilization of controllers scheduling numerous software components and is compatible with the upcoming multicore architectures. We describe fast and efficient algorithms to balance the periodic load over time on multicore controllers by adapting and improving an existing approach used for CAN networks. We provide theoretical results on the efficiency of the algorithms in some specific cases. Moreover, we describe how to use these algorithms in conjunction with other tasks scheduled on the controller. The remaining part of this research work addresses the problem of obtaining the response time distributions of the messages sent on a CAN network. First, we present a simulation approach based on the modelling of clock drifts on the communicating nodes connected to the CAN network. We show that a single long simulation with clock drifts yields response time distributions similar to those obtained from numerous short simulation runs without clock drifts. Then, we present an analytical approach to compute the response time distributions of the CAN frames. We introduce several approximation parameters to cope with the very high computational complexity of this approach while limiting the loss of accuracy. Finally, we compare the simulation and analytical approaches experimentally and discuss the relative advantages of each.
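The "many short runs with random offsets" methodology can be illustrated on an idealized non-preemptive fixed-priority bus (on CAN, the frame identifier acts as the priority). This toy sketch, with made-up frame parameters and no clock drift, draws a random phase per frame per run and aggregates the observed response times of the lowest-priority frame:

```python
import random

def simulate_run(frames, horizon, rng):
    """One short run of an idealized non-preemptive fixed-priority bus.

    frames: list of (period, tx_time); list index = priority
    (0 = highest). Each frame gets a random initial phase, mimicking
    nodes that start without synchronization.
    """
    releases = []
    for prio, (period, c) in enumerate(frames):
        t = rng.uniform(0, period)
        while t < horizon:
            releases.append((t, prio, c))
            t += period
    releases.sort()
    ready, i, now = [], 0, 0.0
    responses = {p: [] for p in range(len(frames))}
    while i < len(releases) or ready:
        if not ready:               # bus idle: jump to the next release
            now = max(now, releases[i][0])
        while i < len(releases) and releases[i][0] <= now:
            ready.append(releases[i])
            i += 1
        ready.sort(key=lambda r: r[1])   # arbitration by priority
        rel, prio, c = ready.pop(0)
        now += c                    # a transmission cannot be preempted
        responses[prio].append(now - rel)
    return responses

def response_distribution(frames, runs=200, horizon=100.0, seed=1):
    """Aggregate the lowest-priority frame's response times over many
    short runs with independent random phases."""
    rng = random.Random(seed)
    out = []
    for _ in range(runs):
        out.extend(simulate_run(frames, horizon, rng)[len(frames) - 1])
    return out
```

A histogram of the returned samples approximates the response time distribution that the thesis obtains either this way or with one long drifting simulation.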
255

Resource Allocation Algorithms for Event-Based Enterprise Systems

Cheung, Alex King Yeung 30 August 2011 (has links)
Distributed event processing systems suffer from poor scalability and inefficient resource usage caused by load distributions typical in real-world applications. The results of these shortcomings are availability issues, poor system performance, and high operating costs. This thesis proposes three remedies to solve these limitations in content-based publish/subscribe, which is a practical realization of an event processing system. First, we present a load balancing algorithm that relocates subscribers to distribute load and avoid overloads. Second, we propose publisher relocation algorithms that reduce both the load imposed onto brokers and the delivery delay experienced by subscribers. Third, we present "green" resource allocation algorithms that allocate as few brokers as possible while maximizing their resource usage efficiency by reconfiguring the publishers, subscribers, and the broker topology. We implemented and evaluated all of our approaches on an open source content-based publish/subscribe system called PADRES and evaluated them on SciNet, PlanetLab, a cluster testbed, and in simulations to prove the effectiveness of our solutions. Our evaluation findings are summarized as follows. One, the proposed load balancing algorithm is effective in distributing and balancing load originating from a single server to all available servers in the network. Two, our publisher relocation algorithm reduces the average input load of the system by up to 68%, the average broker message rate by up to 85%, and the average delivery delay by up to 68%. Three, our resource allocation algorithm reduces the average broker message rate even further by up to 92% and the number of allocated brokers by up to 91%.
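Subscriber relocation can be illustrated with a much-simplified greedy balancer (a sketch under the stated assumptions, not the PADRES algorithm itself): any broker whose load exceeds a threshold times the mean sheds subscribers onto the currently least-loaded broker.

```python
def relocate_subscribers(brokers, threshold=1.5):
    """Greedy load balancing sketch for content-based publish/subscribe.

    brokers: dict broker name -> list of per-subscriber load estimates.
    Brokers above `threshold` times the mean load shed their lightest
    subscribers to the least-loaded broker. Mutates `brokers` and
    returns the number of relocations performed. The threshold and the
    scalar load model are illustrative assumptions.
    """
    moves = 0
    mean = sum(sum(v) for v in brokers.values()) / len(brokers)
    for name, subs in brokers.items():
        subs.sort(reverse=True)  # lightest subscribers at the tail
        while subs and sum(subs) > threshold * mean:
            target = min(brokers, key=lambda b: sum(brokers[b]))
            if target == name:
                break  # already the least loaded; nothing to gain
            brokers[target].append(subs.pop())  # move the lightest one
            moves += 1
    return moves
```

Moving the lightest subscribers first keeps each step reversible and avoids overshooting the target broker, at the cost of possibly needing more moves.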
257

Dynamische Lastbalancierung und Modellkopplung zur hochskalierbaren Simulation von Wolkenprozessen

Lieber, Matthias 26 September 2012 (has links) (PDF)
Current forecast models insufficiently represent the complex interactions of aerosols, clouds and precipitation. Simulations with a spectral description of cloud processes allow more detailed forecasts, but they are much more computationally expensive. Reducing the runtime of such simulations requires a highly parallel execution. This thesis presents a concept for coupling spectral cloud microphysics models with atmospheric models that allows for efficient utilization of today's available parallelism on the order of 100,000 processor cores. Due to the strong workload variations, highly scalable dynamic load balancing of the cloud microphysics model is essential in order to reach this goal. This is achieved through a hierarchical partitioning method based on space-filling curves. Furthermore, a highly scalable connection of dynamic load balancing and model coupling is facilitated by an efficient method to regularly determine the intersections between different partitionings. The results of this thesis enable the application of spectral cloud microphysics models for the simulation of realistic scenarios on high resolution grids through efficient use of high performance computers.
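A space-filling-curve partition of the kind described can be sketched as follows: map each grid cell to a Morton (Z-order) code, sort cells along the curve, and cut the resulting 1-D sequence into contiguous chunks of roughly equal total weight. This is a flat, non-hierarchical simplification of the thesis's method:

```python
def morton(x, y, bits=8):
    """Interleave the bits of integer cell coordinates (Z-order code)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code

def partition(cells, weights, nparts):
    """Order grid cells along the curve, then cut the 1-D sequence
    into `nparts` contiguous chunks of roughly equal total weight.

    cells: list of (x, y) integer coordinates; weights: per-cell
    workload estimates (e.g. cloud microphysics cost).
    """
    order = sorted(range(len(cells)), key=lambda i: morton(*cells[i]))
    target = sum(weights) / nparts
    parts, current, acc = [], [], 0.0
    for i in order:
        current.append(cells[i])
        acc += weights[i]
        if acc >= target and len(parts) < nparts - 1:
            parts.append(current)
            current, acc = [], 0.0
    parts.append(current)
    return parts
```

Because the curve preserves spatial locality, each contiguous chunk tends to be a compact region, which keeps halo-exchange communication low while the cut positions can be moved cheaply as the weights evolve.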
258

Supporting Distributed Transaction Processing Over Mobile and Heterogeneous Platforms

Xie, Wanxia 28 November 2005 (has links)
Recent advances in pervasive computing and peer-to-peer computing have opened up vast opportunities for developing collaborative applications. To benefit from these emerging technologies, there is a need for investigating techniques and tools that will allow development and deployment of these applications on mobile and heterogeneous platforms. To meet these challenges, we need to address the typical characteristics of mobile peer-to-peer systems such as frequent disconnections, frequent network partitions, and peer heterogeneity. This research focuses on developing the necessary models, techniques and algorithms that will enable us to build and deploy collaborative applications in Internet-enabled, mobile peer-to-peer environments. This dissertation proposes a multi-state transaction model and develops a quality-aware transaction processing framework to incorporate quality of service with transaction processing. It proposes adaptive ACID properties and develops a quality specification language to associate a quality level with transactions. In addition, this research develops a probabilistic concurrency control mechanism and a group-based transaction commit protocol for mobile peer-to-peer systems that greatly reduces blockings in transactions and improves the transaction commit ratio. To the best of our knowledge, this is the first attempt to systematically support disconnection-tolerant and partition-tolerant transaction processing. This dissertation also develops a scalable directory service called PeerDS to support the above framework. It addresses the scalability and dynamism of the directory service from two aspects: peer-to-peer and push-pull hybrid interfaces. It also addresses peer heterogeneity and develops a new technique for load balancing in the peer-to-peer system.
This technique comprises an improved routing algorithm for virtualized P2P overlay networks and a generalized Top-K server selection algorithm for load balancing, which could be optimized based on multiple factors such as proximity and cost. The proposed push-pull hybrid interfaces greatly reduce the overhead of directory servers caused by frequent queries from directory clients. In order to further improve the scalability of the push interface, this dissertation also studies and evaluates different filter indexing schemes through which the interests of each update could be calculated very efficiently. This dissertation was developed in conjunction with the middleware called System on Mobile Devices (SyD).
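A generalized Top-K server selection of the kind mentioned can be sketched as a weighted ranking over normalized factors such as proximity and cost; the weights and the two-factor score below are illustrative assumptions, not taken from the dissertation:

```python
def top_k_servers(servers, k, w_proximity=0.7, w_cost=0.3):
    """Top-K selection sketch: rank servers by a weighted score
    combining proximity and cost (both normalized to [0, 1], lower is
    better) and return the k best candidates for load balancing.

    servers: dict name -> (proximity, cost). The weight values are
    hypothetical; a real deployment would tune them per workload.
    """
    def score(item):
        name, (prox, cost) = item
        return w_proximity * prox + w_cost * cost
    ranked = sorted(servers.items(), key=score)
    return [name for name, _ in ranked[:k]]
```

Folding multiple factors into one scalar score keeps the selection a single sort, so it stays cheap even when the candidate set is refreshed frequently by a dynamic directory service.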
259

Load balancing in heterogeneous cellular networks

Singh, Sarabjot, active 21st century 10 February 2015 (has links)
Pushing wireless data traffic onto small cells is important for alleviating congestion in the overloaded macrocellular network. However, the ultimate potential of such load balancing and its effect on overall system performance is not well understood. With the ongoing deployment of multiple classes of access points (APs), with each class differing in transmit power, employed frequency band, and backhaul capacity, the network is evolving into a complex and “organic” heterogeneous network, or HetNet. Resorting to system-level simulations for design insights is increasingly prohibitive with such growing network complexity. The goal of this dissertation is to develop realistic yet tractable frameworks to model and analyze load balancing dynamics while incorporating the heterogeneous nature of these networks. First, this dissertation introduces and analyzes a class of user-AP association strategies, called stationary association, and the resulting association cells for HetNets modeled as stationary point processes. A “Feller-paradox”-like relationship is established between the area of the association cell containing the origin and that of a typical association cell. This chapter also provides a foundation for subsequent chapters, as association strategies directly dictate the load distribution across the network. Second, this dissertation proposes a baseline model to characterize downlink rate and signal-to-interference-plus-noise ratio (SINR) in an M-band K-tier HetNet with a general weighted path loss based association. Each class of APs is modeled as an independent Poisson point process (PPP) and may differ in deployment density, transmit power, bandwidth (resource), and path loss exponent. It is shown that the optimum fraction of traffic offloaded to maximize SINR coverage is not in general the same as the one that maximizes rate coverage.
One of the main outcomes is demonstrating the aggressive offloading required for out-of-band small cells (like WiFi) as compared to that for in-band small cells (like picocells). To achieve aggressive load balancing, the offloaded users often have much lower downlink SINR than they would on the macrocell, particularly in co-channel small cells. This SINR degradation can be partially alleviated through interference avoidance, for example time or frequency resource partitioning, whereby the macrocell turns off in some fraction of such resources. As the third contribution, this dissertation proposes a tractable framework to analyze joint load balancing and resource partitioning in co-channel HetNets. Fourth, this dissertation investigates the impact of uplink load balancing. Power control and spatial interference correlation complicate the mathematical analysis for the uplink as compared to the downlink. A novel generative model is proposed to characterize the uplink rate distribution as a function of the association and power control parameters, and is used to show that the optimal amount of channel inversion increases with the path loss variance in the network. In contrast to the downlink, minimum path loss association is shown to be optimal for uplink rate coverage. Fifth, this dissertation develops a model for characterizing the rate distribution in self-backhauled millimeter wave (mmWave) cellular networks and thus generalizes the earlier multi-band offloading framework to the co-existence of current ultra high frequency (UHF) HetNets and mmWave networks. MmWave cellular systems will require high gain directional antennas and dense AP deployments. The analysis shows that in sharp contrast to the interference-limited nature of UHF cellular networks, mmWave networks are usually noise-limited. As a desirable side effect, high gain antennas yield interference isolation, providing an opportunity to incorporate self-backhauling.
For load balancing, the large bandwidth at mmWave makes offloading users with reliable mmWave links optimal for rate.
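The weighted path-loss-based association underlying this analysis can be sketched as each user choosing the AP with the largest biased received power P·B/d^α, where a bias B > 1 offloads users toward a tier even when its raw power is lower. The numeric values below are illustrative:

```python
import math

def associate(user, aps):
    """Biased association sketch: pick the index of the AP maximizing
    P * B / d**alpha for the given user position.

    aps: list of dicts with transmit power `p`, association bias `b`,
    path loss exponent `alpha`, and position `xy`. All values here are
    made-up illustrations, not the dissertation's parameters.
    """
    def biased_power(ap):
        d = math.dist(user, ap["xy"])
        return ap["p"] * ap["b"] / d ** ap["alpha"]
    return max(range(len(aps)), key=lambda i: biased_power(aps[i]))
```

With all biases equal to 1 this reduces to max-received-power association; raising a small cell's bias grows its association cell and hence the fraction of traffic offloaded to it, which is the knob the coverage analyses above optimize.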
260

Molecular Dynamics for Exascale Supercomputers / La dynamique moléculaire pour les machines exascale

Cieren, Emmanuel 09 October 2015 (has links)
In the exascale race, supercomputer architectures are evolving towards massively multicore nodes with hierarchical memory structures, equipped with ever larger vectorization registers. These trends tend to make MPI-only applications less effective and now require programmers to explicitly manage low-level elements to get decent performance. In the context of Molecular Dynamics (MD) applied to condensed matter physics, the need for a better understanding of materials behaviour under extreme conditions involves simulations of ever larger systems on tens of thousands of cores. This puts molecular dynamics codes among the software most likely to meet serious difficulties in fully exploiting the performance of next generation processors. This thesis proposes the design and implementation of a high-performance, flexible and scalable framework dedicated to the simulation of large scale MD systems on future supercomputers. We managed to separate numerical modules from the different expressions of parallelism, allowing developers not to worry about optimizations while still obtaining high levels of performance. Our architecture is organized in three levels of parallelism: domain decomposition using MPI, thread parallelization within each domain, and explicit vectorization. We also included a dynamic load balancing capability in order to share the workload equally among domains. Particular care went into the object-oriented design so as to preserve a level of programming usable by physicists without degrading performance. Results on simple tests show excellent sequential performance and a quasi-linear speedup on several thousands of cores on various architectures. When applied to production simulations, we report an acceleration of up to a factor of 30 compared to the code previously used by CEA's researchers.
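Dynamic load balancing that shares the workload equally among domains can be illustrated, in one dimension, by placing domain boundaries at equal particle-count quantiles. This is a much-simplified sketch of what an MD code's balancer does; real codes rebalance in 3-D and weight particles by interaction cost:

```python
def rebalance_1d(positions, ndomains):
    """Equal-count domain decomposition sketch along one axis.

    positions: particle coordinates along the decomposition axis.
    Returns the `ndomains - 1` interior boundaries, each placed halfway
    between the particles on either side of an equal-count cut, so each
    domain owns (almost) the same number of particles.
    """
    xs = sorted(positions)
    n = len(xs)
    bounds = []
    for d in range(1, ndomains):
        cut = d * n // ndomains
        bounds.append(0.5 * (xs[cut - 1] + xs[cut]))
    return bounds
```

Re-running this periodically as particles drift keeps the per-domain work even, which is exactly when the dynamic balancing described above pays off.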
