261

System profiling and green levers for large-scale distributed infrastructures

Tsafack Chetsa, Ghislain Landry 03 December 2013 (has links) (PDF)
Nowadays, reducing the energy consumption of large-scale computing infrastructures has become a genuine challenge in both academia and industry, as shown by the many efforts aimed at reducing it. Without loss of generality, these efforts can be divided into two groups: hardware approaches and software approaches. Unlike hardware approaches, software approaches have met with very little success because of their complexity: they focus on applications and often require a thorough understanding of the proposed solutions and/or of the application under consideration. This restricts their use to a small number of experts, since users generally lack the skills needed to implement them. In addition to being complex to deploy, current solutions only take the processor into account, whereas components such as memory, storage, and the network are also large energy consumers. This thesis proposes a methodology for reducing the energy consumption of large-scale computing infrastructures. Built in three steps, namely (i) phase detection, (ii) characterisation of the detected phases, and (iii) phase identification and system reconfiguration, it abstracts away from any application by focusing on the infrastructure, whose behaviour it analyses at runtime in order to make reconfiguration decisions. The proposed methodology is implemented and evaluated on high-performance computing clusters of various sizes through MREEF (Multi-Resource Energy Efficient Framework). MREEF implements the methodology in such a way that users can plug in their own system-reconfiguration mechanisms according to their needs. Experimental results show that the proposed methodology reduces energy consumption by 24% for a performance loss of less than 7%. They also show that reducing the energy consumption of systems can rely on subsystems such as the storage and communication subsystems. Our validations show that the methodology extends easily to a large number of energy-aware clusters. Extending MREEF to virtualised environments such as the cloud shows that the proposed methodology can be used in many other computing environments.
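The phase-detection step lends itself to a compact illustration. Below is a minimal sketch, assuming a hypothetical "execution vector" of normalised system metrics and an arbitrary threshold: a new phase is flagged when the Manhattan distance between consecutive samples exceeds the threshold. The real MREEF pipeline is considerably more elaborate.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Hypothetical "execution vector": one sample of system-wide counters
// (CPU load, memory, disk and network activity), normalised to [0, 1].
using ExecVector = std::vector<double>;

// Manhattan distance between two consecutive samples.
double distance(const ExecVector& a, const ExecVector& b) {
    double d = 0.0;
    for (size_t i = 0; i < a.size(); ++i) d += std::fabs(a[i] - b[i]);
    return d;
}

int main() {
    const double kThreshold = 0.5;          // assumed detection threshold
    std::vector<ExecVector> samples = {     // stand-in for live monitoring data
        {0.9, 0.2, 0.1, 0.1}, {0.88, 0.22, 0.1, 0.1},  // compute-bound phase
        {0.2, 0.3, 0.9, 0.1}, {0.18, 0.3, 0.92, 0.1},  // I/O-bound phase
    };
    for (size_t t = 1; t < samples.size(); ++t) {
        if (distance(samples[t - 1], samples[t]) > kThreshold)
            std::printf("phase change detected at sample %zu\n", t);
        // A real system would now characterise the new phase and pick a
        // reconfiguration (e.g., a DVFS state, disk or NIC power mode).
    }
    return 0;
}
```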
262

A Domain Specific Embedded Language in C++ for lowest-order methods for diffusive problem on general meshes

Gratien, Jean-Marc 27 May 2013 (has links) (PDF)
The distinctiveness of the scientific software developed by IFP Energies nouvelles lies above all in the originality of the models representing the physical situations, expressed as systems of PDEs together with complex closure laws. Developing this software, designed to run on modern parallel supercomputers, requires combining robust and efficient finite-volume methods with computing technologies that make the best use of these machines (parallelism, memory management, interconnection networks, etc.). These increasingly sophisticated technologies can no longer be fully mastered by the domain scientists in charge of implementing new models. In this work we propose a language specific to finite-volume discretisation methods that enables rapid prototyping of industrial or research codes. We describe the mathematical framework on which we rely as well as the design of the new language. The work was validated on academic problems and then by prototyping an industrial application within the "CO2 maîtrisé" programme.
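To give a flavour of the DSEL idea, here is a deliberately toy sketch (not the thesis's actual language; all names are illustrative) of how C++ operator overloading can capture a discrete bilinear form as an expression that is then evaluated in a single mesh loop:

```cpp
#include <cstdio>
#include <functional>
#include <vector>

// Toy 1-D mesh: ncells cells of width h (illustrative only).
struct Mesh { int ncells; double h; };

// A Term evaluates a per-cell contribution; products of Terms compose.
struct Term {
    std::function<double(const Mesh&, int)> eval;
};

Term operator*(const Term& a, const Term& b) {
    return { [=](const Mesh& m, int c) { return a.eval(m, c) * b.eval(m, c); } };
}

// Hypothetical cell-wise gradient of a discrete field (one-sided difference).
Term grad(const std::vector<double>& u) {
    return { [&u](const Mesh& m, int c) {
        int n = (c + 1 < m.ncells) ? c + 1 : c;   // clamp at the boundary
        return (u[n] - u[c]) / m.h;
    } };
}

// "integrate" walks the mesh once, summing cell contributions times |cell|.
double integrate(const Mesh& m, const Term& t) {
    double s = 0.0;
    for (int c = 0; c < m.ncells; ++c) s += t.eval(m, c) * m.h;
    return s;
}

int main() {
    Mesh m{100, 0.01};
    std::vector<double> u(m.ncells), v(m.ncells);
    for (int c = 0; c < m.ncells; ++c) { u[c] = c * m.h; v[c] = 1.0 - c * m.h; }
    // The user-facing expression reads like the variational form itself:
    double a_uv = integrate(m, grad(u) * grad(v));
    std::printf("a(u, v) = %f\n", a_uv);
    return 0;
}
```

The point of such a design is that the domain scientist writes the last expression, while the machinery underneath (here trivial, in the thesis targeting parallel supercomputers) stays hidden behind the language.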
263

Contributions to parallel stochastic simulation: Application of good software engineering practices to the distribution of pseudorandom streams in hybrid Monte-Carlo simulations

Passerat-Palmbach, Jonathan 11 October 2013 (has links) (PDF)
The race for computing power increases every day in the simulation community. A few years ago, scientists started to harness the computing power of Graphics Processing Units (GPUs) to parallelize their simulations. As with any parallel architecture, not only does the simulation model implementation have to be ported to the new parallel platform, but all the tools must be reimplemented as well. In the particular case of stochastic simulations, one of the major elements of the implementation is the source of pseudorandom numbers. Employing pseudorandom numbers in parallel applications is not a straightforward task, and it has to be done with caution in order not to introduce biases into the results of the simulation. This problem, known as pseudorandom stream distribution, has been studied ever since parallel architectures became available. While the literature is full of solutions for handling pseudorandom stream distribution on CPU-based parallel platforms, the young GPU programming community cannot yet draw on the same experience. In this thesis, we study how to correctly distribute pseudorandom streams on GPUs. From the existing solutions, we identified a need for good software engineering practices, coupled with sound theoretical choices in the implementation. We propose a set of guidelines to follow when a PRNG has to be ported to GPU, and put this advice into practice in a software library called ShoveRand. This library is used in a stochastic polymer folding model that we have implemented in C++/CUDA. Pseudorandom stream distribution on manycore architectures is also one of our concerns. It resulted in a contribution named TaskLocalRandom, which targets parallel Java applications using pseudorandom numbers and task frameworks. Finally, we share a reflection on how to choose the right parallel platform for a given application. To this end, we propose automatically building prototypes of the parallel application that run on a wide set of architectures. This approach relies on existing software engineering tools from the Java and Scala communities, most of them generating OpenCL source code from a high-level abstraction layer.
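As a minimal CPU-side illustration of stream distribution (this is not ShoveRand's API), the sketch below carves a single generator's sequence into disjoint per-task blocks, the classic "sequence splitting" technique; the block size is an assumption:

```cpp
#include <cstdio>
#include <random>

// Sequence splitting: task i draws from the half-open block
// [i * kBlock, (i + 1) * kBlock) of a single generator's stream, so the
// per-task streams cannot overlap as long as each task draws < kBlock
// numbers. Note: mt19937_64::discard is linear-time, so this toy keeps
// blocks small; production libraries use generators with cheap jump-ahead,
// and cuRAND exposes the same idea on GPU through the `sequence` argument
// of curand_init.
constexpr unsigned long long kBlock = 1ULL << 20;   // assumed block size

std::mt19937_64 make_stream(unsigned long long seed, unsigned long long task) {
    std::mt19937_64 gen(seed);
    gen.discard(task * kBlock);   // jump to the start of this task's block
    return gen;
}

int main() {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    for (unsigned long long task = 0; task < 4; ++task) {
        std::mt19937_64 gen = make_stream(42, task);
        std::printf("task %llu first draw: %f\n", task, u(gen));
    }
    return 0;
}
```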
264

Thermal finite element analysis of ceramic/metal joining for fusion using X-ray tomography data

Evans, Llion Marc January 2013 (has links)
A key challenge facing the nuclear fusion community is how to design a reactor that will operate in environmental conditions not easily reproducible in the laboratory for materials testing. Finite element analysis (FEA), commonly used to predict components' performance, typically uses idealised geometries. An emerging technique shown to have improved accuracy is image-based finite element modelling (IBFEM). This involves converting a three-dimensional image (such as from X-ray tomography) into an FEA mesh. A main advantage of IBFEM is that models include microstructural and non-idealised manufacturing features. The aim of this work was to investigate the thermal performance of a CFC-Cu divertor monoblock, a carbon fibre composite (CFC) tile joined through its centre to a CuCrZr pipe with a Cu interlayer. As a plasma-facing component located where thermal flux in the reactor is at its highest, one of its primary functions is to extract heat by active cooling. Therefore, characterisation of its thermal performance is vital. Investigation of the thermal performance of CFC-Cu joining methods by laser flash analysis and X-ray tomography showed a strong correlation between microstructures at the material interface and a reduction in thermal conductivity. This problem therefore lent itself well to further investigation by IBFEM. However, because these high-resolution models require such large numbers of elements, commercial FEA software could not be used. This served as motivation to develop parallel software capable of performing the necessary transient thermal simulations. The resultant code was shown to scale well with increasing problem sizes, and a simulation with 137 million elements was successfully completed using 4096 cores. In comparison with low-resolution IBFEM and traditional FEA simulations it was demonstrated to provide additional accuracy. IBFEM was used to simulate a divertor monoblock mock-up, where it was found that a region of delamination existed on the CFC-Cu interface. Predictions showed that if this were aligned unfavourably it would increase thermal gradients across the component, thus reducing lifespan. As this was a feature introduced in manufacturing, it would not have been accounted for without IBFEM. The technique developed in this work has broad engineering applications. It could be used similarly to accurately model components in conditions unfeasible to produce in the laboratory, to assist in research and development of component manufacturing, or to verify commercial components against manufacturers' claims.
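The core of the IBFEM idea can be pictured as solving the transient heat equation on a grid whose per-cell material properties come straight from segmented tomography voxels. The sketch below is a deliberately small explicit finite-difference step in 2-D (the thesis's parallel finite element code is far more sophisticated); the material labels and diffusivity values are made up:

```cpp
#include <cstdio>
#include <vector>

// Per-voxel thermal diffusivity, indexed by a segmentation label
// (0 = CFC, 1 = Cu interlayer, 2 = CuCrZr pipe) -- illustrative values only.
const double kAlpha[3] = {2.0e-4, 1.1e-4, 0.9e-4};   // m^2/s (assumed)

int main() {
    const int N = 64;                  // N x N voxel grid
    const double dx = 1e-3, dt = 1e-3; // alpha*dt/dx^2 <= 0.2, stable
    std::vector<int> label(N * N, 0);  // pretend this came from CT segmentation
    std::vector<double> T(N * N, 300.0), Tn(N * N);
    for (int j = 0; j < N; ++j) T[j] = 1000.0;   // heated top face

    auto idx = [N](int i, int j) { return i * N + j; };
    for (int step = 0; step < 1000; ++step) {
        for (int i = 1; i < N - 1; ++i)
            for (int j = 1; j < N - 1; ++j) {
                double lap = (T[idx(i+1,j)] + T[idx(i-1,j)] + T[idx(i,j+1)]
                              + T[idx(i,j-1)] - 4.0 * T[idx(i,j)]) / (dx * dx);
                // Diffusivity varies voxel-by-voxel: this is where tomography
                // data (and features such as delamination) enter the model.
                Tn[idx(i,j)] = T[idx(i,j)] + dt * kAlpha[label[idx(i,j)]] * lap;
            }
        // keep boundary rows/columns fixed; copy the interior back
        for (int i = 1; i < N - 1; ++i)
            for (int j = 1; j < N - 1; ++j) T[idx(i,j)] = Tn[idx(i,j)];
    }
    std::printf("centre temperature: %.1f K\n", T[idx(N/2, N/2)]);
    return 0;
}
```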
265

High-Performance Computing Model for a Bio-Fuel Combustion Prediction with Artificial Intelligence

Veeraraghava Raju Hasti (8083571) 06 December 2019 (has links)
The main accomplishments of this research are: (1) a high-fidelity computational methodology based on large eddy simulation to capture the lean blowout (LBO) behaviors of different fuels; (2) fundamental insights into the combustion processes leading to flame blowout and into fuel-composition effects on the lean blowout limits; (3) artificial intelligence-based models for early detection of the onset of lean blowout in a realistic complex combustor. The methodologies are demonstrated by performing lean blowout (LBO) calculations and statistical analysis for a conventional fuel (A-2) and an alternative bio-jet fuel (C-1).

A high-performance computing methodology is developed based on large eddy simulation (LES) turbulence models, detailed chemistry, and flamelet-based combustion models. This methodology is employed to predict the combustion characteristics of conventional fuels and bio-derived alternative jet fuels in a realistic gas turbine engine. The uniqueness of this methodology is the inclusion of as-is combustor hardware details such as the complex hybrid-airblast fuel injector, thousands of tiny effusion holes, and primary and secondary dilution holes on the liners, together with highly automated on-the-fly meshing with adaptive mesh refinement. The flow-split and mesh-sensitivity studies are performed under non-reacting conditions. The reacting LES simulations are performed with two combustion models (finite-rate chemistry and flamelet generated manifold models) and four different chemical kinetic mechanisms. The reacting spray characteristics and flame shape are compared with experiment at the near-lean-blowout stable condition for both combustion models. The LES simulations are performed with a gradual, stepwise reduction in the fuel flow rate until lean blowout is reached. The computational methodology predicted the fuel sensitivity to lean blowout accurately, with correct trends between the conventional and alternative bio-jet fuels. The flamelet generated manifold (FGM) model showed a 60% reduction in computational time compared to the finite-rate chemistry model.

Statistical analyses of the results from the high-fidelity LES simulations are performed to gain fundamental insights into the LBO process and to identify the key markers for predicting the incipient LBO condition in swirl-stabilized spray combustion. The bio-jet fuel (C-1) exhibits significantly larger CH2O concentrations in the fuel-rich regions compared to the conventional petroleum fuel (A-2) at the same equivalence ratio. The analysis shows that the concentration of formaldehyde increases significantly in the primary zone, indicating partial oxidation, as the LBO limit is approached. It also shows that the temperature of the recirculating hot gases is an important parameter for maintaining a stable flame: if this temperature falls below a certain threshold value for a given fuel, the evaporation and heat release rates decrease significantly, leading to the global extinction phenomenon called lean blowout. The present study established the minimum recirculating gas temperature needed to maintain a stable flame for the A-2 and C-1 fuels.

Artificial intelligence (AI) models are developed based on the high-fidelity LES data for early identification of the incipient LBO condition in a realistic gas turbine combustor under engine-relevant conditions. The first approach is based on sensor-style monitoring of quantities of interest at optimal probe locations within the combustor using a Support Vector Machine (SVM). Optimal sensor locations are found to be in the flame root region and were effective in detecting the onset of LBO ~20 ms ahead of the event. The second approach is based on spatiotemporal features in the primary zone of the combustor. A convolutional autoencoder is trained for feature extraction from the OH mass fraction data for all time steps, resulting in significant dimensionality reduction. The extracted features, along with ground-truth labels, are used to train the SVM model for binary classification. The LBO indicator is defined as the output of the SVM model: 1 for unstable and 0 for stable. The LBO indicator stabilized at a value of 1 approximately 30 ms before complete blowout.
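As a sketch of the final classification step only (the autoencoder feature extraction is omitted, and all weights are placeholders), a trained linear SVM reduces at inference time to the sign of w·x + b, which is exactly the 0/1 LBO indicator described above:

```cpp
#include <cstdio>
#include <vector>

// Inference step of a trained linear SVM: indicator = 1 (unstable, LBO
// onset) if w.x + b > 0, else 0 (stable). Weights and bias are placeholders;
// in the thesis they would come from training on LES-derived features.
int lbo_indicator(const std::vector<double>& w, double b,
                  const std::vector<double>& x) {
    double s = b;
    for (size_t i = 0; i < w.size(); ++i) s += w[i] * x[i];
    return s > 0.0 ? 1 : 0;
}

int main() {
    std::vector<double> w = {0.8, -1.2, 0.5};   // hypothetical trained weights
    double b = -0.1;                            // hypothetical bias
    // x: features extracted from the OH field in the primary zone (made up)
    std::vector<double> stable   = {0.2, 0.9, 0.1};
    std::vector<double> unstable = {0.9, 0.1, 0.8};
    std::printf("stable sample   -> %d\n", lbo_indicator(w, b, stable));
    std::printf("unstable sample -> %d\n", lbo_indicator(w, b, unstable));
    return 0;
}
```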
266

Contribution to a unified Eulerian modeling of fuel injection: from dense liquid to polydisperse spray

Essadki, Mohamed 13 February 2018 (has links)
Direct fuel injection systems are widely used in combustion engines to better atomize and mix the fuel with the air. The design of new and efficient injectors needs to be assisted by predictive simulations. The fuel injection process involves different two-phase flow regimes that imply a large range of scales. In the context of this PhD, two areas of the flow are formally distinguished: the dense liquid core, called separated phases, and the polydisperse spray obtained after atomization. The main challenge consists in simulating the combination of these regimes at an acceptable computational cost. Direct Numerical Simulations, where all the scales need to be resolved, lead to a high computational cost for industrial applications. Therefore, modeling is necessary to develop a reduced-order model that can describe all regimes of the flow. This also requires major breakthroughs in terms of numerical methods and High Performance Computing (HPC). This PhD investigates Eulerian reduced-order models to describe the polydispersion in the disperse phase and the gas-liquid interface in the separated phases. First, we rely on the moment method to model the polydispersion in the downstream region of the flow. Then, we propose a new description of the interface using geometrical variables. These variables can provide complementary information on the interface geometry with respect to a two-fluid model to simulate the primary atomization. The major contribution of this work consists in using a unified set of variables to describe the two regions, disperse and separated phases. In the case of spherical droplets, we show that this new geometrical approach can degenerate to a moment model similar to the Eulerian Multi-Size Model (EMSM). However, the new model involves fractional moments, which require some specific treatments. This model has the same capacity to describe the polydispersion as the previous Eulerian moment models, the EMSM and the multi-fluid model, but it also enables a geometrical description of the interface [...]
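To make "fractional moments" concrete, a sketch of the idea (the notation may differ from the thesis): describing the spray by a number density function n(S) of droplet surface area S, the transported quantities are half-integer moments,

```latex
m_{k/2} \;=\; \int_0^{\infty} S^{k/2}\, n(S)\, \mathrm{d}S ,
\qquad k = 0, 1, 2, 3 .
```

For spherical droplets (S = 4πR²), these moments carry direct geometric meaning: m_0 is the droplet number density, m_1 the total interface-area density, and m_{3/2} is proportional to the liquid volume fraction, since S^{3/2} ∝ R³. It is the non-integer powers that call for the specific treatments mentioned above.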
267

Scalable Parallel Machine Learning on High Performance Computing Systems–Clustering and Reinforcement Learning

Weijian Zheng (14226626) 08 December 2022 (has links)
High-performance computing (HPC) and machine learning (ML) have been widely adopted by both academia and industry to address enormous data problems at extreme scales. While research has reported on the interactions of HPC and ML, achieving high performance and scalability for parallel and distributed ML algorithms is still a challenging task. This dissertation first summarizes the major challenges in applying HPC to ML applications: 1) poor performance and scalability, 2) loss of the convergence rate, 3) lower quality of the trained model, and 4) a lack of performance-optimization techniques designed for specific applications. This dissertation shows how to address these four challenges for two specific applications: 1) a clustering algorithm and 2) graph optimization algorithms that use reinforcement learning (RL).

As to the clustering algorithm, we first propose the simulated-annealing clustering algorithm. By combining a blocked data layout and asynchronous local optimization within each thread, the simulated-annealing enhanced clustering algorithm has a convergence rate comparable to the K-means algorithm but much higher performance. Experiments with synthetic and real-world datasets show that the simulated-annealing enhanced clustering algorithm is significantly faster than the MPI K-means library using up to 1024 cores. However, its optimization costs (Sum of Square Error (SSE)) became higher than the original costs. To tackle this problem, we devise a new algorithm called the full-step feel-the-way clustering algorithm, in which there are L local steps within each block of data points. We use the first local step's results to compute accurate global optimization costs. Our results show that the full-step algorithm can significantly reduce the global number of iterations needed to converge while obtaining low SSE costs. However, the time spent on the local steps exceeds the benefit of the saved iterations. To improve this, we next optimize the local-step time by incorporating a sampling-based method called reassignment-history-aware sampling. Extensive experiments with various synthetic and real-world datasets (e.g., MNIST, CIFAR-10, ENRON, and PLACES-2) show that our parallel algorithms can outperform the fastest open-source MPI K-means implementation by up to 110% on 4,096 CPU cores with comparable SSE costs.

Our evaluations of the sampling-based feel-the-way algorithm establish the effectiveness of the local optimization strategy, the blocked data layout, and the sampling methods for addressing the challenges of applying HPC to ML applications. To explore more parallel strategies and optimization techniques, we focus on a more complex application: graph optimization problems using reinforcement learning (RL). RL has proved successful in automatically learning good heuristics for graph optimization problems. However, existing RL systems either do not support graph RL environments or do not support multiple or many GPUs in a distributed setting. This has limited RL's ability to solve large-scale graph optimization problems, owing to the lack of parallelization and scalability. To address these challenges, we develop OpenGraphGym-MG, a high-performance distributed-GPU RL framework for solving graph optimization problems. OpenGraphGym-MG focuses on a class of computationally demanding RL problems in which both the RL environment and the policy model are highly computation-intensive. In this work, we distribute large-scale graphs across distributed GPUs and use spatial parallelism and data parallelism to achieve scalable performance. We compare and analyze the performance of spatial and data parallelism and highlight their differences. To support graph neural network (GNN) layers that take data samples partitioned across distributed GPUs as input, we design new parallel mathematical kernels to perform operations on distributed 3D sparse and 3D dense tensors. To handle costly RL environments, we design new parallel graph environments to scale up all RL-environment-related operations. By combining the scalable GNN layers with the scalable RL environment, we are able to develop high-performance OpenGraphGym-MG training and inference algorithms in parallel.

To summarize, after identifying the major challenges in applying HPC to ML applications, this thesis explores several parallel strategies and performance-optimization techniques using two ML applications. Specifically, we propose a local optimization strategy, a blocked data layout, and sampling methods for accelerating the clustering algorithm; and we create a spatial parallelism strategy, a parallel graph environment, agent, and policy model, an optimized replay buffer, and a multi-node selection strategy for solving large optimization problems over graphs. Our evaluations prove the effectiveness of these strategies and demonstrate that our accelerations can significantly outperform state-of-the-art ML libraries and frameworks without loss of quality in the trained models.
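As a minimal sketch of the blocked-layout idea behind the feel-the-way algorithms (not the dissertation's actual code): points are processed block by block, each block accumulates its own partial sums, and blocks are merged afterwards, which is what makes per-thread local optimization cheap. Block size and data below are illustrative:

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// One blocked k-means assignment-and-update pass, in 1-D for brevity.
// Each block accumulates partial sums/counts locally (the part a thread
// would own); blocks are merged at the end into new centroids.
int nearest(double x, const std::vector<double>& c) {
    int best = 0;
    for (size_t k = 1; k < c.size(); ++k)
        if (std::fabs(x - c[k]) < std::fabs(x - c[best])) best = (int)k;
    return best;
}

int main() {
    std::vector<double> pts = {0.1, 0.2, 0.15, 5.0, 5.2, 4.9, 9.8, 10.1};
    std::vector<double> centroids = {0.0, 5.0, 10.0};
    const size_t kBlock = 4;   // assumed block size (blocked data layout)

    std::vector<double> sum(centroids.size(), 0.0);
    std::vector<int> cnt(centroids.size(), 0);
    for (size_t b = 0; b < pts.size(); b += kBlock) {
        // Local step over one block: in the parallel algorithm each thread
        // optimizes its own block asynchronously before any global merge.
        for (size_t i = b; i < b + kBlock && i < pts.size(); ++i) {
            int k = nearest(pts[i], centroids);
            sum[k] += pts[i];
            cnt[k] += 1;
        }
    }
    for (size_t k = 0; k < centroids.size(); ++k)
        if (cnt[k] > 0) centroids[k] = sum[k] / cnt[k];
    for (double c : centroids) std::printf("centroid: %f\n", c);
    return 0;
}
```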
268

Understanding the interaction mechanisms between carbon nanotubes and a biological membrane: toxic effects and potential drug carriers

Kraszewski, Sebastian 17 September 2010 (has links) (PDF)
This thesis is a theoretical study of the mechanisms by which carbon-based nanostructures interact with cell membranes, the essential constituent of living cells. This highly complex, multidisciplinary subject was tackled essentially through numerical simulation. We deliberately split the work into two distinct parts. We first studied the operation of ion channels using molecular dynamics and ab initio calculations; these channels are membrane proteins essential to cell function, and they are also frequent therapeutic targets in the search for new drugs. In a second part, we studied the behaviour of bare and functionalised carbon species such as fullerenes (C60) and carbon nanotubes (CNTs) in the presence of the cell membrane, analysing in detail the uptake mechanism of these potential drug carriers by biological membranes. These molecular dynamics studies over very long time scales (sub-microsecond) and on very large systems were also a computational challenge; to cope with the time constraints of a thesis, high-performance parallel CPU/GPU computing had to be put in place. The results highlight the toxic role that some nanostructures may play with respect to the membrane proteins studied earlier. This work naturally opens the way to the study of biocompatible nanovectors for drug delivery.
269

Modeling, Simulation, and Injection of Camera Images/Video to Automotive Embedded ECU : Image Injection Solution for Hardware-in-the-Loop Testing

Lind, Anton January 2023 (has links)
Testing, verification, and validation of sensors, components, and systems is vital in the early-stage development of new cars with computer-in-the-car architecture. This can be done with the help of an existing technique, hardware-in-the-loop (HIL) testing, which, in the closed-loop testing case, consists of four main parts: a Real-Time Simulation Platform, a Sensor Simulation PC, an Interface Unit (IU), and the unit under test, for instance a Vehicle Computing Unit (VCU). The purpose of this degree project is to research and develop a proof of concept for in-house development of an image injection solution (IIS) on the IU in the HIL testing environment. A proof of concept could confirm that editing, customizing, and having full control of the IU is a possibility. This project was initiated by Volvo Cars to optimize the use of the currently available HIL testing environment, making the environment more changeable and controllable rather than the static system the IIS is today. The IU is an MPSoC/FPGA-based design that uses primarily Xilinx hardware and software (Vivado/Vitis) to achieve the necessary requirements for image injection in the HIL testing environment. It consists of three stages in series: input, image processing, and output. The whole project was divided into three parts based on these three stages and carried out at Volvo Cars by three students, one part each. The author of this thesis was responsible for the output stage, where the main goal was to find a solution for converting, preferably, AXI4 RAW12 image data into data in CSI2 format. This CSI2 data can then be used as input to serializers, which in turn transmit the data via fiber-optic cable in GMSL2 format to the VCU. For the output stage, extensive simulations and hardware tests have been done on a preliminary solution that partially worked on the hardware, producing signals in parts of the design that could be read and analyzed. However, a final solution that fully functions on the hardware has not been found, because the work is at the initial phase of an advanced and very complex project. Presented in this thesis are: important theory regarding, for example, the CSI2, AXI4, and GMSL2 protocols; appropriate hardware selection for an IIS in HIL (FPGA, MPSoC, FMC, etc.); simulations of AXI4 and CSI2 signals; comparisons of those simulations with the hardware signals of an implemented design; and more. The outcome was heavily dependent on getting a particular board (TEF0010) to transmit the GMSL2 data. Since the wrong card was provided, this was the main problem that hindered the thesis from reaching a fully functioning implementation. Nevertheless, these results provide a solid foundation for future work related to image injection in a HIL environment.
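For a flavour of what RAW12-to-CSI2 conversion involves at the byte level, here is a host-side sketch of the standard MIPI CSI-2 RAW12 packing, independent of the FPGA implementation discussed in the thesis: each pair of 12-bit pixels is packed into three bytes, two bytes of MSBs followed by one shared byte of LSBs.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Pack pairs of 12-bit pixels into a CSI-2 RAW12 byte stream:
//   byte0 = P0[11:4], byte1 = P1[11:4], byte2 = P1[3:0] << 4 | P0[3:0]
std::vector<uint8_t> pack_raw12(const std::vector<uint16_t>& px) {
    std::vector<uint8_t> out;
    for (size_t i = 0; i + 1 < px.size(); i += 2) {
        uint16_t p0 = px[i] & 0x0FFF, p1 = px[i + 1] & 0x0FFF;
        out.push_back(uint8_t(p0 >> 4));                        // P0 MSBs
        out.push_back(uint8_t(p1 >> 4));                        // P1 MSBs
        out.push_back(uint8_t(((p1 & 0xF) << 4) | (p0 & 0xF))); // shared LSBs
    }
    return out;
}

int main() {
    std::vector<uint16_t> pixels = {0xABC, 0x123};  // two sample 12-bit pixels
    for (uint8_t b : pack_raw12(pixels)) std::printf("%02X ", b);
    std::printf("\n");   // expected output: AB 12 3C
    return 0;
}
```

In the actual design this packing would be performed in RTL on the AXI4 stream before the serializer, but the byte layout shown here is what the CSI2 consumer expects.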
