191 |
Hardening strategies for HPC applications / Estratégias de enrobustecimento para aplicações PAD
Oliveira, Daniel Alfonso Gonçalves de, January 2017 (has links)
The reliability of HPC devices is one of the major concerns for supercomputers today and for the next generation. In fact, the high number of devices in large data centers makes the probability of having at least one corrupted device very high. In this work, we first evaluate the problem by performing radiation experiments. The data from the experiments give us a realistic error rate for HPC devices. Moreover, we evaluate a representative set of algorithms, deriving general insights into the reliability of parallel algorithms and programming approaches. To better understand the problem, we propose a novel methodology that goes beyond quantification. We qualify the error by evaluating the criticality of each corrupted execution through a dedicated set of metrics. We show that, as far as imprecise computing is concerned, simple mismatch detection is not sufficient to evaluate and compare the radiation sensitivity of HPC devices and algorithms. Our analysis quantifies and qualifies radiation effects on applications' output, correlating the number of corrupted elements with their spatial locality. We also provide the mean relative error (dataset-wise) to evaluate the magnitude of radiation-induced errors. Furthermore, we designed a fault injector, CAROL-FI, to investigate the problem further by collecting information through fault-injection campaigns that cannot be obtained through radiation experiments. We inject different fault models to analyze the sensitivity of given applications. We show that portions of applications can be graded by criticality, so mitigation techniques can then be relaxed or hardened based on the criticality of particular portions. This work also evaluates the reliability behavior of six different architectures, ranging from HPC devices to embedded ones, with the aim of isolating code- and architecture-dependent behaviors. For this evaluation, we present and discuss radiation experiments that cover a total of more than 352,000 years of natural exposure and fault-injection analysis based on a total of more than 120,000 injections. Finally, Error-Correcting Code (ECC), Algorithm-Based Fault Tolerance (ABFT), and Duplication With Comparison hardening strategies are presented and evaluated on HPC devices through radiation experiments. We present and compare both the reliability improvement and the overhead imposed by the selected hardening solutions. Then, we propose and analyze the impact of selective hardening for HPC algorithms. We perform fault-injection campaigns to identify the most critical source-code variables and show how to select the best candidates to maximize the reliability/overhead ratio.
192 |
Tolérance aux pannes dans des environnements de calcul parallèle et distribué : optimisation des stratégies de sauvegarde/reprise et ordonnancement / Fault tolerance in parallel and distributed environments: optimizing the checkpoint/restart strategy and scheduling
Bouguerra, Mohamed Slim, 02 April 2012 (has links)
The parallel computing platforms available today are increasingly large. Typically, emerging parallel platforms will be composed of several million CPU cores running up to a billion threads. This intense growth in the number of parallel threads will make applications subject to more and more failures. Consequently, it is necessary to develop efficient strategies providing safe and reliable completion for HPC parallel applications. Checkpointing is one of the most popular and efficient techniques for developing fault-tolerant applications in such a context.
However, checkpoint operations are costly in terms of time, computation, and network communications, which affects the global performance of the application. In the first part of this thesis, we propose a performance model that formally expresses the checkpoint scheduling problem. Two variants of the problem are considered. In the first variant, the objective is the minimization of the expected completion time. Under this model, we prove that when the failure rate and the checkpoint cost are constant, the optimal checkpoint strategy is necessarily periodic; for the general problem, where the failure rate and the checkpoint cost are arbitrary, we provide a numerical solution. In the second variant of the problem, which considers non-preemptive applications, we exhibit the trade-off between the cost of checkpoint operations and the computation lost due to failures. In particular, we prove that the checkpoint scheduling problem is NP-hard even in the simple case of a uniform failure distribution, and we present a dynamic programming scheme for determining the optimal checkpointing times in all variants of the problem. In the second part of this thesis, we design several fault-tolerant scheduling algorithms that minimize the application makespan and at the same time maximize the application reliability. We point out that the growth rate of the failure distribution determines the relationship between both objectives: when the failure rate is decreasing, the two objectives are antagonistic, whereas when the failure rate is increasing, they are congruent. Finally, we provide approximation algorithms for both failure-rate cases.
193 |
Client-side threats and a honeyclient-based defense mechanism, Honeyscout
Clementson, Christian, January 2009 (has links)
Client-side computers connected to the Internet today are exposed to a lot of malicious activity. Browsing the web can easily result in malware infection even if the user only visits well-known and trusted sites. Attackers use website vulnerabilities and ad networks to expose their malicious code to a large user base. The continuing trend among attackers seems to be botnet construction, collecting large amounts of data, which could be a serious threat to company secrets and personal integrity. Meanwhile, security researchers are using a technology known as honeypots/honeyclients to find and analyze new malware. This thesis takes the concept of honeyclients and combines it with proxy and database software to construct a new kind of real-time defense mechanism usable in live environments. The concept is given the name Honeyscout. It analyzes any content before it reaches the user by using visited sites as a starting point for further crawling, blacklisting any malicious content found. A proof-of-concept Honeyscout has been developed using the honeyclient Monkey-Spider by Ali Ikinci as a base. Results from the evaluation show that the concept has potential as an effective and user-friendly defense technology. There is, however, a large need to further optimize and speed up the crawling process.
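The core decision path of such a proxy can be sketched in a few lines: every requested host is checked against the blacklist that the crawler maintains before content is forwarded to the user. The blacklist representation and function names below are illustrative assumptions; the actual Honeyscout implementation builds on Monkey-Spider and a database backend.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch of the proxy's filtering decision: block a
 * request if its host appears on the crawler-maintained blacklist. */
static const char *blacklist[] = { "malware.example", "badads.example" };

static int is_blacklisted(const char *host)
{
    for (size_t i = 0; i < sizeof(blacklist) / sizeof(blacklist[0]); i++)
        if (strcmp(host, blacklist[i]) == 0)
            return 1;
    return 0;
}

int main(void)
{
    const char *requests[] = { "news.example", "malware.example" };
    for (size_t i = 0; i < 2; i++) {
        if (is_blacklisted(requests[i]))
            printf("BLOCK   %s\n", requests[i]);
        else
            printf("FORWARD %s (and queue it as a crawl seed)\n",
                   requests[i]);
    }
    return 0;
}
```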
194 |
Développements HPC pour une nouvelle méthode de docking inverse : applications aux protéines matricielles / HPC developments for a new inverse docking method and matrix protein applications
Vasseur, Romain, 29 January 2015 (has links)
This work presents the methodological and software development of a so-called inverse molecular docking method. Through an in-house program, AMIDE (Automatic Inverse Docking Engine), it distributes large numbers of molecular docking simulations on HPC architectures (computing clusters) with the AutoDock 4.2 and AutoDock Vina applications. The principle of the method is to test small molecules against a set of potential target proteins. The program's optimal parameters were defined in a pilot study, and the protocol was validated on ligands and peptides binding the MMP and EBP extracellular matrix proteins. The method improves the conformational search in docking computations on experimental structures compared to existing protocols (blind docking). It is shown that the AMIDE program discriminates preferred binding sites in inverse protein-screening experiments more efficiently than blind docking. These results are obtained through search-space partitioning methods that also make it possible, through a hybrid distribution system, to deploy a set of independent, embarrassingly parallel tasks that scale well.
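An embarrassingly parallel docking campaign of this kind can be distributed with a simple static round-robin scheme, as in the hedged MPI sketch below. The receptor/ligand file names and the vina command line are placeholders; AMIDE's actual hybrid distribution system is more elaborate.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch: distribute (receptor, ligand) docking tasks
 * round-robin over MPI ranks; each task shells out to AutoDock Vina.
 * File names and flags are illustrative placeholders. */
#define N_RECEPTORS 8
#define N_LIGANDS   4

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int n_tasks = N_RECEPTORS * N_LIGANDS;
    for (int t = rank; t < n_tasks; t += size) {   /* static round-robin */
        int r = t / N_LIGANDS, l = t % N_LIGANDS;
        char cmd[256];
        snprintf(cmd, sizeof(cmd),
                 "vina --receptor rec%02d.pdbqt --ligand lig%02d.pdbqt "
                 "--out out_r%02d_l%02d.pdbqt", r, l, r, l);
        if (system(cmd) != 0)
            fprintf(stderr, "rank %d: task %d failed\n", rank, t);
    }
    MPI_Finalize();
    return 0;
}
```

Because each docking task is independent, no communication is needed between ranks until results are gathered, which is what makes the workload scale so well.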
195 |
Efektivní komunikace v multi-GPU systémech / Efficient Communication in Multi-GPU Systems
Špeťko, Matej, January 2018 (has links)
After the introduction of CUDA by Nvidia, GPUs became devices capable of accelerating any general-purpose computation. GPUs are designed as parallel processors with huge computational power, and modern supercomputers are often equipped with GPU accelerators. Sometimes the performance of a single GPU is not enough for a scientific application, and it needs to scale over multiple GPUs. During the computation the GPUs need to exchange partial results; this communication represents overhead, so it is important to research methods of effective communication between GPUs, meaning less CPU involvement, lower latency, and shared system buffers. This thesis is focused on inter-node and intra-node GPU-to-GPU communication using Nvidia GPUDirect technologies and CUDA-aware MPI. Subsequently, the k-Wave toolbox for simulating the propagation of acoustic waves is introduced. This application is accelerated using CUDA-aware MPI, and peer-to-peer transfer support is integrated into k-Wave using CUDA Inter-Process Communication.
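With a CUDA-aware MPI library, device pointers can be passed directly to MPI calls, so staging through host memory is avoided (and GPUDirect can be used underneath when available). A minimal sketch, assuming an MPI build with CUDA support and one GPU per rank:

```c
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

/* Minimal CUDA-aware MPI sketch: rank 0 sends a device buffer
 * directly to rank 1 without copying it to the host first. */
int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 1 << 20;
    float *d_buf;
    cudaMalloc(&d_buf, n * sizeof(float));   /* device memory */

    if (rank == 0) {
        cudaMemset(d_buf, 0, n * sizeof(float));
        MPI_Send(d_buf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD); /* device ptr */
    } else if (rank == 1) {
        MPI_Recv(d_buf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d floats into device memory\n", n);
    }

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}
```

Without CUDA-aware MPI, the same transfer would require an explicit cudaMemcpy to a host buffer on the sender and back to the device on the receiver, which is exactly the overhead the thesis aims to remove.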
196 |
Řešení pro clusterování serverů / Server clustering techniques
Čech, Martin, January 2009 (has links)
This work presents an analysis of Open Source Software (hereafter OSS) that makes it possible to create and use computer clusters. It explores the issues of clustering and cluster construction. All installation, configuration, and cluster management were done on the GNU/Linux operating system. The presented OSS makes it possible to assemble a storage cluster, a load-balancing cluster, a high-availability cluster, and a computing cluster. Different types of benchmarks were theoretically analyzed and practically used for measuring cluster performance, and the results were compared with others, e.g. the TOP500 list of the best clusters available online. The practical part of the work deals with comparing the performance of computing clusters. A cluster with several tens of computational nodes was established, on which the OpenMPI package was installed to allow parallelization of computations. Subsequently, tests were performed with High-Performance Linpack, which measures total performance by solving systems of linear equations. The influence of parallelization on the PEA algorithm was also tested. To demonstrate practical usability, the cluster was tested with John the Ripper, a program for cracking user passwords. The work includes a number of graphs clarifying the setup and, above all, showing the achieved results.
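Cluster performance figures of the Linpack kind boil down to timing a known amount of floating-point work across all ranks. Below is a hedged sketch of that measurement pattern, not HPL itself, whose solver and problem sizes are far more involved:

```c
#include <mpi.h>
#include <stdio.h>

/* Illustrative sketch: each rank times a fixed amount of floating-point
 * work; total FLOPs and the slowest rank's time are combined into an
 * aggregate GFLOP/s figure, mimicking how cluster benchmarks report. */
int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long n = 100000000L;          /* 2 FLOPs per iteration below */
    volatile double x = 1.0;
    double t0 = MPI_Wtime();
    for (long i = 0; i < n; i++)
        x = x * 1.0000001 + 1e-9;
    double local_t = MPI_Wtime() - t0;

    double max_t, total_flops = 2.0 * (double)n * size;
    MPI_Reduce(&local_t, &max_t, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("aggregate: %.2f GFLOP/s over %d ranks\n",
               total_flops / max_t / 1e9, size);
    MPI_Finalize();
    return 0;
}
```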
197 |
Enhancing an InfiniBand driver by utilizing an efficient malloc/free library supporting multiple page sizes
Rex, Robert, 18 September 2006 (has links)
Despite the use of high-speed network interconnects like InfiniBand, the communication overhead for parallel applications, especially in the area of High-Performance Computing (HPC), is still high. Using large page frames, so-called hugepages in Linux, can improve the crucial work of registering communication buffers with the network adapter; to this end, an InfiniBand driver was modified. Hugepages do not only reduce communication costs but can also improve computation time in a perceptible manner, e.g. through fewer TLB misses. To avoid the expense of rewriting applications, a preload library was implemented that is able to utilize large page frames transparently. This work also shows benchmark results with these components, with performance improvements of up to 10 %.
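The mechanism the preload library relies on can be shown in miniature: back an allocation with hugepages via mmap and fall back to normal pages when none are available. In the real library this logic sits behind interposed malloc/free symbols loaded with LD_PRELOAD; the sketch below only demonstrates the allocation path.

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <stdio.h>

/* Hedged sketch: allocate a buffer backed by hugepages (MAP_HUGETLB),
 * falling back to normal pages if the hugepage pool is empty. A real
 * preload library would wrap malloc/free and hand such buffers out
 * transparently via LD_PRELOAD. */
static void *alloc_maybe_huge(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED)          /* no hugepages reserved: fall back */
        p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

int main(void)
{
    size_t len = 2 * 1024 * 1024;          /* one 2 MiB hugepage */
    void *buf = alloc_maybe_huge(len);
    printf("buffer at %p (%s)\n", buf, buf ? "ok" : "failed");
    if (buf)
        munmap(buf, len);
    return 0;
}
```

A buffer obtained this way covers more address space per TLB entry and, once registered with the InfiniBand adapter, requires fewer pinned-page table entries, which is where both of the reported savings come from.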
198 |
Execution of SPE code in an Opteron-Cell/B.E. hybrid system
Heinig, Andreas, 11 March 2008 (has links)
It is of great research interest to integrate the Cell/B.E. processor into an AMD Opteron system. The result is a system benefiting from the advantages of both processors: the high computational power of the Cell/B.E. and the high I/O throughput of the Opteron.
The task of this diploma thesis is to make it possible for Cell-SPU code initially residing on the Opteron to be executed on the Cell under the GNU/Linux operating system. The SPUFS (Synergistic Processing Unit File System), provided by STI (Sony, Toshiba, IBM), does exactly this on the Cell itself, the Cell being a combination of a PowerPC core and Synergistic Processing Elements (SPEs). The main work is to analyze SPUFS and migrate it to the Opteron system.
The result of the migration is a project called RSPUFS (Remote Synergistic Processing Unit File System), which provides nearly the same interface as SPUFS does on the Cell side. The differences are caused by the TCP/IP link between Opteron and Cell, where no Remote Direct Memory Access (RDMA) is available, so it is not possible to write synchronously to the local store of the SPEs; synchronization occurs implicitly before the Cell-SPU code is executed. But not only the semantics have changed: to access the XDR memory, RSPUFS extends SPUFS with a special XDR interface, through which the application can map the XDR into its local address space. The application must take care of synchronization with an explicit call of the provided xdr_sync routine. Another difference is that RSPUFS does not support the gang principle of SPUFS, which is necessary to set the affinity between SPEs.
This thesis deals not only with the operating-system part but also with a library called libspe, which provides a wrapper around the SPUFS system calls. It is essential to port this library to the Opteron, because most Cell applications use it. Libspe is not only a wrapper; it also saves the developer a lot of work, such as loading the Cell-SPU code or managing the context and the system calls initiated by the SPE.
The result of the work is that an application can link against the modified libspe on the Opteron, gaining direct access to the Synergistic Processor Elements.
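For orientation, this is roughly what a host program using the libspe2-style API looks like; an application linked against the ported library on the Opteron would follow the same pattern. The embedded program handle name is a placeholder, and the exact API version used by the thesis may differ:

```c
#include <libspe2.h>
#include <stdio.h>

/* Hedged sketch of the usual libspe2 calling sequence. The SPE program
 * handle 'spu_kernel' is a placeholder for code embedded at link time. */
extern spe_program_handle_t spu_kernel;

int main(void)
{
    spe_context_ptr_t ctx = spe_context_create(0, NULL);
    if (!ctx || spe_program_load(ctx, &spu_kernel) != 0) {
        fprintf(stderr, "failed to create context or load SPE code\n");
        return 1;
    }

    unsigned int entry = SPE_DEFAULT_ENTRY;
    spe_stop_info_t stop;
    /* Runs the SPE program to completion; with RSPUFS the same call
     * would be served remotely over the TCP/IP link. */
    if (spe_context_run(ctx, &entry, 0, NULL, NULL, &stop) < 0)
        fprintf(stderr, "SPE execution failed\n");

    spe_context_destroy(ctx);
    return 0;
}
```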
199 |
Concepts and Prototype for a Collective Offload Unit
Schneider, Timo; Eckelmann, Sven, 15 December 2011 (has links)
Optimized implementations of blocking and nonblocking collective operations are crucial for scalable high-performance applications. Offloading such collective operations into the communication layer can improve performance and the asynchronous progression of the operations. However, it is most important that such offloading schemes remain flexible in order to support user-defined (sparse neighbor) collective communications. In this work we propose a design for a collective offload unit.
Our hardware design is able to execute dependency-graph-based representations of collective functions. To cope with the scarcity of memory resources, we designed a new point-to-point messaging protocol that does not need to store information about unexpected messages. The offload unit proposed in this thesis could be integrated into high-performance networks such as EXTOLL. Our design achieves a clock frequency of 212 MHz on a Xilinx Virtex6 FPGA, while using less than 10% of the available logic slices and less than 30% of the available memory blocks. Due to the specialization of our design, we can accelerate important tasks of the message-passing framework, such as message matching, by a factor of two compared to a software implementation running on a CPU with a ten times higher clock speed. (A sketch of the dependency-graph execution model follows the table of contents below.)
1. Task Description
1.1. Theses
2. Introduction
2.1. Motivation
2.2. Outline of this Thesis
2.3. Related Work
2.3.1. NIC Based Packet Forwarding
2.3.2. Hardware Barrier Implementations
2.3.3. ConnectX2 CORE-Direct Collective Offload Support
2.3.4. Collective Offload Support in the Portals 4 API
2.4. Group Operation Assembly Language
2.4.1. GOAL API
2.4.2. Scratchpad Buffer
2.4.3. Schedule Execution
2.5. The EXTOLL Network
2.6. Field Programmable Gate Arrays
3. Dealing with Constrained Resources
3.1. Hardware Limitations
3.2. Common Collective Functions in GOAL
3.3. Schedule Representation for the Hardware GOAL Interpreter
3.4. Executing Large Schedules using a small amount of Memory
3.4.1. Limits of Previously Suggested Approaches
3.4.2. Testing for Deadlocks in Schedules
3.4.3. Transforming Process Local Schedules into Global Schedules
3.4.4. Predetermined Buffer Locations
3.5. Queueing Active Operations in Hardware
3.6. Designing a Low-Memory-Footprint Point to Point Protocol
3.6.1. Arrival Times
3.6.2. Eager Protocol
3.6.3. Rendezvous Protocol
3.6.4. A Protocol without an Unexpected Queue
3.7. Protocol Verification
3.7.1. Capabilities of the Model Checker SPIN
3.7.2. Modeling the Protocol
3.7.3. Limitations of the Basic Protocol
4. The Matching Problem
4.1. Matching on the Host CPU
4.2. Implementation Methodology
4.3. Matching Unit Interface
4.4. Matching Unit Implementation
4.4.1. Slot Management Unit
4.4.2. The Input Consumer
4.4.3. The Output Generator
4.4.4. The Matching Unit
4.5. Slot Management Unit for Non-synchronous Transfers
5. The GOAL Interpreter
5.1. Schedule Interpreter Design
5.1.1. The Active Queue
5.1.2. The Dependency Resolver
5.2. Transceiver Interface
5.3. The Starter
5.3.1. Starting Operations
5.3.2. Processing Incoming Packets
5.3.3. Incoming Non-synchronous Packets
5.3.4. Presorting the Active Queue
5.3.5. Arbitration Units
5.3.6. IN-Filter
5.3.7. Outcommand Manager
5.3.8. Non-synchronous Protocol
5.3.9. Send Protocol
5.3.10. Receive Protocol
5.3.11. Local Operations on FPGA
6 Evaluation
6.1. Performance Analysis
6.2. Future Work
6.3. Conclusions
Bibliography
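As referenced above, a GOAL-style schedule is a dependency graph of send/receive/local operations: an operation becomes runnable once all operations it depends on have completed. Below is a hedged, software-only sketch of that execution model; the hardware interpreter works on a binary schedule representation, not C structs.

```c
#include <stdio.h>

/* Hedged sketch of dependency-graph schedule execution in the GOAL
 * spirit: each op carries a counter of unmet dependencies and a list
 * of dependents; completing an op releases its dependents. */
enum op_type { OP_SEND, OP_RECV, OP_LOCAL };

struct op {
    enum op_type type;
    int unmet;                  /* dependencies not yet satisfied */
    int dependents[4], n_dep;   /* ops waiting on this one */
};

static void run_schedule(struct op *ops, int n)
{
    int ready[16], n_ready = 0;
    for (int i = 0; i < n; i++)
        if (ops[i].unmet == 0)
            ready[n_ready++] = i;

    while (n_ready > 0) {
        int cur = ready[--n_ready];
        printf("executing op %d (type %d)\n", cur, ops[cur].type);
        for (int d = 0; d < ops[cur].n_dep; d++)  /* release dependents */
            if (--ops[ops[cur].dependents[d]].unmet == 0)
                ready[n_ready++] = ops[cur].dependents[d];
    }
}

int main(void)
{
    /* A two-op chain (recv releases send) plus an independent local op. */
    struct op ops[3] = {
        { OP_RECV,  0, {1}, 1 },   /* op 0 releases op 1 */
        { OP_SEND,  1, {0}, 0 },   /* waits for op 0 */
        { OP_LOCAL, 0, {0}, 0 },   /* independent */
    };
    run_schedule(ops, 3);
    return 0;
}
```

This counter-based scheme is what lets nonblocking and sparse-neighbor collectives progress asynchronously: the interpreter only ever looks at ops whose dependencies are already met.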
200 |
Numerical simulations of shock wave-boundary layer interactions / Simulations numériques de l'interaction onde de choc couche limite
Ben Hassan Saïdi, Ismaïl, 04 November 2019 (has links)
Situations where an incident shock wave impinges upon a boundary layer are common in the aeronautical and space industries. Under certain circumstances (high Mach number, large shock angle, ...), the interaction between an incident shock wave and a boundary layer may create an unsteady separation bubble. This bubble, as well as the subsequent reflected shock wave, is known to oscillate in a low-frequency streamwise motion. This phenomenon, called the unsteadiness of the shock wave/boundary layer interaction (SWBLI), subjects structures to oscillating loads that can damage their integrity. The aim of the present work is the unsteady numerical simulation of SWBLI, in order to contribute to a better understanding of the SWBLI unsteadiness and of the physical mechanism causing these low-frequency oscillations of the interaction zone.
To perform this study, an original numerical approach is used. The one-step Finite Volume approach relies on the discretization of the convective fluxes of the Navier-Stokes equations using the OSMP scheme, developed up to 7th order both in space and time, the viscous fluxes being discretized using a standard centered finite-difference scheme. A Monotonicity-Preserving (MP) constraint is employed as the shock-capturing procedure. The validation of this approach demonstrates its accuracy in predicting turbulent features and the great efficiency of the MP procedure in capturing discontinuities without spoiling the solution, at an almost negligible additional cost. It is also shown that using the highest tested order of the OSMP scheme is the best compromise between accuracy and simulation time, and that an order of accuracy higher than 2nd order for approximating the viscous fluxes has a negligible influence on the solution at the relatively high Reynolds numbers considered.
By simulating the 3D unsteady interaction between a laminar boundary layer and an incident shock wave, we suppress the suspected influence of the large turbulent structures of the boundary layer on the SWBLI unsteadiness, the only remaining suspected cause being the dynamics of the separation bubble. Results show that only the reattachment point oscillates at low frequencies characteristic of the breathing of the separation bubble; the separation point and the foot of the reflected shock wave remain at a fixed location along the flat plate. In this configuration, the SWBLI unsteadiness is therefore not observed.
In order to reproduce and analyze the SWBLI unsteadiness, the simulation of a shock wave/turbulent boundary layer interaction (SWTBLI) is performed. A Synthetic Eddy Method (SEM), adapted to compressible flows, has been developed and used at the inlet of the simulation domain to initiate the turbulent boundary layer without prohibitive additional computational cost. Analyses of the results are performed using, among other techniques, snapshot Proper Orthogonal Decomposition (POD). For this simulation, the SWBLI unsteadiness has been observed. Results suggest that the dominant flapping mode of the recirculation bubble occurs at medium frequency. These cycles of successive enlargement and shrinkage of the separated zone are shown to be irregular in time, the maximum size of the recirculation bubble differing between successive cycles. This behavior of the separation bubble is responsible for a low-frequency temporal modulation of the amplitude of the separation- and reattachment-point motions, and thus for the low-frequency breathing of the separation bubble. These results suggest that the SWBLI unsteadiness is related to this low-frequency dynamics of the recirculation bubble, the oscillations of the reflected shock's foot being in phase with the motion of the separation point.