Global ETD Search

181	Une méthode multidomaine parallèle pour les écoulements incompressibles en géométries cylindriques : applications aux écoulements turbulents soumis à la rotation / A parallelized multidomain compact solver for incompressible turbulent flows in cylindrical geometries : application to the simulation of turbulent rotating flows Oguic, Romain 19 October 2015 (has links) Ce travail concerne l’étude d’écoulements incompressibles soumis à la rotation avec un solveur haute précision dans des géométries semi-complexes. La technique numérique mise en œuvre combine des schémas compacts, une méthode de projection multi domaine directe et un traitement efficace de la singularité à l’axe basé sur des conditions de parité dans l’espace de Fourier. Le solveur a été parallélisé avec une approche hybride MPI-OpenMP pour réduire les temps de calcul. Dans un premier temps, les précisions spatiales et temporelles de la méthode numérique et la scalabilité du solveur ont été vérifiées. La capacité du solveur à traiter des écoulements plus complexes a été évaluée en considérant des écoulements de type éclatement tourbillonnaire et un écoulement turbulent en conduite cylindrique. Dans un second temps, plusieurs écoulements typiques des machines tournantes ont été étudiés. Le premier écoulement est un écoulement turbulent incompressible isotherme dans un étage simplifié d’un compresseur haute pression d’une turbine à gaz. Les simulations menées ont mises en évidence l’effet de la rotation sur l’écoulement, notamment sur les instabilités se développant le long des parois et sur les différentes structures cohérentes. Le second cas traité est un écoulement turbulent de jet impactant un disque en rotation avec un fort confinement et transfert thermique. Une attention particulière a été portée sur les champs hydrodynamiques et thermique le long du rotor. 
Enfin, une étude préliminaire d’un jet turbulent impactant un disque fixe d’épaisseur non nulle dans une configuration moins contrainte avec prise en compte du couplage conduction-convection a été réalisée. / This work studies rotating incompressible flows in semi-complex geometries with a highly accurate solver. The numerical method combines compact schemes, a direct multidomain projection method and an efficient treatment of the axis singularity based on parity conditions in Fourier space; the singularity itself arises from the use of cylindrical coordinates. To reduce the computation time, the solver was parallelized with a hybrid MPI-OpenMP approach. First, the spatial and temporal accuracy of the numerical method and the scalability of the solver were checked. Then, the capability of the solver to handle more complex flows was verified on vortex breakdown flows and a turbulent pipe flow. In a second step, flows typical of turbomachinery and rotating systems were considered. The first was an incompressible isothermal turbulent flow in a simplified stage of a high-pressure compressor of a gas turbine. The simulations highlighted the effects of rotation on the flow, especially on the instabilities developing along the walls and on the coherent structures. The second was a turbulent jet impinging on a rotating disk, with heat transfer, in a small-aspect-ratio cavity. The hydrodynamic fields and the heat transfer near the rotor were analyzed in detail. Finally, a preliminary investigation was carried out of a jet impinging on a non-rotating disk in a larger-aspect-ratio cavity, accounting for the coupling between conduction and convection. Singularité à l’axe Méthode de projection multidomaine Parallélisation hybride Ecoulement soumis à la rotation Turbulence Transfert thermique Axis singularity Projection decomposition method Hybrid parallelization Rotating flows Turbulence Heat transfer
182	Parallelization of Animation Blending on the PlayStation®3 / Parallellisering av Animationssystem på PlayStation®3 Jakobsson, Teodor January 2012 An animation system gives a dynamic and lifelike feel to character motion, allowing behaviour that far transcends the mere spatial translations of classic computer games. This increase in behavioural complexity does not come for free, however: animation systems often carry considerable performance overhead, the extent of which reflects the complexity of the desired system. In game development performance optimization is key, and its pursuit is aided by the static hardware configuration of modern gaming consoles, which allows extensive optimization by specializing the application, in whole or in part, to the underlying hardware architecture. This master's thesis proposes a method that efficiently utilizes the parallel architecture of the PlayStation®3 to migrate animation evaluation and blending from a single-threaded implementation on the main processor to a fully parallelized multi-threaded solution on the associated coprocessors. The method is complemented by an in-depth study of the underlying theoretical foundations, as well as a reflection on similar work and approaches used by other contemporary game development companies. parallelization character animation blending PlayStation Cell Broadband Engine Architecture CBEA Parallellisering karaktär animering blendning PlayStation Cell Broadband Engine Architecture CBEA Computer Sciences Datavetenskap (datalogi)
183	Determination of end user power load profiles by parallel evolutionary computing / Détermination de profils de consommation électrique par évolution artificielle parallèle Krüger, Frédéric 17 February 2014 (has links) Il est primordial, pour un distributeur d’énergie électrique, d’obtenir des estimations précises de la demande en énergie de leurs réseaux. Des outils statistiques tels que des profils de consommation électrique offrent des estimations de qualité acceptable. Ces profils ne sont cependant généralement pas assez précis, car ils ne tiennent pas compte de l’influence de facteurs tels que la présence de chauffage électrique ou le type d’habitation. Il est néanmoins possible d’obtenir des profils précis en utilisant uniquement les historiques de consommations d’énergie des clients, les mesures desdéparts 20kV, et un algorithme génétique de séparation de sources. Un filtrage et un prétraitement des données a permis de proposer à l’algorithme génétique de séparation de sources des données adaptées. La séparation de sources particulièrement bruitées est résolue par un algorithme génétique complètement parallélisé sur une carte GPGPU. Les profils de consommation électrique obtenus correspondent aux attentes initiales, et démontrent une amélioration considérable de la précision des estimations de courbes de charge de départs 20kV et de postes de transformation moyenne tension-basse tension. / Precise estimations of the energy demand of a power network are paramount for electrical distribution companies. Statistical tools such as load profiles offer acceptable estimations. These load profiles are, however, usually not precise enough for network engineering at the local level, as they do not take into account factors such as the presence of electrical heating devices or the type of housing. 
It is, however, possible to obtain accurate load profiles using only end-user energy consumption histories, 20 kV feeder load measurements, and a blind source separation performed by a genetic algorithm. Filtering and preprocessing the data provided the blind source separation with suitable input. The blind source separation presented in this document is successfully solved by a completely parallel genetic algorithm running on a GPGPU card. The power load profiles obtained match the requirements, and demonstrate a considerable improvement in the forecast of 20 kV feeder as well as MV substation load curves. Evolution artificielle Profils électrique Calcul inverse Séparation aveugle de source Parallélisation Prévision de charge Optimisation Parallel evolutionary computing Electrical profile Blind source separation Parallelization Load curve forecast 006.3
184	Efficient search-based strategies for polyhedral compilation : algorithms and experience in a production compiler / Stratégies exploratoires efficaces pour la compilation polyédrique : algorithmes et expérience dans un compilateur de production Trifunovic, Konrad 04 July 2011 (has links) Une pression accrue s'exerce sur les compilateurs pour mettre en œuvre des transformations de programmes de plus en plus complexes délivrant le potentiel de performance des processeurs multicœurs et des accélérateurs hétérogènes. L'espace de recherche des optimisations de programmes possibles est gigantesque est manque de structure. La recherche de la meilleure transformation, qui inclut la prédiction des gains estimés de performance offerts par cette transformation, constitue le problème le plus difficiles pour les compilateurs optimisants modernes. Nous avons choisi de nous concentrer sur les transformations de boucles et sur leur automatisation, exprimées dans le modèle polyédrique. Les méthodes d'optimisation de programmes dans le modèle polyédrique se répartissent grossièrement en deux classes. La première repose sur l'optimisation linéaire d'une fonction de analytique de coût. La deuxième classe de méthodes met en œuvre une recherche itérative. La première approche est rapide, mais elle est facilement mise en défaut en ce qui concerne la découverte de la solution optimale. L'approche itérative est plus précise, mais le temps de compilation peut devenir prohibitif. Cette thèse contribue une approche nouvelle de la recherche itérative de transformations de programmes dans le modèle polyédrique. La nouvelle méthode proposée possède la précision et la capacité effective à extraire des transformations profitables des méthodes itératives, tout en en minimisant les faiblesses. Notre approche repose sur l'évaluation systématique d'une fonction de coût et de prédiction de performances non-linéaire. 
Par ailleurs, la parallélisation automatique dans le modèle polyédrique est actuellement dominée par des outils de compilation source-à-source. Nous avons choisi au contraire d'implémenter nos techniques dans la plateforme GCC, en opérant sur une représentation de code de bas niveau, à trois adresses. Nous montrons que le niveau d'abstraction de la représentation intermédiaire choisie engendre des difficultés de passage à l'échelle, et nous montrons comment les surmonter. À l'inverse, nous montrons qu'une représentation intermédiaire de bas niveau ouvre de nouveaux degrés de liberté, bénéficiant à notre stratégie itérative de recherche de transformations, et à la compilation polyédrique de manière générale. / To exploit the performance of current multicore and heterogeneous architectures, compilers must perform increasingly complex program transformations. The search space of possible program optimizations is huge and unstructured. Selecting the best transformation, and predicting the performance benefit of that transformation, is the major problem in today's optimizing compilers. A promising approach is to focus on automatic loop optimizations expressed in the polyhedral model. Current approaches for optimizing programs in the polyhedral model broadly fall into two classes. The first is based on linear optimization of an analytical cost function. The second is based on exhaustive iterative search. While the first approach is fast, it can easily miss the optimal solution. The iterative approach is more precise, but its running time may be prohibitive. In this thesis we present a novel search-based approach to program transformations in the polyhedral model. The new method combines the benefits of the current approaches, effectiveness and precision, while trying to minimize their drawbacks. 
Our approach is based on systematically evaluating a precise, nonlinear cost function that predicts performance. Current practice is to use the polyhedral model in source-to-source compilers. We have instead implemented our techniques in GCC, operating on a low-level three-address code representation. We show that the chosen level of abstraction for the intermediate representation poses scalability challenges, and we show how to overcome them. On the other hand, we show that a low-level IR opens new degrees of freedom that benefit search-based transformation strategies and polyhedral compilation in general. Compilateurs Langages de programmation Modèle polyédrique Transformations de programmes Transformations de boucles La parallélisation automatique Représentation intermédiaire Compilers Programming languages Polyhedral model Program transformations Loop transformations Automatic parallelization Intermediate representation
185	Adaptation de l’algorithmique aux architectures parallèles / Adapting algorithms to parallel architectures Borghi, Alexandre 10 October 2011 (has links) Dans cette thèse, nous nous intéressons à l'adaptation de l'algorithmique aux architectures parallèles. Les plateformes hautes performances actuelles disposent de plusieurs niveaux de parallélisme et requièrent un travail considérable pour en tirer parti. Les superordinateurs possèdent de plus en plus d'unités de calcul et sont de plus en plus hétérogènes et hiérarchiques, ce qui complexifie d'autant plus leur utilisation.Nous nous sommes intéressés ici à plusieurs aspects permettant de tirer parti des architectures parallèles modernes. Tout au long de cette thèse, plusieurs problèmes de natures différentes sont abordés, de manière plus théorique ou plus pratique selon le cadre et l'échelle des plateformes parallèles envisagées.Nous avons travaillé sur la modélisation de problèmes dans le but d'adapter leur formulation à des solveurs existants ou des méthodes de résolution existantes, en particulier dans le cadre du problème de la factorisation en nombres premiers modélisé et résolu à l'aide d'outils de programmation linéaire en nombres entiers.La contribution la plus importante de cette thèse correspond à la conception d'algorithmes pensés dès le départ pour être performants sur les architectures modernes (processeurs multi-coeurs, Cell, GPU). Deux algorithmes pour résoudre le problème du compressive sensing ont été conçus dans ce cadre : le premier repose sur la programmation linéaire et permet d'obtenir une solution exacte, alors que le second utilise des méthodes de programmation convexe et permet d'obtenir une solution approchée.Nous avons aussi utilisé une bibliothèque de parallélisation de haut niveau utilisant le modèle BSP dans le cadre de la vérification de modèles pour implémenter de manière parallèle un algorithme existant. 
A partir d'une unique implémentation, cet outil rend possible l'utilisation de l'algorithme sur des plateformes disposant de différents niveaux de parallélisme, tout en ayant des performances de premier ordre sur chacune d'entre elles. En l'occurrence, la plateforme de plus grande échelle considérée ici est le cluster de machines multiprocesseurs multi-coeurs. De plus, dans le cadre très particulier du processeur Cell, une implémentation a été réécrite à partir de zéro pour tirer parti de celle-ci. / In this thesis, we are interested in adapting algorithms to parallel architectures. Current high-performance platforms have several levels of parallelism and require a significant amount of work to make the most of them. Supercomputers have more and more computational units and are increasingly heterogeneous and hierarchical, which makes them harder to use. We examine several aspects of exploiting modern parallel architectures. Throughout this thesis, several problems of different kinds are tackled, more theoretically or more practically depending on the context and the scale of the parallel platforms considered. We have worked on modeling problems in order to adapt their formulation to existing solvers or solution methods, in particular for the integer factorization problem, which is modeled and solved with integer programming tools. The main contribution of this thesis is the design of algorithms conceived from the outset to be efficient on modern architectures (multi-core processors, Cell, GPU). 
Two algorithms that solve the compressive sensing problem have been designed in this context: the first uses linear programming and finds an exact solution, whereas the second uses convex programming and finds an approximate solution. We have also used a high-level parallelization library based on the BSP model, in the context of model checking, to implement an existing algorithm in parallel. From a single implementation, this tool makes it possible to run the algorithm on platforms with different levels of parallelism while obtaining first-rate performance on each of them. In our case, the largest-scale platform considered is a cluster of multi-core multiprocessor machines. Moreover, in the very particular case of the Cell processor, an implementation was rewritten from scratch to take advantage of it. Parallélisation Vectorisation Architectures parallèles Multi-coeur GPU Cell Programmation linéaire Programmation convexe Parallelization Vectorization Parallel architectures Multi-core GPU Cell Linear programming Convex programming
186	Representação Nó-profundidade em FPGA para algoritmos evolutivos aplicados ao projeto de redes de larga-escala / Node-depth representation in FPGA for evolutionary algorithms applied to network design problems of large-scale Marcilyanne Moreira Gois 26 October 2011 (has links) Diversos problemas do mundo real estão relacionados ao projeto de redes, tais como projeto de circuitos de energia elétrica, roteamento de veículos, planejamento de redes de telecomunicações e reconstrução filogenética. Em geral, esses problemas podem ser modelados por meio de grafos, que manipulam milhares ou milhões de nós (correspondendo às variáveis de entrada), dificultando a obtenção de soluções em tempo real. O Projeto de uma Rede é um problema combinatório, em que se busca encontrar a rede mais adequada segundo um critério como, por exemplo, menor custo, menor caminho e tempo de percurso. A solução desses problemas é, em geral, computacionalmente complexa. Nesse sentido, metaheurísticas como Algoritmos Evolutivos têm sido amplamente investigadas. Diversas pesquisas mostram que o desempenho de Algoritmos Evolutivos para Problemas de Projetos de Redes pode ser aumentado significativamente por meio de representações mais apropriadas. Este trabalho investiga a paralelização da Representação Nó-Profundidade (RNP) em hardware, com o objetivo de encontrar melhores soluções para Problemas de Projetos de Redes. Para implementar a arquitetura de hardware, denominada de HP-RNP (Hardware Parallelized RNP), foi utilizada a tecnologia de FPGA para explorar o alto grau de paralelismo que essa plataforma pode proporcionar. Os resultados experimentais mostraram que o HP-RNP é capaz de gerar e avaliar novas redes em tempo médio limitado por uma constante (O(1)) / Many problems related to network design can be found in real world applications, such as design of electric circuits, vehicle routing, telecommunication network planning and phylogeny reconstruction. 
In general, these problems can be modelled using graphs with thousands or millions of nodes (corresponding to the input variables), making it hard to obtain solutions in real time. Network design is the combinatorial problem of finding the most suitable network according to an evaluation criterion such as lowest cost, shortest path or traversal time. Solving these problems is, in general, computationally complex. Metaheuristics such as Evolutionary Algorithms have been widely investigated for such problems. Several studies have shown that the performance of Evolutionary Algorithms on Network Design Problems can be significantly increased through more appropriate dynamic data structures (encodings). This work investigates the parallelization of the Node-Depth Encoding (NDE) in hardware in order to find better solutions for Network Design Problems. To implement the proposed hardware architecture, called HP-NDE (Hardware Parallelized NDE), FPGA technology was used to exploit the high degree of parallelism that this platform can provide. The experimental results showed that HP-NDE can generate and evaluate new networks in average time bounded by a constant (O(1)). Árvores geradoras Florestas geradoras FPGA Paralelização Problemas de projetos de redes Representação nó-profundidade FPGA Network design problem Node-depth representation Parallelization Spanning forests Spanning trees
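As a rough illustration of the encoding named in this abstract: a node-depth encoding is commonly described as a list of (node, depth) pairs in depth-first order, so that a subtree occupies a contiguous run of entries. The sketch below is a plain software version under that assumption, not the HP-NDE hardware design; the function names and the example tree are hypothetical.

```python
def node_depth_encoding(adjacency, root):
    """Return the tree as a list of (node, depth) pairs in depth-first order."""
    encoding = []
    stack = [(root, 0, None)]  # (node, depth, parent)
    while stack:
        node, depth, parent = stack.pop()
        encoding.append((node, depth))
        # Push children in reverse so they are visited in their listed order.
        for child in reversed(adjacency[node]):
            if child != parent:
                stack.append((child, depth + 1, node))
    return encoding


def subtree_slice(encoding, index):
    """Return the contiguous run of entries forming the subtree at encoding[index]."""
    root_depth = encoding[index][1]
    end = index + 1
    # The subtree ends at the first later entry that is not deeper than its root.
    while end < len(encoding) and encoding[end][1] > root_depth:
        end += 1
    return encoding[index:end]


# Hypothetical example tree, given as undirected adjacency lists.
tree = {
    "a": ["b", "c"],
    "b": ["a", "d", "e"],
    "c": ["a"],
    "d": ["b"],
    "e": ["b"],
}
encoding = node_depth_encoding(tree, "a")
branch = subtree_slice(encoding, 1)  # subtree rooted at node "b"
```

Because a subtree is one contiguous slice, pruning it from one tree and grafting it onto another amounts to moving a block of array entries and shifting their depths, which is the kind of operation that lends itself to parallel hardware.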
187	Vyhledávání konkrétních osob na záznamech z bezpečnostních kamer / Search for People in Recordings from Security Cameras Jezerský, Matouš January 2018 This thesis deals with the design and implementation of a system that searches for and recognizes people in video recordings. The design is based on preceding research into the theory of face and person recognition. The system is implemented using convolutional neural networks for face recognition, relying primarily on the dlib and OpenFace libraries. The design and implementation use parallelization and distribution of tasks among multiple devices to reduce computation time, while also bearing in mind practical applications of such a system, such as working with limited information about the person sought. The precision of person detection and recognition of the implemented system is about 70% to 80%, depending on the task. Among other uses, the system can find a particular person in a video recording, estimate the number of passes of one person through the monitored space or the total number of passes, or find unknown people in the monitored space.
188	Performance Optimizations and Operator Semantics for Streaming Data Flow Programs Sax, Matthias J. 01 July 2020 (has links) Unternehmen sammeln mehr Daten als je zuvor und müssen auf diese Informationen zeitnah reagieren. Relationale Datenbanken eignen sich nicht für die latenzfreie Verarbeitung dieser oft unstrukturierten Daten. Um diesen Anforderungen zu begegnen, haben sich in der Datenbankforschung seit dem Anfang der 2000er Jahre zwei neue Forschungsrichtungen etabliert: skalierbare Verarbeitung unstrukturierter Daten und latenzfreie Datenstromverarbeitung. Skalierbare Verarbeitung unstrukturierter Daten, auch bekannt unter dem Begriff "Big Data"-Verarbeitung, hat in der Industrie schnell Einzug erhalten. Gleichzeitig wurden in der Forschung Systeme zur latenzfreien Datenstromverarbeitung entwickelt, die auf eine verteilte Architektur, Skalierbarkeit und datenparallele Verarbeitung setzen. Obwohl diese Systeme in der Industrie vermehrt zum Einsatz kommen, gibt es immer noch große Herausforderungen im praktischen Einsatz. Diese Dissertation verfolgt zwei Hauptziele: Zuerst wird das Laufzeitverhalten von hochskalierbaren datenparallelen Datenstromverarbeitungssystemen untersucht. Im zweiten Hauptteil wird das "Dual Streaming Model" eingeführt, das eine Semantik zur gleichzeitigen Verarbeitung von Datenströmen und Tabellen beschreibt. Das Ziel unserer Untersuchung ist ein besseres Verständnis über das Laufzeitverhalten dieser Systeme zu erhalten und dieses Wissen zu nutzen um Anfragen automatisch ausreichende Rechenkapazität zuzuweisen. Dazu werden ein Kostenmodell und darauf aufbauende Optimierungsalgorithmen für Datenstromanfragen eingeführt, die Datengruppierung und Datenparallelität einbeziehen. Das vorgestellte Datenstromverarbeitungsmodell beschreibt das Ergebnis eines Operators als kontinuierlichen Strom von Veränderugen auf einer Ergebnistabelle. 
Dabei behandelt unser Modell die Diskrepanz der physikalischen und logischen Ordnung von Datenelementen inhärent und erreicht damit eine deterministische Semantik und eine minimale Verarbeitungslatenz. / Modern companies are able to collect more data, and require insights from it faster, than ever before. Relational databases do not meet the requirements for processing these often unstructured data sets with reasonable performance. The database research community started to address these trends in the early 2000s, and two new research directions have attracted major interest since: large-scale non-relational data processing and low-latency data stream processing. Large-scale non-relational data processing, commonly known as "Big Data" processing, was quickly adopted in industry. In parallel, low-latency data stream processing was mainly driven by the research community, which developed new systems that embrace a distributed architecture and scalability and exploit data parallelism. While these systems have gained more and more attention in industry, there are still major challenges in operating them at large scale. The goal of this dissertation is twofold: first, to investigate the runtime characteristics of large-scale data-parallel distributed streaming systems; second, to propose the "Dual Streaming Model" to express the semantics of continuous queries over data streams and tables. Our goal is to improve the understanding of system and query runtime behavior with the aim of provisioning queries automatically. We introduce a cost model for streaming data flow programs that takes into account the two techniques of record batching and data parallelization. Additionally, we introduce optimization algorithms that leverage our model for cost-based query provisioning. The proposed Dual Streaming Model expresses the result of a streaming operator as a stream of successive updates to a result table, inducing a duality between streams and tables. 
Our model handles the inconsistency of the logical and the physical order of records within a data stream natively, which allows for deterministic semantics as well as low latency query execution. Datenstromverarbeitung Datenflussprogram Parallelität Optimierung Verarbeitungssemantik Data Stream Processing Data Flow Program Parallelization Optimization Processing Semantics 004 Informatik ST 265 ddc:004
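The stream-table duality described in this abstract can be pictured with a small sketch: an operator emits one update per input record, and replaying those updates rebuilds the result table. This is only an illustration under simplifying assumptions (a counting operator, in-order records, no timestamps), not the dissertation's model; `count_updates`, `materialize` and the sample data are hypothetical names.

```python
def count_updates(records):
    """Turn a stream of keys into a stream of (key, new_count) table updates."""
    counts = {}
    for key in records:
        counts[key] = counts.get(key, 0) + 1
        # Each input record produces one update to the result table.
        yield key, counts[key]


def materialize(updates):
    """Replay an update stream into the table it describes (latest value wins)."""
    table = {}
    for key, value in updates:
        table[key] = value
    return table


# Hypothetical input stream of page views.
clicks = ["home", "cart", "home", "home", "cart"]
changelog = list(count_updates(clicks))
table = materialize(changelog)
```

The changelog and the table carry the same information in two forms: the table is the changelog folded up to a point in time, and the changelog is the history of the table. Handling records that arrive out of logical order, which the abstract highlights, is deliberately left out of this sketch.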
189	Algoritmy stochastického programování / Stochastic Programming Algorithms Klimeš, Lubomír January 2010 Stochastic programming and optimization are powerful tools for solving a wide range of engineering problems involving uncertainty. The progressive hedging algorithm is an efficient decomposition method for solving scenario-based stochastic programs. Thanks to its vertical decomposition, the algorithm can be implemented in parallel, which can save a significant amount of computation time and other resources. The theoretical part of this master's thesis deals with mathematical programming, and stochastic programming in particular, and describes the progressive hedging algorithm in detail. The practical part proposes and discusses an original parallel implementation of the progressive hedging algorithm, which is then tested on simple problems. The parallel implementation is further used to solve an engineering problem of continuous casting of a steel slab, and finally the results obtained are evaluated.
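For readers unfamiliar with progressive hedging, the following minimal sketch shows the iteration on a toy problem. The toy objective (choose one x minimizing the expected value of 0.5·(x − d_s)² over scenarios s), the parameter names and the sample data are assumptions made for illustration; they are not taken from the thesis, and the steel-casting application is not modelled here.

```python
def progressive_hedging(demands, probabilities, rho=1.0, iterations=60):
    """Minimise sum_s p_s * 0.5 * (x - d_s)**2 over one shared decision x."""
    # Iteration 0: solve each scenario on its own (the minimiser is d_s).
    x = list(demands)
    x_bar = sum(p * xs for p, xs in zip(probabilities, x))
    # Multipliers penalise each scenario's deviation from the average decision.
    w = [rho * (xs - x_bar) for xs in x]
    for _ in range(iterations):
        # The scenario subproblems are independent, so this step could run in parallel.
        # Closed-form argmin of 0.5*(x - d)**2 + w*x + 0.5*rho*(x - x_bar)**2.
        x = [(d - ws + rho * x_bar) / (1.0 + rho) for d, ws in zip(demands, w)]
        x_bar = sum(p * xs for p, xs in zip(probabilities, x))
        w = [ws + rho * (xs - x_bar) for ws, xs in zip(w, x)]
    return x_bar, x


# Hypothetical scenario data: three outcomes with their probabilities.
x_bar, x_scenarios = progressive_hedging([1.0, 2.0, 6.0], [0.5, 0.25, 0.25])
```

On this toy problem the scenario decisions agree on the probability-weighted mean of the d_s values. The point of the decomposition is that each pass solves the scenarios separately and only exchanges the average and the multipliers, which is what makes a parallel implementation natural.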
190	Segmentovaná diskrétní waveletová transformace / Segmentwise Discrete Wavelet Transform Průša, Zdeněk January 2012 This dissertation deals with SegDWT algorithms for the segmentwise computation of the Discrete Wavelet Transform (DWT) of one-dimensional and multidimensional data. Segmentwise computation means computing the wavelet analysis and synthesis on independent segments (blocks) with a certain overlap, so that no blocking artifacts arise. The analysis part of the algorithm works on the overlap-save principle and always produces a part of the wavelet coefficients of the transform of the whole signal; these can then be processed arbitrarily and passed to the inverse transform. The reconstructed segments are then assembled according to the overlap-add principle. The SegDWT algorithm on which this work builds is not, in its current form, directly applicable to multidimensional signals. This work contains several modifications of it and a subsequent generalization to multidimensional signals using the separability principle. In addition, the work presents the SegLWT algorithm, which carries the idea of SegDWT over to computing the wavelet transform with non-causal lifting filter structures.
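The overlap-add principle mentioned in this abstract can be shown on a plain FIR filter, which is a simpler setting than the wavelet transform. The sketch below is not the SegDWT algorithm; it only demonstrates that filtering a signal segment by segment and adding the overlapping tails gives the same result as filtering the whole signal at once. The function names and sample values are hypothetical.

```python
def convolve(signal, kernel):
    """Full linear convolution of two sequences."""
    out = [0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def overlap_add(signal, kernel, segment_length):
    """Filter the signal segment by segment, adding the overlapping tails."""
    out = [0] * (len(signal) + len(kernel) - 1)
    for start in range(0, len(signal), segment_length):
        segment = signal[start:start + segment_length]
        # Each filtered segment is len(kernel) - 1 samples longer than its input,
        # so its tail overlaps the head of the next segment's output.
        for offset, value in enumerate(convolve(segment, kernel)):
            out[start + offset] += value
    return out


# Hypothetical data: a short ramp and a two-tap (unnormalised Haar-like) filter.
samples = list(range(1, 12))
taps = [1, 1]
whole = convolve(samples, taps)
pieces = overlap_add(samples, taps, segment_length=4)
```

The two results are identical because convolution is linear, so no artifacts appear at segment boundaries. A segmentwise wavelet transform additionally has to account for subsampling at every decomposition level, which this sketch does not attempt.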
