Global ETD Search

251	Análise de benefícios do paralelismo por comunicação unilateral em aplicações com grades não estruturadas / Improvement analysis of parallelism by one-sided communication on unstructured grids applications Pedro Pais Lopes 03 September 2010 (has links) A computação paralela, empregada no meio científico para resolução de problemas que demandam grande poder computacional, teve nos últimos anos o surgimento de um novo tipo de comunicação entre instâncias do paralelismo. Trata-se da Comunicação Unilateral (CUL), onde somente uma instância realiza a operação de transferência de informações, e esta ocorre em segundo plano, ao contrário da Comunicação Bilateral (CBL), onde uma instância envia a informação e a outra recebe. Neste contexto se buscou analisar os benefícios que a CUL agrega ao paralelismo de um programa que se utiliza de uma grade não estruturada em memória. Duas formas de apoio ao paralelismo foram utilizadas: uma biblioteca, a "Message Passing Interface" (MPI) (especificamente a sua parte que descreve a CUL), e uma extensão à linguagem Fortran, o Coarray Fortran (CAF). A semântica do MPI CUL é mais complexa que a do CAF, mas a do CAF é mais restritiva. Para analisar a semântica e desempenho da CUL foi realizada uma ambientação utilizando MPI CUL e CAF no paralelismo de um programa simples, denominado jogo da Vida (Game of Life), com grade estruturada e com ótimo desempenho paralelo através do MPI CBL. Na programação o MPI CUL se mostrou verborrágico (aumento do número de linhas de código) e complexo, principalmente quando se utiliza um controle refinado de sincronismo entre as imagens. Já o CAF reduziu o número de linhas de código (entre 20% e 40%), e o sincronismo é muito mais simples. Os resultados mostraram uma piora no desempenho no caso do MPI CUL, mas para o CAF o desempenho absoluto foi melhor que a implementação original até o número de núcleos de processamento que compartilham a mesma memória.
Para grades não estruturadas se utilizou o Ocean Land Atmospheric Model (OLAM), um modelo de simulação do sistema terrestre com grade baseada em prismas triangulares, paralelizado através de MPI CBL. A implementação da comunicação por MPI CUL na estrutura do paralelismo existente mostrou que esta semântica possui alguns pontos que podem prejudicar a programação, como o tratamento da exposição de memória (cada instância tem uma memória exposta de tamanho diferente) e como é realizado o sincronismo entre as instâncias. Em termos de desempenho as curvas de speed-ups mostraram que o MPI CUL prejudicou o OLAM independentemente da implementação das bibliotecas ou do equipamento utilizado, com redução de pelo menos 20% no speed-up para sete ou mais processadores. Assim como no jogo da Vida o MPI com comunicação unilateral penalizou o desempenho. / Parallel computing is used to solve many scientific problems that demand intensive computing power. Recently a new paradigm of communication between instances of the parallelism has appeared, called one-sided communication (OSC), where only one instance performs the operation of information transfer, which occurs in the background, as opposed to two-sided communication (TSC), where one instance sends the information and the other receives it. In this context we analyze the benefits that OSC brings to the parallelism of a program that uses an unstructured grid in memory. Two OSC implementations were used: the "Message Passing Interface" (MPI) library (specifically the part that describes OSC), and Coarray Fortran (CAF), an extension of the Fortran language. The semantics of MPI OSC is more complex than that of CAF, but the semantics of CAF is more restrictive. To analyze the semantics and performance of OSC, both MPI OSC and CAF were applied to a simple program called Game of Life, which uses a structured grid and has very good parallel performance with MPI TSC.
The MPI OSC program was verbose (more lines of code) and complex, especially when using finer-grained control to synchronize the parallel instances. CAF, on the other hand, reduced the number of lines of code (by between 20% and 40%), and its synchronization is much simpler. The results showed worse performance for MPI OSC, whereas for CAF the absolute performance was better than that of the original implementation up to the number of processor cores that share the same memory. For unstructured grids we used the Ocean Land Atmospheric Model (OLAM), an earth system simulation model on a grid based on triangular prisms, parallelized with MPI TSC. The implementation with MPI OSC showed that its semantics has some aspects that may complicate the coding of the communication structure, such as the treatment of memory exposure (each instance exposes a memory region of a different size) and the way synchronization among instances is handled. In terms of performance, the speedup curves showed that MPI OSC penalized OLAM, independently of the MPI implementation or the equipment used, with a reduction of at least 20% in speedup for seven or more processors. As in the Game of Life, MPI OSC degraded the performance. CAF Grades não estruturadas MPI OLAM Paralelismo Sincronismo Speed-up CAF MPI OLAM Parallelism Speed-up Synchronism Unstructured grids
252	Uma Heurística Lagrangeana para o Problema de Ponderação de Rodadas / A Lagrangian Heuristic for Problem Weighting Rounds Paulo Henrique Macêdo de Araújo 20 February 2014 (has links) Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Conselho Nacional de Desenvolvimento Científico e Tecnológico / Nesta dissertação, nosso principal objetivo foi desenvolver uma técnica de resolução para um problema na área de telecomunicações. O problema em questão é chamado de problema de Ponderação de Rodadas (PR) e foi inicialmente proposto em [Klasing, Morales, Perennes, 2008]. O contexto do problema envolve uma rede sem fio, onde as comunicações são realizadas via ondas de rádio e a rede funciona através de uma operação da rede que satisfaz certas restrições. Inicialmente, explicamos como é formada uma rede de rádio e descrevemos a forma de operação da rede de rádio junto às restrições usando um modelo matemático. Em seguida, formalizamos o problema PR como um problema de otimização, especificando suas restrições, correspondente à geração do conjunto de possíveis operações da rede, e critério de otimização, referente ao uso dos recursos da rede. Posteriormente, mostramos um estudo preliminar do problema de Coloração Fracionária (CF) e apresentamos uma técnica de resolução deste problema através do uso de uma heurística lagrangeana baseada em uma relaxação lagrangeana de uma formulação de programação inteira do problema. Essa técnica de resolução é então adaptada para o problema PR, consistindo na principal contribuição de nossa pesquisa. Por fim, mostramos os resultados computacionais e análises das nossas implementações para os problemas CF e PR. / In this dissertation, our main objective was to develop a solution technique for a problem in the area of telecommunications. The problem in question is called the Round Weighting Problem (RWP) and was originally proposed in (KLASING; MORALES; PERENNES, 2008).
The context of the problem involves a wireless network in which communications are carried out by radio waves and the network operates through a network operation that satisfies the constraints of the problem. Initially, we explain how a radio network is formed and describe the mode of operation of the radio network, together with its constraints, using a mathematical model. Then, we formalize the RWP as an optimization problem, specifying its constraints, which correspond to the generation of the set of possible network operations, and its optimization criterion, which concerns the use of network resources. Subsequently, we present a preliminary study of the Fractional Coloring problem (FC problem) and a technique to solve this problem using a Lagrangian heuristic based on a Lagrangian relaxation of an integer programming formulation of the problem. This solution technique is then adapted to the RWP, which constitutes the main contribution of our research. Finally, we show the computational results and analyses of our implementations for the Fractional Coloring problem and the RWP. Telecomunicações Heurística Lagrangeana Otimização inteira Telecommunications Radio Networks Lagrangian Heuristic Parallelism CIENCIA DA COMPUTACAO
253	Granlog : um modelo para análise automática de granulosidade na programação em lógica / Granlog: a model for automatic granularity analysis in logic programming Barbosa, Jorge Luis Victoria January 1996 (has links) A exploração do paralelismo na programação em lógica é considerada uma alternativa para simplificação da programação de máquinas paralelas e para aumento do desempenho de programas em lógica. Desta forma, a integração da programação em lógica e sistemas paralelos tornou-se nos últimos anos um centro de atenções da comunidade científica. Dentre os problemas que devem ser solucionados para exploração adequada do paralelismo, encontra-se a análise de granulosidade. A análise de granulosidade determina o tamanho dos grãos, ou seja, a complexidade dos módulos que devem ser executados seqüencialmente num único processador. Basicamente, esta análise consiste de uma refinada identificação dos grãos, visando a máxima eficiência na exploração do paralelismo. Neste sentido, devem ser realizadas considerações sobre dependências, complexidade dos grãos e custos envolvidos na paralelização. Recentemente, a análise de granulosidade na programação em lógica tem recebido atenção especial por parte dos pesquisadores. Os grãos podem ser identificados pelo programador através de primitivas de programação ou podem ser detectados automaticamente pelo sistema paralelo. Na programação em lógica, a exploração automática do paralelismo é estimulada, devido ao paralelismo implícito existente na avaliação das expressões lógicas. Além disso, a programação em lógica permite uma clara distinção entre a semântica e o controle da linguagem, proporcionando uma abordagem distinta entre a descrição do problema e o caminho para obtenção das soluções. A detecção automática do paralelismo permite o aproveitamento de programas já existentes, além de liberar o programador do encargo de paralelizar o problema.
Este trabalho dedica-se ao estudo da análise automática de granulosidade na programação em lógica. O texto propõe um modelo para geração de informações de granulosidade, denominado GRANLOG (GRanularity ANalyzer for LOGic Programming). O GRANLOG realiza uma análise estática de um programa em lógica. Dessa análise resulta o programa granulado, ou seja, o programa original acrescido da anotação de granulosidade. Esta anotação contém diversas informações que contribuem de forma significativa com a exploração adequada do paralelismo na programação em lógica. Durante o desenvolvimento do GRANLOG foram exploradas diversas áreas de pesquisa da programação em lógica, dentre as quais destacam-se: análise de modos, análise de tipos, análise de medidas para mensuração do tamanho de termos, interpretação abstrata, análise de dependências e análise de complexidade. A integração destes tópicos torna o GRANLOG uma rica fonte de pesquisa. Além disso, a organização modular da proposta permite o aprimoramento independente de suas partes, tornando a estrutura do modelo uma base para o desenvolvimento de novos trabalhos. Além do modelo, o texto descreve a implementação de um protótipo e propõe duas aplicações para as informações de granulosidade, ou seja, auxílio a decisões de escalonamento e simulação da execução de programas. O texto apresenta ainda uma proposta para integração do GRANLOG a um modelo para execução paralela de programas em lógica, denominado OPERA. O OPERA dedica-se à exploração do paralelismo na programação em lógica e possui atualmente um protótipo para execução paralela de programas em lógica em redes de computadores. Os bons resultados obtidos com a integração OPERA-GRANLOG demonstram a relevância das informações geradas pelo modelo proposto neste trabalho. Encontra-se ainda neste texto uma proposta para inclusão do GRANLOG numa interface gráfica, denominada XOPERA.
Esta interface permite a execução do protótipo OPERA e, a partir deste trabalho, gerencia também o protótipo GRANLOG. A inclusão da gerência do GRANLOG na interface XOPERA contribui de forma substancial para a integração OPERA-GRANLOG. / The exploitation of parallelism in logic programming is considered an alternative for simplifying the task of programming parallel machines. It also provides a way to increase the performance of logic programs. Because of this, integrating logic programming with parallel systems has been a topic of much interest in the scientific community in recent years. Among the problems that must be solved for the adequate exploitation of parallelism is granularity analysis. Granularity analysis determines the size of the grains, that is, the complexity of the modules that must be executed sequentially on a single processor. Basically, this analysis consists of a refined identification of the grains, aiming at maximum efficiency in the exploitation of parallelism. In this sense, dependencies, grain complexity and the costs involved in parallelization must be taken into account. Recently, many researchers have given special attention to granularity analysis in logic programming. The grains may be identified by the programmer via programming primitives, or they may be detected automatically by the parallel system. In logic programming, the automatic exploitation of parallelism is encouraged by the implicit parallelism that exists in the evaluation of logic expressions. Besides, logic programming allows a clear distinction between the semantics and the control of the language, providing a separation between the problem description and the way to obtain the solutions. The automatic detection of parallelism permits the reuse of already written programs and frees the programmer from parallelizing the program by hand.
This work is dedicated to the study of automatic granularity analysis in logic programming. The text proposes a model for generating granularity information, called GRANLOG (GRanularity ANalyzer for LOGic Programming). GRANLOG performs a static analysis of a logic program. This analysis yields a granulated program, that is, the original program augmented with a granularity annotation. This annotation contains several pieces of information that contribute significantly to the adequate exploitation of parallelism in logic programming. During the development of GRANLOG, several research areas of logic programming were explored, namely mode analysis, type analysis, measure analysis for measuring the size of terms, abstract interpretation, dependency analysis and complexity analysis. The integration of these topics makes GRANLOG a rich source for research. Besides, the modular organization of the proposal permits the independent improvement of its parts, making the model structure a base for the development of new works. Besides the model, the text describes the implementation of a prototype and proposes two applications for the granularity information, namely support for scheduling decisions and simulation of program execution. It also presents a proposal for integrating GRANLOG into a parallel execution model for logic programs, called OPERA. OPERA is dedicated to the exploitation of parallelism in logic programming and currently has a prototype for parallel execution of logic programs on computer networks. The good results obtained by integrating OPERA and GRANLOG show the relevance of the information generated by the model proposed in this work. This work also contains a proposal for including GRANLOG in a graphical interface, called XOPERA. This interface allows the execution of the OPERA prototype and, as of this work, also manages the GRANLOG prototype.
The inclusion of GRANLOG management in the XOPERA interface contributes substantially to the OPERA-GRANLOG integration. Inteligência artificial Programação em lógica Análise : Granulosidade Processamento paralelo Granularity Granularity analysis Parallel processing Parallelism in logic programming
254	Adaptive tiling algorithm based on highly correlated picture regions for the HEVC standard / Algoritmo de tiling adaptativo baseado em regiões altamente correlacionadas de um quadro para o padrão de codificação de vídeos de alta eficiência Silva, Cauane Blumenberg January 2014 (has links) Esta dissertação de mestrado propõe um algoritmo adaptativo que é capaz de dinamicamente definir partições tile para quadros intra- e inter-preditos com o objetivo de reduzir o impacto na eficiência de codificação. Tiles são novas ferramentas orientadas ao paralelismo que integram o padrão de codificação de vídeos de alta eficiência (HEVC – High Efficiency Video Coding standard), as quais dividem o quadro em regiões retangulares independentes que podem ser processadas paralelamente. Para viabilizar o paralelismo, os tiles quebram as dependências de codificação através de suas bordas, gerando impactos na eficiência de codificação. Este impacto pode ser ainda maior caso os limites dos tiles dividam regiões altamente correlacionadas do quadro, porque a maior parte das ferramentas de codificação usam informações de contexto durante o processo de codificação. Assim, o algoritmo proposto agrupa as regiões do quadro que são altamente correlacionadas dentro de um mesmo tile para reduzir o impacto na eficiência de codificação que é inerente ao uso de tiles. Para localizar as regiões altamente correlacionadas do quadro de uma maneira inteligente, as características da imagem e também as informações de codificação são analisadas, gerando mapas de particionamento que servem como parâmetro de entrada para o algoritmo. Baseado nesses mapas, o algoritmo localiza as quebras naturais de contexto presentes nos quadros do vídeo e define os limites dos tiles nessas regiões. Dessa maneira, as quebras de dependência causadas pelas bordas dos tiles coincidem com as quebras de contexto naturais do quadro, minimizando as perdas na eficiência de codificação causadas pelo uso dos tiles. 
O algoritmo proposto é capaz de reduzir mais de 0.4% e mais de 0.5% o impacto na eficiência de codificação causado pelos tiles em quadros intra-preditos e inter-preditos, respectivamente, quando comparado com tiles uniformes. / This Master's thesis proposes an adaptive algorithm that is able to dynamically choose suitable tile partitions for intra- and inter-predicted frames in order to reduce the impact on coding efficiency arising from such partitioning. Tiles are novel parallelism-oriented tools that are part of the High Efficiency Video Coding (HEVC) standard; they divide the frame into independent rectangular regions that can be processed in parallel. To enable parallelism, tiles break the coding dependencies across their boundaries, which impacts coding efficiency. This impact can be even higher if tile boundaries split highly correlated picture regions, because most of the coding tools use context information during the encoding process. Hence, the proposed algorithm clusters the highly correlated picture regions inside the same tile to reduce the inherent coding efficiency impact of using tiles. To locate the highly correlated picture regions, image characteristics and encoding information are analyzed, generating partitioning maps that serve as the algorithm input. Based on these maps, the algorithm locates the natural context breaks of the picture and defines the tile boundaries at these key regions. This way, the dependency breaks caused by the tile boundaries match the natural context breaks of a picture, thus minimizing the coding efficiency losses caused by the use of tiles. The proposed adaptive tiling algorithm, in some cases, provides over 0.4% and over 0.5% of BD-rate savings for intra- and inter-predicted frames respectively, when compared to uniformly spaced tiles, an approach which does not consider the picture context to define the tile partitions.
Microeletrônica Processamento : Imagem Digital video coding High efficiency video coding standard Parallelism-oriented tools Tile partitions Coding efficiency
255	[en] QEEF-G: ADAPTIVE PARALLEL EXECUTION OF ITERATIVE QUERIES / [pt] QEEF-G: EXECUÇÃO PARALELA ADAPTATIVA DE CONSULTAS ITERATIVAS VINICIUS FONTES VIEIRA DA SILVA 25 April 2007 (has links) [pt] O processamento de consulta paralelo tradicional utiliza-se de nós computacionais para reduzir o tempo de processamento de consultas. Com o surgimento das grades computacionais, milhares de nós podem ser utilizados, desafiando as atuais técnicas de processamento de consulta a oferecerem um suporte massivo ao paralelismo em um ambiente onde as condições variam a todo instante. Em adição, as aplicações científicas executadas neste ambiente oferecem novas características de processamento de dados que devem ser integradas em um sistema desenvolvido para este ambiente. Neste trabalho apresentamos o sistema de processamento de consulta paralelo do CoDIMS-G, e seu novo operador Orbit que foi desenvolvido para suportar a avaliação de consultas iterativas. Neste modelo de execução as tuplas são constantemente avaliadas por um fragmento paralelo do plano de execução. O trabalho inclui o desenvolvimento do sistema de processamento de consulta e um novo algoritmo de escalonamento que considera as variações de rede e o throughput de cada nó, permitindo ao sistema se adaptar constantemente às variações no ambiente. / [en] Traditional parallel query processing uses multiple computing nodes to reduce query response time. Within a Grid computing context, the availability of thousands of nodes challenges current parallel query processing techniques to support massive parallelism under constantly varying environment conditions. In addition, scientific applications running on Grids offer new data processing characteristics that should be integrated into such a framework. In this work we present the CoDIMS-G parallel query processing system with a full-fledged new query execution operator named Orbit. Orbit is designed for evaluating massive iterative data processing.
Tuples in Orbit iterate over a parallelized fragment of the query execution plan. This work includes the development of the query processing system and a new scheduling algorithm that considers variations in the network and the throughput of each node. This algorithm allows the system to adapt constantly to changes in the environment. [pt] PARALELISMO [en] PARALLELISM [pt] BANCO DE DADOS [en] DATABASE [pt] PROCESSAMENTO DE CONSULTAS [en] QUERY PROCESSING [pt] PROCESSAMENTO DISTRIBUIDO [en] DISTRIBUTED COMPUTING
256	Scheduling of parallel real-time DAG tasks on multiprocessor systems / Ordonnancement temps réels des tâches parallèles sur des systèmes multiprocesseurs. Qamhieh, Manar 26 January 2015 (has links) Les applications temps réel durs sont celles qui doivent exécuter en respectant des contraintes temporelles. L'ordonnancement temps réel a bien été étudié sur mono-processeurs depuis plusieurs années. Récemment, l'utilisation d'architectures multiprocesseurs a augmenté dans les applications industrielles et des architectures parallèles sont proposées pour que le logiciel devienne compatible avec ces plateformes. L'ordonnancement multiprocesseurs de tâches parallèles dépendantes n'est pas une simple généralisation du cas mono-processeur et la problématique d'ordonnancement devient plus complexe et difficile. Dans cette thèse, nous étudions le problème d'ordonnancement temps réel de graphes de tâches parallèles acycliques sur des plateformes multiprocesseurs. Dans ce modèle, un graphe est composé d'un ensemble de sous-tâches dépendantes sous contraintes de précédence qui expriment les relations de précédences entre les sous-tâches. L'ordre d'exécution des sous-tâches est dynamique, c'est-à-dire que les sous-tâches peuvent s'exécuter en parallèle ou séquentiellement par rapport aux décisions de l'ordonnanceur temps réel. Pour traiter les contraintes de précédence, nous proposons deux méthodes pour l'ordonnancement des graphes : par transformation du modèle de graphe de sous tâches parallèles en un modèle de tâches séquentielles indépendantes, plus simple à ordonnancer et par ordonnancement direct des graphes en prenant en compte les relations de dépendance entre les sous-tâches. Nous proposons un ordonnancement des graphes en prenant directement en compte les paramètres temporels des graphes et un ordonnancement au niveau des sous-tâches, par rapport à des paramètres temporels attribués aux sous-tâches par un algorithme spécifique. 
Enfin, nous prouvons que les deux méthodes d'ordonnancement de graphes ne sont pas comparables. Nous fournissons alors des résultats de simulation pour comparer ces méthodes en utilisant les algorithmes d'ordonnancement globaux EDF et DM. Nous avons développé un logiciel nommé YARTISS pour générer des graphes aléatoires et réaliser les simulations. / Interest in multiprocessor systems has recently increased in industrial applications, and parallel programming APIs have been introduced to benefit from new processing capabilities. The use of multiprocessors for real-time systems, whose execution must respect certain temporal constraints, is now being investigated by industry. The real-time scheduling problem becomes more complex and challenging in that context. In multiprocessor systems, a hard real-time scheduler is responsible for allocating ready jobs to the available processors of the system while respecting their timing parameters. In this thesis, we study the problem of real-time scheduling of parallel Directed Acyclic Graph (DAG) tasks on homogeneous multiprocessor systems. In this model, a DAG task consists of a set of subtasks that execute under precedence constraints. At all times, the real-time scheduler is responsible for determining how subtasks execute, either sequentially or in parallel, based on the available processors of the system. We propose two DAG scheduling approaches to determine the execution form of DAG tasks. The first approach is the DAG Stretching algorithm, from the Model Transformation approach, which forces DAG tasks to execute as sequentially as possible. The second approach is Direct Scheduling, which aims at scheduling DAG tasks while respecting their internal dependencies. We provide real-time schedulability analyses for Direct Scheduling at DAG-Level and at Subtask-Level.
Due to the incomparability of the DAG scheduling approaches, we use extensive simulations to compare the performance of global EDF with global DM scheduling using our simulation tool YARTISS. Temps réel Parallélisme Ordonnancement Multiprocessors Graphe orienté acyclique Systèmes embarqués Real time Parallelism Scheduling Multiprocessors Directed Acyclic Graphs Embedded systems
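The DAG task model described in record 256 (subtasks executing under precedence constraints) can be illustrated with a minimal Python sketch. It is not taken from the thesis or from YARTISS: the function name `dag_work_and_span`, the `wcet` dictionary of per-subtask execution times and the `edges` list of precedence pairs are all assumptions made for illustration. The sketch computes two standard quantities of a DAG task: its total work (execution time if all subtasks run sequentially) and its critical-path length (the minimum execution time even with unlimited processors).

```python
from collections import deque


def dag_work_and_span(wcet, edges):
    """Return (total work, critical-path length) of one DAG task.

    wcet  -- hypothetical dict: subtask name -> execution time
    edges -- hypothetical list of (predecessor, successor) pairs
    """
    succs = {v: [] for v in wcet}
    indeg = {v: 0 for v in wcet}
    for u, v in edges:
        succs[u].append(v)
        indeg[v] += 1

    # Earliest finish time of each subtask if processors were unlimited.
    finish = dict(wcet)
    ready = deque(v for v in wcet if indeg[v] == 0)
    visited = 0
    while ready:  # process subtasks in topological order
        u = ready.popleft()
        visited += 1
        for v in succs[u]:
            finish[v] = max(finish[v], finish[u] + wcet[v])
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)

    if visited != len(wcet):
        raise ValueError("precedence graph contains a cycle")
    return sum(wcet.values()), max(finish.values(), default=0)


# Diamond-shaped task: a precedes b and c, which both precede d.
work, span = dag_work_and_span(
    {"a": 2, "b": 3, "c": 1, "d": 2},
    [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")],
)
print(work, span)
```

In this example, fully sequential execution needs the whole work, while no schedule on any number of processors can finish sooner than the critical-path length, so a task whose critical path exceeds its deadline cannot meet that deadline.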
257	Évolution de la divergence entre la lamproie fluviatile (Lampetra fluviatilis) et la lamproie de Planer (Lampetra planeri) inférée par approches expérimentales et de génomique des populations / Evolution of divergence between the river lamprey (Lampetra fluviatilis) and the brook lamprey (L. planeri) inferred by experimental approaches and population genomics Rougemont, Quentin 15 December 2015 (has links) Cette thèse étudie le processus de spéciation entre la lamproie fluviatile (Lampetra fluviatilis) et la lamproie de Planer (L. planeri). Les deux espèces présentent des stratégies d'histoire de vie extrêmement différentes : L. fluviatilis est parasite et anadrome alors que L. planeri n'est pas parasite et reste strictement dulcicole. Toutefois, leur degré d'isolement reproducteur et leur histoire de divergence demeurent méconnus. Ces questions ont été abordées par des approches expérimentales, de génomique de populations et de simulations démographiques. Des croisements expérimentaux ont révélé un faible isolement reproducteur, confirmé par des degrés variables de flux géniques dans les populations naturelles. Les analyses génétiques ont montré que les deux taxons représentaient probablement des écotypes avec un isolement reproducteur partiel suggérant que les barrières reproductives endogènes ne réduisaient que partiellement la migration efficace entre écotypes. L'importance du contexte géographique actuel et passé dans l'étude de la spéciation a aussi été mise en évidence par des analyses à l'échelle du génome. Ainsi, les populations isolées de L. planeri évoluent principalement sous l'effet de la dérive génétique et ont une diversité réduite. Les inférences démographiques ont suggéré que la divergence a été initiée en allopatrie puis suivie de contacts secondaires résultant en un parallélisme génomique partiel entre réplicas de paires de populations.
Une hétérogénéité de la divergence génomique a démontré que les îlots génomiques de différenciation ne résultaient pas de l'action récente de la divergence écologique. En outre, nos résultats suggèrent un impact faible de la fragmentation anthropique des cours d'eau sur la diversité génétique des populations de L. planeri. Les populations résidentes possèdent une diversité génétique plus grande lorsque le flux de gènes avec L. fluviatilis est possible dans les parties aval des cours d'eau. Globalement cette thèse a démontré que les paires d'écotypes parasites et non-parasites de lamproies représentent un excellent modèle d'étude de la spéciation et notamment de l'architecture génomique de la divergence. / This thesis investigates the process of speciation between the European lampreys Lampetra fluviatilis and L. planeri. The two species have drastically different life history strategies: L. fluviatilis is parasitic and anadromous while L. planeri is non-parasitic and strictly freshwater resident. Yet their level of reproductive isolation and history of divergence remain poorly understood. A multidisciplinary approach including experiments, population genomics analyses and historical reconstruction was undertaken to address these issues. Experimental crosses revealed a very low level of reproductive isolation, partially mirrored by variable levels of gene flow in wild populations. Genetic analyses revealed that the two taxa were best described as partially reproductively isolated ecotypes, suggesting that endogenous genetic barriers only partially reduced effective migration between ecotypes. Genome-wide analyses showed the importance of the current and ancient geographical context of speciation. In particular, parapatric L. planeri populations diverged mostly through drift and displayed reduced genetic diversity.
Demographic inferences suggested that divergence likely arose in allopatry and that subsequent secondary contacts resulted in partial parallelism between replicate population pairs. A strong heterogeneity of divergence across the genome was revealed by sympatric populations, suggesting that genomic islands of differentiation were not linked to ongoing ecological divergence. Further investigations showed that the genetic diversity of L. planeri populations was weakly affected by human-induced river fragmentation. Resident populations displayed a higher diversity when gene flow was possible with L. fluviatilis populations in downstream sections of rivers. Overall this thesis showed that parasitic and non-parasitic lamprey ecotypes represent a promising model for studying speciation and notably the genomic architecture of divergence. Spéciation Flux de gènes Parallélisme Modélisation Biogéographie Histoire évolutive Lampetra Speciation Gene flow Parallelism Modelling Biogeography Evolutionary history Lampetra
258	Scheduling Algorithms for Instruction Set Extended Symmetrical Homogeneous Multiprocessor Systems-on-Chip Montcalm, Michael R. January 2011 (has links) Embedded system designers face multiple challenges in fulfilling the runtime requirements of programs. Effective scheduling of programs is required to extract as much parallelism as possible. These scheduling algorithms must also improve speedup after instruction-set extensions have occurred. Scheduling of dynamic code at run time is made more difficult when the static components of the program are scheduled inefficiently. This research aims to optimize a program’s static code at compile time. This is achieved with four algorithms designed to schedule code at the task and instruction level. Additionally, the algorithms improve scheduling using instruction set extended code on symmetrical homogeneous multiprocessor systems. Using these algorithms, we achieve speedups up to 3.86X over sequential execution for a 4-issue 2-processor system, and show better performance than recent heuristic techniques for small programs. Finally, the algorithms generate speedup values for a 64-point FFT that are similar to the test runs. Scheduling ILP System on Chip SoC Instruction level parallelism Integer Linear Program Custom Instruction Instruction Set Extension Multiprocessor
259	Conception et mise en oeuvre d'une plate-forme de pilotage de simulations numériques parallèles et distribuées Richart, Nicolas 20 January 2010 (has links) Le domaine de la simulation numérique évolue vers des simulations de phénomènes physiques toujours plus complexes. Cela se traduit typiquement par le couplage de plusieurs codes de simulation, où chaque code va gérer une physique (simulations multi-physiques) ou une échelle particulière (simulations multi-échelles). Dans ce cadre, l'analyse des résultats des simulations est un point clé, que ce soit en phase de développement pour valider les codes ou détecter des erreurs, ou en phase de production pour confronter les résultats à la réalité expérimentale. Dans tous les cas, le pilotage de simulations peut aider durant ce processus d'analyse des résultats. L'objectif de cette thèse est de concevoir et de réaliser une plate-forme logicielle permettant de piloter de telles simulations. Plus précisément, il s'agit à partir d'un client de pilotage distant d'accéder ou de modifier les données de la simulation de manière cohérente, afin par exemple de visualiser "en-ligne" les résultats intermédiaires. Pour ce faire, nous avons proposé un modèle de pilotage permettant de représenter des simulations couplées et d'interagir avec elles efficacement et de manière cohérente. Ces travaux ont été validés sur une simulation multi-échelles en physique des matériaux. / Numerical simulation is evolving toward ever more complex physical phenomena, typically handled by multi-scale or multi-physics codes. For such simulations, data analysis is a key issue, whether for detecting bugs during the development phase or for understanding the dynamics of the simulated physical phenomena during the production phase. Computational steering is a technique well suited to this kind of data analysis.
The goal of this thesis is to design and develop a computational steering framework that takes into account the complexity of coupled simulations. Through a computational steering client, we want to interact coherently with the data generated by coupled simulations, which makes it possible, for example, to visualize intermediate simulation results on-line. To this end, we introduce an abstract model that represents coupled simulations and determines when we can interact with them coherently. This work has been validated on a legacy multi-scale simulation in materials physics. Pilotage de simulation Simulation numérique Parallélisme Couplage de codes Modélisation de simulations Computational steering Numerical simulation Parallelism Code coupling Simulation modelisation
260	Placement d'applications parallèles en fonction de l'affinité et de la topologie / Placement of parallel applications according to the topology and the affinity Tessier, Francois 26 January 2015 (has links) La simulation numérique est un des piliers des Sciences et de l’industrie. La simulation météorologique, la cosmologie ou encore la modélisation du coeur humain sont autant de domaines dont les besoins en puissance de calcul sont sans cesse croissants. Dès lors, comment passer ces applications à l’échelle ? La parallélisation et les supercalculateurs massivement parallèles sont les seuls moyens d’y parvenir. Néanmoins, il y a un prix à payer compte tenu des topologies matérielles de plus en plus complexes, tant en termes de réseau que de hiérarchie mémoire. La question de la localité des données devient ainsi centrale : comment réduire la distance entre une entité logicielle et les données auxquelles elle doit accéder ? Le placement d’applications est un des leviers permettant de traiter ce problème. Dans cette thèse, nous présentons l’algorithme de placement TreeMatch et ses applications dans le cadre du placement statique, c’est-à-dire au lancement de l’application, et du placement dynamique. Pour cette seconde approche, nous proposons la prise en compte de la localité des données dans le cadre d’un algorithme d’équilibrage de charge. Les différentes approches abordées sont validées par des expériences réalisées tant sur des codes d’évaluation de performances que sur des applications réelles. / Computer simulation is one of the pillars of science and industry. Climate simulation, cosmology, and heart modeling are all areas in which computing power needs are constantly growing. How, then, can these applications be scaled? Parallelization and massively parallel supercomputers are the only ways to achieve this. Nevertheless, there is a price to pay given increasingly complex hardware topologies, in terms of both network and memory hierarchy.
The issue of data locality thus becomes central: how can the distance between a processing entity and the data it needs to access be reduced? Application placement is one of the levers for addressing this problem. In this thesis, we present the TreeMatch algorithm and its application to static mapping, that is, at the launch time of the application, and to dynamic placement. For this second approach, we propose taking data locality into account within a load balancing algorithm. The different approaches discussed are validated by experiments on both benchmarking codes and real applications. Calcul haute performance Parallélisme Localité Affinité Topologie Placement Équilibrage de charge High performance computing Parallelism Locality Affinity Topology Placement Load balancing
