Global ETD Search

1	IMPROVING MESSAGE-PASSING PERFORMANCE AND SCALABILITY IN HIGH-PERFORMANCE CLUSTERS RASHTI, Mohammad Javad 26 January 2011
High Performance Computing (HPC) is the key to solving many scientific, financial, and engineering problems. Computer clusters are now the dominant architecture for HPC. The scale of clusters, in terms of both processors per node and number of nodes, is increasing rapidly, reaching petascale today and soon exascale. Inter-process communication plays a significant role in the overall performance of HPC applications. With the continuous enhancements in interconnection technologies and node architectures, the Message Passing Interface (MPI) needs to be improved to use modern technologies effectively for higher performance. After providing background, I present an in-depth analysis of the user-level and MPI libraries over modern cluster interconnects: InfiniBand, iWARP Ethernet, and Myrinet. Using novel techniques, I assess characteristics such as overlap and communication progress ability, the effect of buffer reuse on latency, and multiple-connection scalability. The outcome highlights some of the inefficiencies that exist in the communication libraries. To improve communication progress and overlap in large message transfers, I propose a method that uses speculative communication to overlap communication with computation in the MPI Rendezvous protocol. The results show up to 100% communication progress and more than 80% overlap ability over iWARP Ethernet. An adaptation mechanism avoids overhead on applications that do not benefit from the method because of their timing specifications. To reduce MPI communication latency, I propose a technique that exploits the application's buffer reuse characteristics for small messages and eliminates the sender-side copy in both two-sided and one-sided MPI small-message transfer protocols. The implementation over InfiniBand improves small-message latency by up to 20%. The implementation adaptively falls back to the current method if the application does not benefit from the proposed technique. Finally, to improve the scalability of MPI applications on ultra-scale clusters, I propose an extension to the current iWARP standard. The extension improves performance and memory usage for large-scale clusters by equipping Ethernet with an efficient zero-copy, connection-less datagram transport. The software-level evaluation shows performance benefits of more than 40% and a 30% reduction in memory usage for MPI applications on a 64-core cluster. / Thesis (Ph.D, Electrical & Computer Engineering) -- Queen's University, 2010-10-16 12:25:18.388
High Performance Computing; Message Passing; Computer Clusters; Interconnection Networks
2	ESTUDO SOBRE O CONSUMO DE ENERGIA ELÉTRICA EM AGLOMERADOS DE COMPUTADORES COM UTILIZAÇÃO DO FRAMEWORK OAR / STUDY ON THE ELECTRICITY CONSUMPTION IN COMPUTER CLUSTERS WITH USE OF THE OAR FRAMEWORK Albiero, Fábio Weber 25 February 2013
Fundação de Amparo a Pesquisa no Estado do Rio Grande do Sul / Our society increasingly relies on computers to perform various tasks. The high rate of use of such equipment increases electricity consumption. There are two possible ways to meet the growing demand for energy. The first is to increase production, which is difficult because it requires building new energy sources. The second is to promote more efficient use of energy, so that the demand for computing power can be met without increasing electricity consumption. In this case, that means optimizing the energy performance of the electronic devices in computing systems. High-performance systems (computer clusters and grids) are excellent targets for optimizing energy consumption, since they consume large amounts of electricity. This work therefore presents a study of energy consumption in computer clusters using the OAR framework (Optimal Allocation of Resources). The study aims to measure the electricity consumed in various cluster usage configurations. With respect to the available computational resources, the measurements will help answer important questions about electrical energy management, such as which configuration saves the most energy and how much energy can be saved.
Electricity consumption; Computer clusters; OAR
3	Políticas de escalonamento memory-intensive para aplicações distribuídas / Memory-intensive scheduling policies for distributed applications Alves, Luís Cézar Darienzo 24 June 2008
This dissertation addresses process scheduling on computer clusters, on both homogeneous and heterogeneous platforms. The heterogeneities considered include processor computational power, the amount of main memory in the system, and the average disk access time. Four novel policies are proposed for load sharing in these environments, considering workloads whose applications range from CPU-bound to memory-intensive. Of the four policies, one uses only CPU indices, while the others also use memory indices. The results were obtained through trace-based simulations and show that the proposed scheduling policies significantly reduce the observed performance losses. Traditional scheduling policies from the literature were used as references.
Performance evaluation; Computer clusters; Processes scheduling; Single-system image
5	Execution Of Distributed Database Queries On A Hpc System Onder, Ibrahim Seckin 01 January 2010
The increasing performance of computers and the ability to connect them with high-speed communication networks make distributed database systems an attractive research area. In this study, we evaluate the communication and data processing capabilities of an HPC machine. We calculate accurate cost formulas for high-volume data communication between processing nodes and experimentally measure sorting times. A left-deep query plan executor has been implemented and used experimentally to execute plans generated by two different genetic algorithms for a distributed database environment using the message-passing paradigm, in order to show that a parallel system can provide scalable performance as the number of nodes used for storing database relations and the number of processing nodes increase. We compare the performance of plans generated by the genetic algorithms with optimal plans generated by an exhaustive search algorithm. As expected, our results verify that the optimal plans outperform those generated by the genetic algorithms.
