1

Measuring, modeling, and optimizing counterintuitive performance phenomena in power-scalable, parallel systems

Chang, Hung-Ching 09 April 2015 (has links)
The demands of exascale computing systems and applications have pushed for a rapid, continual design paradigm coupled with increasing design complexity from the interaction between the application, the middleware, and the underlying system hardware, which forms a breeding ground for inefficiency. This work seeks to improve system efficiency by exposing the root causes of unexpected performance slowdowns (e.g., lower performance at higher processor speeds) that occur more frequently in power-scalable systems where raw processor speed varies. More precisely, we perform an exhaustive empirical study that conclusively shows that increasing processor speed often reduces performance and wastes energy. Our experiments show that the frequency of occurrence and the magnitude of slowdowns grow with clock frequency and parallelism, indicating that such slowdowns will be observed increasingly often given trends in processor and system design. Performance speedups at lower frequencies (or slowdowns at higher frequencies) have been anecdotally observed in the literature since 2004, but no research has explained or exploited this phenomenon. This work conclusively demonstrates that performance slowdowns during processor speedup phases can exceed 47% in common I/O workloads. Our hypothesis challenges (and ultimately debunks) a fundamental assumption in computer systems: that faster processor speeds result in the same or better performance. In this work, using code and kernel instrumentation, exhaustive experiments, and deep insight into the inner workings of the Linux I/O subsystem, I overcome the aforementioned challenges of variance, complexity, and nondeterminism and identify I/O resource contention as the root cause of the slowdowns observed during processor speedup. Specifically, the contention arises in the Linux kernel when the journaling block device (JBD) interacts with the ext3/4 file system, introducing file-write and file-synchronization delays. To explain how this I/O contention causes the performance anomaly, I propose analytical models of resource contention among I/O threads that describe the root cause of the observed I/O slowdowns when processors speed up. Finally, I introduce LUC, a runtime system that limits the unintended consequences of power scaling, and demonstrate its effectiveness for two critical parallel transaction-oriented workloads: a mail server (varMail) and online transaction processing (oltp). / Ph. D.
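To make the measurement concrete, here is a minimal Python sketch (an illustration of ours, not the dissertation's instrumentation) of the kind of fsync-heavy microbenchmark that exercises the JBD/ext3-4 journaling path described above. Running it with the CPU pinned to different fixed frequencies (set externally, e.g., with `cpupower frequency-set`) is one way to look for the counterintuitive slowdown; the file path and sizes are illustrative.

```python
# Minimal sketch of an fsync-heavy write loop. Synchronous writes force each
# block through the journal, the path where the dissertation locates the
# contention between the JBD and the ext3/4 file system.
import os
import time

def fsync_write_benchmark(path="/tmp/io_bench.dat", block=4096, count=2048):
    """Write `count` blocks of `block` bytes, fsync'ing after each write.

    Returns elapsed wall-clock seconds.
    """
    buf = os.urandom(block)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    start = time.perf_counter()
    for _ in range(count):
        os.write(fd, buf)
        os.fsync(fd)          # force the write through the journal
    elapsed = time.perf_counter() - start
    os.close(fd)
    os.remove(path)
    return elapsed

if __name__ == "__main__":
    # Run once per externally pinned CPU frequency; a longer elapsed time at
    # the higher frequency reproduces the phenomenon studied here.
    print(f"elapsed: {fsync_write_benchmark():.3f}s")
```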
2

Efficient Parallelization of 2D Ising Spin Systems

Feng, Shuangtong 28 December 2001 (has links)
The problem of efficiently parallelizing 2D Ising spin systems requires realistic algorithmic design and implementation based on an understanding of issues from both computer science and statistical physics. In this work, we not only consider fundamental parallel computing issues but also ensure that the major constraints and criteria of 2D Ising spin systems are incorporated into our study. This realism in both parallel computation and statistical physics has rarely been reflected in previous research on this problem. In this thesis, we designed and implemented a variety of parallel algorithms for both sweep spin selection and random spin selection. We analyzed our parallel algorithms on a portable and general parallel machine model, the LogP model, and obtained rigorous theoretical run-times on LogP for all of them. Moreover, a guiding equation was derived for choosing between data layouts (blocked vs. striped) for sweep spin selection. For random spin selection, we developed parallel algorithms with efficient communication schemes, analyzed the randomness of the schemes using statistical methods, and compared the different schemes. Finally, the algorithms were implemented, and performance data were gathered and analyzed to identify further design issues and validate the theoretical analysis. / Master of Science
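For readers unfamiliar with sweep spin selection, here is a minimal NumPy sketch (not the thesis code) of the standard checkerboard decomposition: sites of the same parity have no nearest-neighbor dependencies, so each half-sweep can be updated in parallel, which is precisely the structure that blocked and striped data layouts distribute across processors.

```python
# Checkerboard Metropolis sweep for a 2D Ising lattice with periodic
# boundaries and coupling J = 1. Within a half-sweep, all updated sites
# share no neighbors, so the neighbor sums computed up front stay valid.
import numpy as np

def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep: update black sites, then white sites."""
    ii, jj = np.indices(spins.shape)
    for parity in (0, 1):
        mask = (ii + jj) % 2 == parity
        # Sum of the four nearest neighbors (periodic via np.roll).
        nn = (np.roll(spins, 1, 0) + np.roll(spins, -1, 0) +
              np.roll(spins, 1, 1) + np.roll(spins, -1, 1))
        dE = 2.0 * spins * nn                  # energy change if flipped
        accept = rng.random(spins.shape) < np.exp(-beta * dE)
        spins[mask & accept] *= -1
    return spins

rng = np.random.default_rng(42)
lattice = rng.choice([-1, 1], size=(64, 64))
for _ in range(100):
    checkerboard_sweep(lattice, beta=0.44, rng=rng)   # near critical coupling
print("magnetization per spin:", lattice.mean())
```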
3

Efficient spatial join processing in a parallel and distributed environment based on SpatialHadoop

Mendes, Eduardo Fernando 17 February 2017 (has links)
The huge volume of spatial data generated and made available in recent years from sources such as remote sensing, smartphones, space telescopes, and satellites has motivated researchers and practitioners around the world to find ways to process this data efficiently. Systems based on the MapReduce programming paradigm, such as Hadoop, have proven to be an efficient framework for processing huge volumes of data in many applications. However, Hadoop is not adequate for native support of spatial data because its core structure is unaware of the spatial characteristics of such data. This problem gave rise to SpatialHadoop, a Hadoop extension with native support for spatial data. However, SpatialHadoop cannot jointly allocate related spatial data and does not take any characteristics of the data into account when scheduling tasks onto the nodes of a cluster. Given this scenario, this PhD dissertation proposes new strategies to improve the performance of spatial join processing over huge volumes of data using SpatialHadoop. The proposed solutions explore the joint allocation of related spatial data together with a MapReduce task-scheduling strategy that is aware of this joint allocation. Efficient data access is an essential step toward better query-processing performance; the proposed solutions reduce network traffic and disk I/O operations and consequently improve the performance of spatial join processing in SpatialHadoop. Experimental evaluations show that the new data-allocation and task-scheduling policies indeed improve the total processing time of spatial join operations: the performance gain ranges from 14.7% to 23.6% compared with the CoS-HDFS baseline and from 8.3% to 65% compared with native SpatialHadoop.
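As a concrete illustration of the workload being optimized, the sketch below (plain Python of ours, not SpatialHadoop code) shows the partition-based spatial join pattern MapReduce systems use: rectangles are mapped to the grid cells they overlap, and the join runs independently per cell, the unit of work a task would process. Co-locating the related partitions of both datasets on the same node, as the proposed allocation policy does, lets each per-cell join run without network transfer.

```python
# Grid-partitioned spatial join over axis-aligned rectangles
# (xmin, ymin, xmax, ymax). Cell size is illustrative.
from collections import defaultdict

def cells_overlapped(rect, cell_size):
    """Yield the (cx, cy) grid cells overlapped by rect."""
    xmin, ymin, xmax, ymax = rect
    for cx in range(int(xmin // cell_size), int(xmax // cell_size) + 1):
        for cy in range(int(ymin // cell_size), int(ymax // cell_size) + 1):
            yield (cx, cy)

def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def grid_spatial_join(dataset_r, dataset_s, cell_size=10.0):
    """Map both datasets to grid cells, then join within each cell."""
    grid_r, grid_s = defaultdict(list), defaultdict(list)
    for i, r in enumerate(dataset_r):
        for cell in cells_overlapped(r, cell_size):
            grid_r[cell].append(i)
    for j, s in enumerate(dataset_s):
        for cell in cells_overlapped(s, cell_size):
            grid_s[cell].append(j)
    results = set()                      # dedupe pairs found in several cells
    for cell, r_ids in grid_r.items():
        for i in r_ids:
            for j in grid_s.get(cell, ()):
                if intersects(dataset_r[i], dataset_s[j]):
                    results.add((i, j))
    return results

R = [(0, 0, 5, 5), (12, 12, 18, 18)]
S = [(3, 3, 8, 8), (40, 40, 45, 45)]
print(grid_spatial_join(R, S))           # {(0, 0)}
```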
4

Impact of cooperation in new high-performance computing platforms

Angelis Cordeiro, Daniel de 09 February 2012 (has links)
Computer science is profoundly changing methodological aspects of the discovery process in different areas of knowledge. Researchers now have at their disposal new capabilities that can create novel research opportunities. Parallel and distributed platforms composed of resources shared between different participants can make these capabilities accessible to every researcher, delivering computational power that was previously restricted to the largest (and wealthiest) scientific projects. This work explores four facets of the rules that govern how organizations engage in collaboration on modern parallel and distributed platforms.
Using classical combinatorial tools, multi-objective scheduling, and game theory, we show how to compute schedules with good trade-offs between the results obtained by the participants and the global performance of the platform. By ensuring fair results and guaranteeing performance improvements for the participants, we can create an efficient platform where everyone always feels encouraged to collaborate and to share their resources. First, we study collaboration between selfish organizations. We show how selfish behavior among the participants imposes a lower bound on the global makespan, and we present algorithms that cope with the organizations' selfishness and achieve good fairness in practice. The second study concerns collaboration between organizations that can tolerate a limited degradation of their own performance if this helps improve the global makespan. We improve the existing inapproximability bounds for this problem and present new algorithms whose guarantees are close to the Pareto set (the set of best possible compromise solutions). The third form of collaboration studied is between rational participants that can independently choose the best strategy for their jobs. We present a non-cooperative game-theoretic model for the problem and show how coordination mechanisms allow the creation of approximate pure equilibria with bounded price of anarchy. Finally, we study collaboration between users sharing a set of common resources. We present a method that enumerates the frontier of best-compromise solutions for the users and selects the solution that brings the best value for the global performance function.
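A toy example helps fix intuitions about the trade-off between selfishness and global performance. The sketch below (an illustration of ours under simplified assumptions: one machine per organization, sequential jobs) contrasts the makespan when each organization schedules alone with the makespan of cooperative LPT list scheduling over the pooled machines; the fairness condition studied in the thesis would additionally forbid making any organization worse off than running alone.

```python
# Selfish vs. cooperative scheduling across organizations, each owning one
# machine and a list of job processing times.
def makespan(loads):
    return max(loads)

def selfish_schedule(org_jobs):
    """Each organization runs only its own jobs on its own machine."""
    return [sum(jobs) for jobs in org_jobs]

def cooperative_lpt(org_jobs):
    """Longest-Processing-Time list scheduling over the pooled machines."""
    loads = [0.0] * len(org_jobs)
    for job in sorted((j for jobs in org_jobs for j in jobs), reverse=True):
        k = loads.index(min(loads))   # place on the least-loaded machine
        loads[k] += job
    return loads

orgs = [[8, 7, 2], [3], [1, 1]]       # three organizations, unequal loads
print("selfish makespan:    ", makespan(selfish_schedule(orgs)))   # 17
print("cooperative makespan:", makespan(cooperative_lpt(orgs)))    # 8
```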
5

AI-WSN: Adaptive and Intelligent Wireless Sensor Networks

Li, Jiakai 24 September 2012 (has links)
No description available.
6

High-performant, Replicated, Queue-oriented Transaction Processing Systems on Modern Computing Infrastructures

Thamir Qadah (11132985) 27 July 2021 (has links)
With the shifting landscape of computing hardware architectures and the emergence of new computing environments (e.g., large main-memory systems, hundreds of CPUs, distributed and virtualized cloud-based resources), state-of-the-art designs of transaction processing systems that rely on conventional wisdom suffer from lost performance optimization opportunities. This dissertation challenges conventional wisdom and rethinks the design and implementation of transaction processing systems for modern computing environments.

We start by tackling the vertical hardware-scaling challenge and propose a deterministic approach to transaction processing on emerging multi-socket, many-core, shared-memory architectures to harness their unprecedented available parallelism. Our priority-based, queue-oriented transaction processing architecture eliminates the transaction contention footprint and uses speculative execution to improve the throughput of centralized deterministic transaction processing systems. We build QueCC and demonstrate up to two orders of magnitude better performance than the state of the art.

We then tackle the horizontal-scaling challenge and propose a distributed queue-oriented transaction processing engine that relies on queue-oriented communication to eliminate the traditional overhead of commitment protocols for multi-partition transactions. We build Q-Store and demonstrate up to 22x improvement in system throughput over state-of-the-art deterministic transaction processing systems.

Finally, we propose a generalized framework for designing distributed, replicated, deterministic transaction processing systems. We introduce the concept of speculative replication to hide the latency overhead of replication, prototype the speculative replication protocol in QR-Store, and perform an extensive experimental evaluation using standard benchmarks. We show that QR-Store can achieve a throughput of 1.9 million replicated transactions per second in under 200 milliseconds, with a replication overhead of 8%-25% compared to non-replicated configurations.
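To illustrate the queue-oriented idea in miniature, the following Python sketch (an assumption-laden illustration of ours, not the QueCC implementation) separates a planning phase, which assigns each transaction a priority and appends its operations to per-record queues, from an execution phase that drains each queue independently in priority order without locks; the deterministic order makes the outcome equivalent to serial execution by priority.

```python
# Queue-oriented deterministic execution in miniature: plan, then execute.
from collections import defaultdict

def plan(transactions):
    """Planning phase: build priority-ordered per-record operation queues."""
    queues = defaultdict(list)
    for priority, txn in enumerate(transactions):   # arrival order = priority
        for op, key, value in txn:
            queues[key].append((priority, op, value))
    for q in queues.values():
        q.sort(key=lambda e: e[0])                  # deterministic order
    return queues

def execute(queues, store):
    """Execution phase: each queue could be drained by a separate worker,
    since queues touch disjoint records -- no locks or aborts needed."""
    for key, q in queues.items():
        for _, op, value in q:
            if op == "write":
                store[key] = value
            elif op == "add":
                store[key] = store.get(key, 0) + value
    return store

txns = [
    [("write", "x", 1), ("add", "y", 5)],           # T0: higher priority
    [("add", "x", 10)],                             # T1
]
print(execute(plan(txns), {}))                      # {'x': 11, 'y': 5}
```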
