1 |
Measuring, modeling, and optimizing counterintuitive performance phenomena in power-scalable, parallel systems
Chang, Hung-Ching 09 April 2015
The demands of exascale computing systems and applications have pushed for a rapid, continual design paradigm coupled with increasing design complexities from the interaction between the application, the middleware, and the underlying system hardware, which forms a breeding ground for inefficiency. This work seeks to improve system efficiency by exposing the root causes of unexpected performance slowdowns (e.g., lower performance at higher processor speeds) that occur more frequently in power-scalable systems where raw processor speed varies. More precisely, we perform an exhaustive empirical study that conclusively shows that increasing processor speed often reduces performance and wastes energy. Our experimental work shows that the frequency of occurrence and magnitude of slowdowns grow with clock frequency and parallelism, indicating that such slowdowns will increasingly be observed with trends in processor and system design.
Performance speedups at lower frequencies (or slowdowns at higher frequencies) have been anecdotally observed in the prevailing literature since 2004, but no research has explained or exploited this phenomenon. This work conclusively demonstrates that performance slowdowns during processor speedup phases can exceed 47% in common I/O workloads. Our hypothesis challenges (and ultimately debunks) a fundamental assumption in computer systems: faster processor speeds result in the same or better performance.
In this work, using code and kernel instrumentation, exhaustive experiments, and deep insight into the inner workings of the Linux I/O subsystem, I overcome the challenges of variance, complexity, and nondeterminism and identify I/O resource contention as the root cause of the slowdowns during processor speedup. Specifically, the contention arises in the Linux kernel when the journaling block device (JBD) interacts with the ext3/4 file system, which introduces file write delays and file synchronization delays. To fully explain how such I/O contention causes the performance anomaly, I propose analytical models of resource contention among I/O threads that describe the root cause of the observed I/O slowdowns when processors speed up. Finally, I introduce LUC, a runtime system that limits the unintended consequences of power scaling, and demonstrate its effectiveness for two critical parallel transaction-oriented workloads: a mail server (varMail) and online transaction processing (oltp). / Ph. D.
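The kind of anomaly this abstract describes (longer runtime at a higher clock frequency) can be flagged from plain timing data. The following is a minimal sketch, assuming runtimes have already been measured per frequency; the function names, the slowdown definition (relative change in runtime), and the sample timings are all illustrative assumptions and are not taken from the dissertation.

```python
def relative_slowdown(runtime_low_freq, runtime_high_freq):
    """Fractional change in runtime when moving from the lower to the higher frequency.

    A positive value means the run took longer at the higher frequency.
    """
    if runtime_low_freq <= 0 or runtime_high_freq <= 0:
        raise ValueError("runtimes must be positive")
    return (runtime_high_freq - runtime_low_freq) / runtime_low_freq


def find_anomalies(runtimes_by_freq):
    """Return (lower_ghz, higher_ghz, slowdown) for every adjacent pair of
    frequencies where the higher frequency was slower than the lower one."""
    freqs = sorted(runtimes_by_freq)
    anomalies = []
    for lower, higher in zip(freqs, freqs[1:]):
        slowdown = relative_slowdown(runtimes_by_freq[lower], runtimes_by_freq[higher])
        if slowdown > 0:
            anomalies.append((lower, higher, slowdown))
    return anomalies


# Hypothetical timings in seconds, keyed by clock frequency in GHz (made up for illustration).
measurements = {1.2: 100.0, 1.8: 90.0, 2.4: 120.0}
anomalies = find_anomalies(measurements)
```

With these made-up numbers, only the step from 1.8 GHz to 2.4 GHz is flagged, because the run gets slower there even though the clock is faster.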
|
2 |
Efficient Parallelization of 2D Ising Spin Systems
Feng, Shuangtong 28 December 2001
The problem of efficient parallelization of 2D Ising spin systems requires realistic algorithmic design and implementation based on an understanding of issues from computer science and statistical physics. In this work, we not only consider fundamental parallel computing issues but also ensure that the major constraints and criteria of 2D Ising spin systems are incorporated into our study. This realism in both parallel computation and statistical physics has rarely been reflected in previous research for this problem.
In this thesis, we designed and implemented a variety of parallel algorithms for both sweep spin selection and random spin selection. We analyzed our parallel algorithms on a portable and general parallel machine model, namely the LogP model. We were able to obtain rigorous theoretical run-times on LogP for all the parallel algorithms. Moreover, a guiding equation was derived for choosing data layouts (blocked vs. striped) for sweep spin selection. With regard to random spin selection, we were able to develop parallel algorithms with efficient communication schemes. We analyzed the randomness of our schemes using statistical methods and provided comparisons between the different schemes. Furthermore, algorithms were implemented and performance data gathered and analyzed in order to determine further design issues and validate the theoretical analysis. / Master of Science
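The trade-off between blocked and striped layouts can be illustrated by counting what each processor must exchange per sweep. This is a rough sketch and not the thesis's guiding equation, which the abstract does not state. It assumes an n x n lattice with periodic boundaries and nearest-neighbour coupling, enough processors that all neighbours are distinct, and a crude linear cost (a fixed charge per message plus a charge per boundary site) standing in for a full LogP analysis. All names and parameters are hypothetical.

```python
import math


def halo_exchange(n, p, layout):
    """Return (messages, sites) that one processor sends per sweep.

    striped: p horizontal strips, each with 2 neighbours and n sites per edge.
    blocked: sqrt(p) x sqrt(p) square blocks, each with 4 neighbours and
             n / sqrt(p) sites per edge.
    """
    if layout == "striped":
        if n % p != 0:
            raise ValueError("n must be divisible by p for a striped layout")
        return 2, 2 * n
    if layout == "blocked":
        q = math.isqrt(p)
        if q * q != p or n % q != 0:
            raise ValueError("p must be a perfect square whose root divides n")
        return 4, 4 * (n // q)
    raise ValueError("layout must be 'striped' or 'blocked'")


def exchange_cost(n, p, layout, per_message, per_site):
    """Simplified communication cost: per-message charge plus per-site charge."""
    messages, sites = halo_exchange(n, p, layout)
    return messages * per_message + sites * per_site


# 64 x 64 lattice on 16 processors.
cheap_messages = {
    layout: exchange_cost(64, 16, layout, per_message=10.0, per_site=1.0)
    for layout in ("striped", "blocked")
}
costly_messages = {
    layout: exchange_cost(64, 16, layout, per_message=100.0, per_site=1.0)
    for layout in ("striped", "blocked")
}
```

Under these assumptions the blocked layout sends fewer sites but more messages, so it wins when messages are cheap and loses when the per-message charge dominates. That is the kind of dependence a layout-selection equation has to capture.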
|
3 |
Efficient spatial join processing in a parallel and distributed environment based on SpatialHadoop (Processamento eficiente de junção espacial em ambiente paralelo e distribuído baseado em Spatialhadoop)
Mendes, Eduardo Fernando 17 February 2017
No funding received. / The huge volume of spatial data generated and made available in recent years from different sources, such as remote sensing, smartphones, space telescopes, and satellites, has motivated researchers and practitioners around the world to find ways to process this data efficiently. Systems based on the MapReduce programming paradigm, such as Hadoop, have proven to be an efficient framework for processing huge volumes of data in many applications. However, Hadoop has been shown to lack adequate native support for spatial data because its core structure is not aware of the spatial characteristics of such data. The solution to this problem gave rise to SpatialHadoop, a Hadoop extension with native support for spatial data. However, SpatialHadoop cannot jointly allocate related spatial data, and it does not take any characteristics of the data into account when scheduling tasks for processing on the nodes of a computer cluster. Given this scenario, this PhD dissertation proposes new strategies to improve the performance of spatial join operations on huge volumes of data using SpatialHadoop. For this purpose, the proposed solutions explore the joint allocation of related spatial data and a MapReduce task-scheduling strategy for related spatial data that has been jointly allocated. Efficient data access is an essential step in achieving better performance during query processing. Therefore, the proposed solutions reduce network traffic and disk I/O operations and consequently improve the performance of spatial join processing with SpatialHadoop. Experimental evaluations showed that the novel data allocation and task scheduling policies improve the total processing time of spatial join operations. The performance gain ranged from 14.7% to 23.6% compared to the baseline proposed by CoS-HDFS and from 8.3% to 65% compared to the native support of SpatialHadoop.
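The benefit of allocating related spatial data jointly can be seen in a small partitioned join. The sketch below is not SpatialHadoop's API and not the dissertation's allocation or scheduling policy. It is a plain-Python illustration that assumes axis-aligned rectangles given as (xmin, ymin, xmax, ymax) and a uniform grid shared by both inputs, so that each grid cell can be joined using only the data assigned to that cell. All function names are hypothetical.

```python
from collections import defaultdict


def cells_for(rect, cell_size):
    """Yield every grid cell (cx, cy) that the rectangle overlaps."""
    xmin, ymin, xmax, ymax = rect
    for cx in range(int(xmin // cell_size), int(xmax // cell_size) + 1):
        for cy in range(int(ymin // cell_size), int(ymax // cell_size) + 1):
            yield (cx, cy)


def intersects(a, b):
    """True if two closed axis-aligned rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def partition(rects, cell_size):
    """Map each grid cell to the indices of the rectangles overlapping it."""
    parts = defaultdict(list)
    for index, rect in enumerate(rects):
        for cell in cells_for(rect, cell_size):
            parts[cell].append(index)
    return parts


def colocated_join(left, right, cell_size):
    """Join two rectangle sets cell by cell.

    Because both inputs use the same grid, the work for one cell needs only
    that cell's two partitions, which could be stored on the same node.
    The result set removes duplicates from rectangles spanning several cells.
    """
    left_parts = partition(left, cell_size)
    right_parts = partition(right, cell_size)
    result = set()
    for cell in left_parts.keys() & right_parts.keys():
        for i in left_parts[cell]:
            for j in right_parts[cell]:
                if intersects(left[i], right[j]):
                    result.add((i, j))
    return result


left = [(0, 0, 2, 2), (10, 10, 11, 11), (0, 0, 9, 9)]
right = [(1, 1, 3, 3), (20, 20, 21, 21), (3, 3, 8, 8)]
pairs = colocated_join(left, right, cell_size=4)
```

In a cluster setting, each iteration of the loop over shared cells would correspond to one join task, and placing both partitions of a cell on the same node is what avoids the network transfer.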
|
4 |
Impact of cooperation in new high-performance computing platforms (Impact de la coopération dans les nouvelles plates-formes de calcul à hautes performances)
Angelis Cordeiro, Daniel de 09 February 2012
Computer science is deeply changing methodological aspects of the discovery process in different areas of knowledge. Researchers have at their disposal new capabilities that can create novel research opportunities. Parallel and distributed platforms composed of resources shared between different participants can make these new capabilities accessible to every researcher at every level, delivering computational power that was previously restricted to the largest (and wealthiest) scientific projects. This work explores four different facets of the rules that govern how organizations engage in collaboration on modern parallel and distributed platforms. Using classical combinatorial tools, multi-objective scheduling, and game theory, we show how to compute schedules with good trade-offs between the results obtained by the participants and the global performance of the platform. By ensuring fair results and guaranteeing performance improvements for the participants, we can create an efficient platform where everyone always feels encouraged to collaborate and to share their resources.
First, we study the collaboration between selfish organizations. We show how the selfish behavior of the participants imposes a lower bound on the global makespan. We present algorithms that cope with the selfishness of the organizations and that achieve good fairness in practice. The second study is about collaboration between organizations that can tolerate a limited degradation of their performance if this helps improve the global makespan. We improve the existing inapproximability bounds for this problem and present new algorithms whose guarantees are close to the Pareto set (the set of best possible solutions). The third form of collaboration studied is between rational participants that can independently choose the best strategy for their jobs. We present a non-cooperative game-theoretic model for the problem and show how coordination mechanisms allow the creation of approximate pure equilibria with a bounded price of anarchy. Finally, we study collaboration between users sharing a set of common resources. We present a method that enumerates the frontier of best compromise solutions for the users and selects the solution that brings the best value for the global performance function.
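The tension between global makespan and each organization's own result can be shown with a toy instance. The sketch below uses the classical longest-processing-time-first (LPT) list-scheduling heuristic. It is a stand-in chosen for brevity, not one of the algorithms from the thesis, and the organizations, job lengths, and function name are invented for illustration.

```python
import heapq


def lpt_schedule(jobs, machines):
    """Greedy LPT scheduling.

    jobs: list of (owner, length) pairs.
    Returns a dict mapping each owner to the completion time of its last job.
    """
    if machines < 1:
        raise ValueError("need at least one machine")
    # Heap of (current load, machine id); the least-loaded machine is popped first.
    heap = [(0.0, m) for m in range(machines)]
    heapq.heapify(heap)
    finish = {}
    # sorted() is stable, so equal-length jobs keep their input order.
    for owner, length in sorted(jobs, key=lambda job: job[1], reverse=True):
        load, machine = heapq.heappop(heap)
        end = load + length
        finish[owner] = max(finish.get(owner, 0.0), end)
        heapq.heappush(heap, (end, machine))
    return finish


# Two hypothetical organizations, each contributing one machine.
jobs_a = [("A", 8), ("A", 7), ("A", 6)]
jobs_b = [("B", 2)]

alone_a = lpt_schedule(jobs_a, machines=1)           # A runs only on its own machine
alone_b = lpt_schedule(jobs_b, machines=1)           # B runs only on its own machine
pooled = lpt_schedule(jobs_a + jobs_b, machines=2)   # both share both machines
```

In this instance, pooling cuts the global makespan from 21 to 13, but organization B now finishes at time 10 instead of 2. A selfish organization would reject such a schedule, which is why cooperative schedulers need constraints that protect each participant's own result.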
|
5 |
AI-WSN: Adaptive and Intelligent Wireless Sensor Networks
Li, Jiakai 24 September 2012
No description available.
|
6 |
High-performant, Replicated, Queue-oriented Transaction Processing Systems on Modern Computing Infrastructures
Thamir Qadah (11132985) 27 July 2021
With the shifting landscape of computing hardware architectures and the emergence of new computing environments (e.g., large main-memory systems, hundreds of CPUs, distributed and virtualized cloud-based resources), state-of-the-art designs of transaction processing systems that rely on conventional wisdom suffer from lost performance optimization opportunities. This dissertation challenges conventional wisdom to rethink the design and implementation of transaction processing systems for modern computing environments.
We start by tackling the vertical hardware scaling challenge and propose a deterministic approach to transaction processing on emerging multi-socket, many-core, shared-memory architectures to harness their unprecedented available parallelism. Our proposed priority-based, queue-oriented transaction processing architecture eliminates the transaction contention footprint and uses speculative execution to improve the throughput of centralized deterministic transaction processing systems. We build QueCC and demonstrate up to two orders of magnitude better performance over the state-of-the-art.
We further tackle the horizontal scaling challenge and propose a distributed queue-oriented transaction processing engine that relies on queue-oriented communication to eliminate the traditional overhead of commitment protocols for multi-partition transactions. We build Q-Store and demonstrate up to 22x improvement in system throughput over state-of-the-art deterministic transaction processing systems.
Finally, we propose a generalized framework for designing distributed and replicated deterministic transaction processing systems. We introduce the concept of speculative replication to hide the latency overhead of replication. We prototype the speculative replication protocol in QR-Store and perform an extensive experimental evaluation using standard benchmarks. We show that QR-Store can achieve a throughput of 1.9 million replicated transactions per second in under 200 milliseconds and a replication overhead of 8%-25% compared to non-replicated configurations.
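The general idea of queue-oriented deterministic execution can be sketched in two phases: a planning phase that turns an ordered batch of transactions into per-partition operation queues, and an execution phase that drains each queue in FIFO order. The code below is a simplified illustration only. It is not QueCC, Q-Store, or QR-Store, and it omits priorities, speculative execution, aborts, and replication. The operation format, the function names, and the CRC-based partitioning are assumptions made for the example.

```python
import zlib
from collections import deque


def partition_of(key, num_partitions):
    """Stable key-to-partition mapping (CRC32 is used only for illustration)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def plan(batch, num_partitions):
    """Planning phase: split an ordered batch into per-partition FIFO queues.

    batch: list of transactions; each transaction is a list of (key, op, value)
    with op either "set" or "add". Batch order fixes the serial order.
    """
    queues = [deque() for _ in range(num_partitions)]
    for txn_id, operations in enumerate(batch):
        for key, op, value in operations:
            queues[partition_of(key, num_partitions)].append((txn_id, key, op, value))
    return queues


def execute(queues):
    """Execution phase: drain every queue in order.

    Each key lives in exactly one queue, so the queues touch disjoint data and
    could be processed by separate workers without locks. Here they run
    one after another for simplicity.
    """
    store = {}
    for queue in queues:
        while queue:
            _txn_id, key, op, value = queue.popleft()
            if op == "set":
                store[key] = value
            elif op == "add":
                store[key] = store.get(key, 0) + value
            else:
                raise ValueError("unknown operation: " + op)
    return store


batch = [
    [("x", "set", 5), ("y", "add", 1)],   # transaction 0
    [("x", "add", 2)],                    # transaction 1
    [("y", "set", 10), ("x", "add", 1)],  # transaction 2
]
final_state = execute(plan(batch, num_partitions=4))
```

Because the order inside every queue follows the batch order, the final state is the same for any number of partitions, which is the property that lets a deterministic system avoid run-time concurrency control for these operations.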
|