31 |
Parallel Computing for Applications in Aeronautical CFD
Ytterström, Anders January 2001
No description available.
|
32 |
Performance Benchmarking of Fast Multipole Methods
Al-Harthi, Noha A. 06 1900
Current trends in computer architecture are shifting towards smaller byte/flop ratios, while available parallelism is increasing at all levels of granularity: vector length, core count, and MPI process count. Intel’s Xeon Phi coprocessor, NVIDIA’s Kepler GPU, and IBM’s BlueGene/Q all have a byte/flop ratio close to 0.2, which makes it very difficult for most algorithms to extract a high percentage of the theoretical peak flop/s from these architectures. Popular algorithms in scientific computing such as the FFT are continuously evolving to keep up with this trend in hardware. In the meantime, it is also necessary to invest in novel algorithms that are more suitable for the computer architectures of the future.
The fast multipole method (FMM) was originally developed as a fast algorithm for approximating the N-body interactions that appear in astrophysics, molecular dynamics, and vortex-based fluid dynamics simulations. The FMM has the unique combination of being an efficient O(N) algorithm while having an operational intensity higher than that of matrix-matrix multiplication. In fact, the FMM can reduce the byte/flop requirement to around 0.01, which means that it will remain compute bound until 2020 even if the current trend in microprocessors continues. Despite these advantages, there have not been any benchmarks of FMM codes on modern architectures such as Xeon Phi, Kepler, and BlueGene/Q.
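The byte/flop argument above can be made concrete with a simple roofline-style estimate. The sketch below is illustrative only: the function name and the bandwidth-hungry contrast kernel are assumptions, and only the 0.2 and 0.01 figures come from the abstract.

```python
def attainable_fraction(machine_balance, kernel_bytes_per_flop):
    """Fraction of peak flop/s attainable under a simple roofline model.

    machine_balance: bytes the memory system delivers per peak flop (byte/flop).
    kernel_bytes_per_flop: bytes the kernel must move per flop it performs.
    """
    if kernel_bytes_per_flop <= machine_balance:
        return 1.0  # memory keeps up with the arithmetic: compute bound
    # Memory bound: performance is capped by how fast data arrives.
    return machine_balance / kernel_bytes_per_flop


machine = 0.2    # byte/flop quoted in the abstract for Xeon Phi, Kepler, BlueGene/Q
fmm = 0.01       # byte/flop requirement quoted in the abstract for the FMM
streaming = 4.0  # hypothetical bandwidth-hungry kernel, for contrast only

print(attainable_fraction(machine, fmm))        # compute bound
print(attainable_fraction(machine, streaming))  # memory bound, small fraction of peak
```

Under this model, a kernel stays compute bound for as long as the machine's byte/flop ratio remains above the kernel's own requirement, which is the sense in which the abstract expects the FMM to remain compute bound as the hardware ratio shrinks.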
This study aims to provide a comprehensive benchmark of a state-of-the-art FMM code, “exaFMM”, on the latest architectures, in the hope of providing a useful reference for deciding when the FMM will become useful as the computational engine in a given application code. It may also serve as a warning about certain problem-size regimes where the FMM will exhibit insignificant performance improvements. Such issues depend strongly on the asymptotic constants rather than the asymptotics themselves, and are therefore strongly implementation and hardware dependent. The primary objective of this study is to provide these constants on various computer architectures.
|
33 |
Multipath Probabilistic Early Response TCP
Singh, Ankit 2012 August 1900
Many computers and devices such as smartphones, laptops and tablets are now equipped with multiple network interfaces, enabling them to use multiple paths to access content over the network. If these resources could be used concurrently, the end-user experience could be greatly improved. Recent studies of MPTCP suggest that improved reliability, load balancing and mobility are feasible. This thesis presents a new multipath delay-based algorithm, MPPERT (Multipath Probabilistic Early Response TCP), which provides high throughput and efficient load balancing. In an all-PERT environment, MPPERT suffers no packet loss and maintains much smaller queue sizes than existing MPTCP, making it suitable for real-time data transfer. MPPERT is also suitable for incremental deployment in a heterogeneous environment. The thesis further presents a parametrized approach to tuning the amount of traffic shifted off the congested path.
A multipath approach benefits from having multiple connections between end hosts. However, it is desirable to keep the connection set minimal, as increasing the number of paths may not always provide a significant increase in performance. Moreover, a higher number of paths unnecessarily increases the computational requirements. Ideally, we should suppress paths with low throughput and avoid paths with shared bottlenecks. In the case of MPTCP, there is no efficient way to detect a common bottleneck between subflows. MPTCP applies a constraint of best single-path TCP throughput to ensure a fair share at a common bottleneck link. The best-path throughput constraint, along with the traffic shift from more congested to less congested paths, gives competing flows a better opportunity to achieve higher throughput. The disadvantage, however, is that even if there are no shared links, the same constraint decreases the overall achievable throughput of a multipath flow.
PERT, being a delay-based TCP protocol, has continuous information about the state of the queue. This information is valuable in enabling MPPERT to detect subflows sharing a common bottleneck and obtain a smaller set of disjoint subflows. The information can even be used to switch from coupled subflows (a set of subflows with interdependent increase/decrease of congestion windows) to uncoupled subflows (independent increase/decrease of congestion windows), yielding higher throughput when the best single-path TCP constraint is relaxed. The ns-2 simulations support MPPERT as a highly competitive multipath approach, suitable for real-time data transfer, that is capable of offering higher throughput and improved reliability.
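The abstract does not say how MPPERT uses queue-state information to detect a shared bottleneck. One plausible illustration, offered purely as an assumption and not as the thesis's method, is to correlate the queueing-delay samples of two subflows: subflows behind the same queue tend to see delays rise and fall together. The function names and the 0.8 threshold below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sample lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def share_bottleneck(delays_a, delays_b, threshold=0.8):
    """Guess that two subflows share a queue if their delay samples move together.

    The threshold is a hypothetical tuning knob, not a value from the thesis.
    """
    return pearson(delays_a, delays_b) >= threshold


# Queueing-delay samples (ms) taken at the same instants on three subflows.
flow_a = [1, 2, 3, 4, 5]
flow_b = [2, 4, 6, 8, 10]  # rises and falls with flow_a
flow_c = [5, 1, 4, 2, 3]   # unrelated pattern

print(share_bottleneck(flow_a, flow_b))
print(share_bottleneck(flow_a, flow_c))
```

A sender that judges two subflows to be disjoint in this way could treat them as uncoupled, while subflows judged to share a bottleneck would stay coupled to remain fair to competing single-path flows.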
|
34 |
A case study of handling load spikes in authentication systems
Sverrisson, Kristjon January 2008
The growth in the number of users of Internet services in recent years has created a need for network providers to rethink their methods for user authentication, authorization and accounting. To deal with this growing demand for Internet services, the underlying user authentication systems have to be able to, among other things, handle load spikes. This can be achieved by using load balancing, and there are both adaptive and non-adaptive methods of load balancing.
This case study compares adaptive and non-adaptive load balancing for user authentication in terms of average throughput. To do this we set up a lab in which we test two different load-balancing methods: a non-adaptive one and an adaptive one.
The non-adaptive load-balancing method is simple, using only a pool of servers to which the load is directed in a round-robin way, whereas the adaptive load-balancing method tries to direct the load using a calculation based on the previous requests.
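The difference between the two methods can be sketched in a few lines. The round-robin balancer follows the description above directly. The adaptive balancer is an assumption: the abstract only says it uses "a calculation of the previous requests", so this sketch picks the server with the lowest moving average of past response times. All class and parameter names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Non-adaptive: hands out servers from the pool in a fixed rotation."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self):
        return next(self._rotation)


class AdaptiveBalancer:
    """Adaptive (assumed policy): prefer the server whose previous requests
    completed fastest, tracked as an exponential moving average."""

    def __init__(self, servers, alpha=0.5):
        self._avg = {server: 0.0 for server in servers}
        self._alpha = alpha  # weight given to the newest observation

    def pick(self):
        # Ties go to the server listed first in the pool.
        return min(self._avg, key=self._avg.get)

    def record(self, server, response_time):
        old = self._avg[server]
        self._avg[server] = (1 - self._alpha) * old + self._alpha * response_time


rr = RoundRobinBalancer(["auth1", "auth2", "auth3"])
print([rr.pick() for _ in range(4)])

adaptive = AdaptiveBalancer(["auth1", "auth2"])
adaptive.record("auth1", 2.0)  # auth1 answered slowly
print(adaptive.pick())         # so the next request goes to auth2
```

During a load spike the round-robin balancer keeps sending every n-th request to a struggling server, whereas the adaptive one steers traffic away from it as soon as its recorded response times rise.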
|
35 |
Energy-aware load balancing approaches to improve energy efficiency on HPC systems / Abordagens de balanceamento de carga ciente de energia para melhorar a eficiência energética em sistemas HPC
Padoin, Edson Luiz January 2016
Current HPC systems have made more complex simulations feasible, yielding benefits to several research areas. To meet the increasing processing demands of these simulations, new equipment is being designed, aiming at the exaflops scale. A major challenge in building these systems is the power they will require, which current projections put in the gigawatt range. To address this problem, this thesis presents an approach to increase the energy efficiency of HPC resource usage, aiming to reduce the effects of load imbalance and save energy. We developed an energy-aware strategy, called ENERGYLB, which considers platform characteristics and the load irregularity and dynamicity of the applications to improve energy efficiency. Our strategy takes into account the current computational load and the clock frequency of the cores to decide whether to call a load balancing strategy that reduces load imbalance by migrating tasks, or to use the Dynamic Voltage and Frequency Scaling (DVFS) technique to adjust the clock frequencies of the cores according to their weighted loads. As different processor architectures can feature two levels of DVFS granularity, per-chip DVFS or per-core DVFS, we created two different algorithms for our strategy. The first one, FG-ENERGYLB, allows fine control of the clock frequency of cores in systems that have a few tens of cores and feature per-core DVFS control. On the other hand, CG-ENERGYLB is suitable for HPC platforms composed of several multicore processors that do not allow such fine-grained control, i.e., that only perform per-chip DVFS. Both approaches exploit residual imbalances in iterative applications and combine dynamic load balancing with DVFS techniques. Thus, they reduce the clock frequency of underloaded computing cores, which experience some residual imbalance even after tasks are remapped. We evaluate the applicability of our approaches using the CHARM++ parallel programming system on benchmarks and real-world applications. Experimental results show improvements in energy consumption and power demand over state-of-the-art algorithms. The energy savings with ENERGYLB used alone were up to 25% with our FG-ENERGYLB algorithm, and up to 27% with our CG-ENERGYLB algorithm. Nevertheless, residual imbalances were still present after tasks were remapped. In this case, when our approaches were employed together with other load balancers, an improvement in energy savings of up to 56% was achieved with FG-ENERGYLB and up to 36% with CG-ENERGYLB. These savings were obtained by exploiting residual imbalances in iterative applications. By combining dynamic load balancing with the DVFS technique, our approach is able to reduce the average power demand of parallel systems, reduce task migration among the available resources, and keep load balancing overheads low.
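The decision between migrating tasks and applying per-core DVFS can be illustrated with a small sketch. This is not the ENERGYLB implementation: the imbalance metric, the 0.25 threshold, the frequency levels and all function names are assumptions chosen for illustration. The frequency rule picks, for each core, the lowest level at which it still finishes no later than the most loaded core running at full speed.

```python
def imbalance(loads):
    """Relative imbalance: how far the heaviest core is above the average."""
    avg = sum(loads) / len(loads)
    if avg == 0:
        return 0.0
    return max(loads) / avg - 1.0


def per_core_frequencies(loads, levels):
    """Lowest frequency per core that does not delay the iteration.

    A core with load L running at f takes time proportional to L / f, so it
    keeps up with the heaviest core at f_max whenever f * heaviest >= f_max * L.
    """
    f_max = max(levels)
    heaviest = max(loads)
    return [
        min(f for f in levels if f * heaviest >= f_max * load)
        for load in loads
    ]


def decide(loads, levels, threshold=0.25):
    """Migrate tasks when imbalance is large, otherwise slow underloaded cores.

    The threshold is a hypothetical value, not one taken from the thesis.
    """
    if imbalance(loads) > threshold:
        return ("migrate", None)
    return ("dvfs", per_core_frequencies(loads, levels))


levels_mhz = [1200, 1600, 2000, 2400]      # assumed available DVFS levels
print(decide([10, 9, 8, 10], levels_mhz))  # small residual imbalance: use DVFS
print(decide([10, 2, 2, 2], levels_mhz))   # large imbalance: migrate tasks first
```

On a platform with only per-chip DVFS, the same rule would have to be applied to the heaviest core of each chip rather than to each core individually, which is the coarser-grained case the abstract assigns to CG-ENERGYLB.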
|
38 |
Enhancing Load Balancing Efficiency Based on Migration Delay for Distributed Virtual Simulations
Alghamdi, Turki January 2015
Load management is an essential factor for distributed simulations running on shared resources, because load imbalances can cause considerable performance loss. High Level Architecture (HLA)-based simulation is a framework that facilitates the design and management of distributed simulations. HLA coordinates the interaction between simulation entities (federates). However, HLA-based simulation standards do not provide the ability to manage resources or help detect load imbalances that could directly decrease performance. Focusing on this constraint, a migration-aware dynamic balancing system has been designed for HLA simulations to offer an efficient load-balancing scheme that works in large-scale environments. This system has some limitations in estimating costs and benefits, so we propose an enhancement to this existing load balancing system that improves the accuracy of estimating the number of migrations for the next load redistribution. The proposed scheme detects load imbalances by evaluating the overhead on the resources. The scheme classifies the resources, based on this overhead, as overloaded or underloaded, and then matches the most overloaded resources with the least loaded underloaded resources. Furthermore, the proposed scheme aims to precisely estimate the number of migrations by evaluating and analyzing the resources to obtain the best number of migrations. As a result, migrations that do not contribute to an improvement in simulation performance are avoided. This avoidance is based on comparing time delay and time gain. Moreover, for a set of migrations to be considered, the overall sum of the time gains should be larger than the overall sum of the time delays. The proposed scheme has shown an improvement in execution time.
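The matching and filtering steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's scheme: load is a single number per resource, the mean load is the overloaded/underloaded boundary, each migration has the same fixed delay, and time gain is proportional to the amount of load moved. All names and parameters are hypothetical.

```python
def plan_migrations(loads, migration_delay, gain_per_unit):
    """Return a list of (source, destination, amount) migrations.

    loads: dict mapping resource name to its current load.
    migration_delay: assumed fixed time cost of one migration.
    gain_per_unit: assumed time gained per unit of load moved.
    """
    mean = sum(loads.values()) / len(loads)
    # Classify resources and order them: heaviest overloaded first,
    # lightest underloaded first, so the extremes are matched together.
    overloaded = sorted((r for r in loads if loads[r] > mean),
                        key=loads.get, reverse=True)
    underloaded = sorted((r for r in loads if loads[r] < mean),
                         key=loads.get)

    plan, total_gain, total_delay = [], 0.0, 0.0
    for src, dst in zip(overloaded, underloaded):
        amount = min(loads[src] - mean, mean - loads[dst])
        gain = amount * gain_per_unit
        if gain <= migration_delay:
            continue  # this migration would cost more time than it saves
        plan.append((src, dst, amount))
        total_gain += gain
        total_delay += migration_delay

    # Aggregate check from the abstract. With a fixed per-migration delay it
    # is implied by the per-pair filter, but it is kept to mirror the text.
    return plan if total_gain > total_delay else []


loads = {"node_a": 90, "node_b": 50, "node_c": 10, "node_d": 50}
print(plan_migrations(loads, migration_delay=5, gain_per_unit=0.5))
print(plan_migrations(loads, migration_delay=30, gain_per_unit=0.5))
```

With a cheap migration the plan moves load from the heaviest to the lightest node, while with an expensive one the plan is empty because the delay outweighs the gain.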
|
39 |
Vyvažování zátěže v sítích OpenFlow / Load Balancing in OpenFlow Networks
Marciniak, Petr January 2013
The aim of this thesis is to develop a load balancing tool for OpenFlow networks. Software-defined networking (SDN) principles are introduced, using the OpenFlow protocol as an example, and compared to legacy routing and switching technology. OpenFlow is the first protocol/API enabling communication between the control and infrastructure planes of the software-defined networking model. Key features of the protocol are described and several OpenFlow controllers are introduced. Current best practices for load balancing in computer networks are discussed as well. The development process of the load balancing application is described, including the test laboratory setups: Mininet (software) and OFELIA (hardware). The application test results are evaluated and possible further enhancements to the program are discussed.
|
40 |
Clustering a load balancing serveru pro zpracování řeči / Clustering and Load Balancing of a Speech Processing Server
Trnka, Miroslav January 2017
This paper deals with the possibilities for load balancing and clustering of an existing speech processing server. It analyzes the problems of load balancing and clustering, and describes the concepts of network programming and the options for I/O processing. A new load balancer is designed, fully customized for the needs of the speech processing server. The newly designed load balancer is implemented and thoroughly tested.
|