41 |
Fast Barrier Synchronization for InfiniBand. Hoefler, Torsten, 04 January 2006 (has links)
Barrier synchronization is crucial for many parallel systems. This talk introduces different synchronization mechanisms and demonstrates new approaches that leverage special hardware properties of InfiniBand to lower barrier latency.
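For context, a classic software-only mechanism of the kind such comparisons start from is the dissemination barrier; the sketch below implements it over MPI point-to-point messages. It is an illustrative baseline with an assumed function name, not the InfiniBand-specific approach presented in the talk.

```c
/* Minimal sketch of a dissemination barrier built on MPI point-to-point
 * messages (illustrative baseline only; the talk targets InfiniBand-level
 * primitives, and this function name is hypothetical). */
#include <mpi.h>

void dissemination_barrier(MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    /* In round k (distance 2^k), rank r signals rank (r + 2^k) mod P and
     * waits for a signal from rank (r - 2^k + P) mod P; after ceil(log2 P)
     * rounds every process has transitively heard from all others. */
    for (int dist = 1; dist < size; dist <<= 1) {
        int to   = (rank + dist) % size;
        int from = (rank - dist + size) % size;
        MPI_Sendrecv(NULL, 0, MPI_BYTE, to, 0,
                     NULL, 0, MPI_BYTE, from, 0,
                     comm, MPI_STATUS_IGNORE);
    }
}
```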
|
42 |
Integration einer neuen InfiniBand-Schnittstelle in die vorhandene InfiniBand MPICH2 Software. Mosch, Marek, 25 April 2006 (has links)
Design of a unified API for using the Mellanox V-API and OpenIB Verbs, based on C preprocessor macros, and integration of this API into the existing MPICH2 CH3 device for InfiniBand.
|
43 |
Enhancing an InfiniBand driver by utilizing an efficient malloc/free library supporting multiple page sizes. Rex, Robert, 18 September 2006 (has links)
Despite the use of high-speed interconnects such as InfiniBand, the communication overhead of parallel applications, especially in the area of High-Performance Computing (HPC), is still high. Using large page frames, so-called hugepages in Linux, can speed up the crucial step of registering communication buffers with the network adapter; to this end, an InfiniBand driver was modified. Hugepages do not only reduce communication costs but can also improve computation time perceptibly, e.g. through fewer TLB misses. To avoid the effort of rewriting applications, a preload library was implemented that is able to utilize large page frames transparently. This work also shows benchmark results with these components and performance improvements of up to 10 %.
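As a rough illustration of how such a preload library can work, the sketch below replaces malloc/free with hugepage-backed mmap allocations. It is a simplified example under stated assumptions (2 MiB hugepages, one header per allocation, no calloc/realloc overrides), not the library developed in the thesis.

```c
/* Sketch of a transparent hugepage allocator intended to be built as a
 * shared object and activated with LD_PRELOAD. Assumptions: 2 MiB
 * hugepages and per-allocation hugepage granularity; a real library would
 * also override calloc(), realloc() and posix_memalign(). */
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

#define HUGEPAGE_SIZE (2UL * 1024 * 1024)
#define HEADER_SIZE   16UL   /* keeps the returned pointer 16-byte aligned */

void *malloc(size_t size)
{
    /* Round the request (plus header) up to whole hugepages. */
    size_t total = (size + HEADER_SIZE + HUGEPAGE_SIZE - 1) & ~(HUGEPAGE_SIZE - 1);

    void *p = mmap(NULL, total, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED)          /* fall back to normal pages if no hugepages are left */
        p = mmap(NULL, total, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    *(size_t *)p = total;         /* remember the mapping size for free() */
    return (char *)p + HEADER_SIZE;
}

void free(void *ptr)
{
    if (!ptr)
        return;
    char *base = (char *)ptr - HEADER_SIZE;
    munmap(base, *(size_t *)base);
}
```

Built for example as a shared object (the file names here are hypothetical) with `gcc -shared -fPIC -o libhugealloc.so hugealloc.c` and run via `LD_PRELOAD=./libhugealloc.so ./app`, an unmodified application would then draw its communication buffers from hugepage-backed memory.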
|
44 |
Optimierte Implementierung ausgewählter kollektiver Operationen unter Ausnutzung der Hardwareparallelität des InfiniBand Netzwerkes. Franke, Maik, 30 April 2007 (has links)
The goal of this work is an optimized implementation of the reduction operations defined in the MPI-1 standard, MPI_Reduce(), MPI_Allreduce(), MPI_Scan() and MPI_Reduce_scatter(), for the InfiniBand network, with particular emphasis on special InfiniBand operations and on hardware parallelism.
InfiniBand makes it possible to separate communication operations cleanly from computation, which allows the two types of operations to overlap during a reduction. The potential of this method is to be assessed both analytically and in practice through a prototype implementation within the Open MPI framework, and the result is to be compared with existing implementations (e.g. MVAPICH). / The performance of collective communication operations is one of the deciding factors in the overall performance of an MPI application. Current MPI implementations access the InfiniBand network through their point-to-point components; this work therefore tries to improve performance with a collective component that accesses the InfiniBand network directly, avoiding that overhead and making it possible to tune the algorithms to this specific network. Various algorithms for the MPI_Reduce, MPI_Allreduce, MPI_Scan and MPI_Reduce_scatter operations are presented. The theoretical performance of the algorithms is analyzed with the LogfP and LogGP models. Selected algorithms are implemented as part of an Open MPI collective component. Finally, the performance of different algorithms and different MPI implementations is compared.
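For orientation on the kind of model-based estimate referred to above: under LogGP (latency L, per-message overhead o, gap per byte G, P processes), a binomial-tree reduction of an m-byte message with per-byte combine cost γ takes about ⌈log2 P⌉ rounds, each consisting of one message transfer plus the local reduction. This is a back-of-the-envelope sketch, not a formula quoted from the thesis:

```latex
T_{\mathrm{reduce}} \;\approx\; \lceil \log_2 P \rceil \,\bigl( 2o + L + (m-1)\,G + m\,\gamma \bigr)
```

Overlapping communication and computation, as the InfiniBand hardware permits, aims to hide part of the m·γ term behind the transfer of the next message segment.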
|
45 |
Evaluating and Improving the Performance of MPI-Allreduce on QLogic HTX/PCIe InfiniBand HCA. Mittenzwey, Nico, 31 March 2009 (has links)
This thesis analysed the QLogic InfiniPath QLE7140 HCA and its onload architecture and compared the results to the Mellanox InfiniHost III Lx HCA, which uses an offload architecture. As expected, the QLogic InfiniPath QLE7140 HCA can outperform the Mellanox InfiniHost III Lx HCA in terms of latency and bandwidth on our test system in various test scenarios. The benchmarks showed that sending messages with multiple threads in parallel can increase the bandwidth greatly, while bi-directional sends cut the effective bandwidth of one HCA by up to 30%. Different all-reduce algorithms were evaluated and compared with the help of the LogGP model. The comparison showed that new all-reduce algorithms can outperform the ones already implemented in Open MPI in different scenarios. The thesis also demonstrated that multicast algorithms for InfiniBand can be implemented easily by using the RDMA-CM API.
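To illustrate the API mentioned in that last point, the following sketch joins an InfiniBand multicast group through RDMA-CM (librdmacm). The group address 239.0.0.1 is a placeholder, error handling is reduced to exit-on-failure, and this is an assumed, illustrative flow rather than the code from the thesis.

```c
/* Minimal sketch of joining an InfiniBand multicast group with the RDMA-CM
 * API (librdmacm). For actual data transfer a UD queue pair would be
 * created with rdma_create_qp() before the join so it gets attached to the
 * group; this example only demonstrates the join/leave handshake. */
#include <stdio.h>
#include <stdlib.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <rdma/rdma_cma.h>

static void die(const char *msg) { perror(msg); exit(EXIT_FAILURE); }

int main(void)
{
    struct rdma_event_channel *ch;
    struct rdma_cm_id *id;
    struct rdma_cm_event *ev;
    struct sockaddr_in mcast = { .sin_family = AF_INET };

    inet_pton(AF_INET, "239.0.0.1", &mcast.sin_addr);  /* placeholder group */

    if (!(ch = rdma_create_event_channel()))
        die("rdma_create_event_channel");
    if (rdma_create_id(ch, &id, NULL, RDMA_PS_UDP))
        die("rdma_create_id");

    /* Resolve the multicast address to a local RDMA device and port. */
    if (rdma_resolve_addr(id, NULL, (struct sockaddr *)&mcast, 2000))
        die("rdma_resolve_addr");
    if (rdma_get_cm_event(ch, &ev) || ev->event != RDMA_CM_EVENT_ADDR_RESOLVED)
        die("address resolution");
    rdma_ack_cm_event(ev);

    /* Join the group and wait for the join confirmation event. */
    if (rdma_join_multicast(id, (struct sockaddr *)&mcast, NULL))
        die("rdma_join_multicast");
    if (rdma_get_cm_event(ch, &ev) || ev->event != RDMA_CM_EVENT_MULTICAST_JOIN)
        die("multicast join");
    printf("joined multicast group\n");
    rdma_ack_cm_event(ev);

    rdma_leave_multicast(id, (struct sockaddr *)&mcast);
    rdma_destroy_id(id);
    rdma_destroy_event_channel(ch);
    return 0;
}
```

Linked with -lrdmacm, this covers only the group membership part; a real multicast collective would additionally post receives on the attached UD QP and use the qkey delivered with the join event.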
|
46 |
Enabling Efficient Use of MPI and PGAS Programming Models on Heterogeneous Clusters with High Performance Interconnects. Potluri, Sreeram, 18 September 2014 (has links)
No description available.
|
47 |
Designing High Performance and Scalable Unified Communication Runtime (UCR) for HPC and Big Data Middleware. Jose, Jithin, 30 December 2014 (has links)
No description available.
|
48 |
High Performance Network I/O in Virtual Machines over Modern Interconnects. Huang, Wei, 12 September 2008 (has links)
No description available.
|
49 |
Large-Message Nonblocking Allgather and Broadcast Offload via BlueField-2 DPU. Sarkauskas, Nicholas Robert, 09 August 2022 (has links)
No description available.
|
50 |
Data services: bringing I/O processing to petascale. Abbasi, Mohammad Hasan, 08 July 2011 (has links)
The increasing size of high performance computing systems and the associated increase in the volume of generated data have resulted in an I/O bottleneck for these applications. This bottleneck is further exacerbated by the imbalance between the growth of processing capability and that of storage capability, due mainly to the power and cost requirements of scaling the storage. This thesis introduces data services, a new abstraction which provides significant benefits for data intensive applications. Data services combine low overhead data movement with flexible placement of data manipulation operations to address the I/O challenges of leadership class scientific applications. The impact of asynchronous data movement on application runtime is minimized by novel server-side data movement schedulers that avoid contention-related jitter in application communication. Additionally, the JITStager component is presented. Utilizing dynamic code generation and flexible code placement, the JITStager allows data services to be executed as a pipeline extending from the application to storage. It is shown in this thesis that data services can add new functionality to an application without having a significant negative impact on performance.
|