Global ETD Search

Return to search

Auto-tunable GPU BLAS

In this paper, we present our implementation of an Auto tuning system, written in C++, which incorporate the use of OpenCL kernels. We deploy this approach on different GPU architectures, evaluating the performance of the approach. Our main focus is to easily generate tuned code, that would otherwise require a large amount of empirical testing, and then run it on any kind of device. This is achieved through the auto tuning framework, which will create different kernels, compile and run them on the device and output the best performing kernel on the given platform.BLAS is much used in performance critical applications, and is a good candidate for execution on GPUs due to its potential performance increase. Our implementation was benchmarked on various of test environments, with different GPUs, where we achieved comparable results to the ViennaCL library. We also tested against the native vendor specific BLAS libraries from AMD and NVIDIA.

ntnudaim:5976

MIT informatikk

Komplekse datasystemer

Identifer	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:ntnu-18411
Date	January 2012
Creators	Lien, Geir Josten
Publisher	Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap, Institutt for datateknikk og informasjonsvitenskap
Source Sets	DiVA Archive at Upsalla University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess

Page generated in 0.0017 seconds

Auto-tunable GPU BLAS

Description

Links & Downloads

Tags

Additional Fields