<p>This thesis describes novel techniques and test implementations for optimizing numerically intensive codes. Our main focus is on how given algorithms can be adapted to run efficiently on modern microprocessors, exploiting several architectural features including instruction selection and access patterns related to having several levels of cache. Our approach is also shown to be relevant for multicore architectures. Our primary target applications are linear algebra routines in the form of matrix multiplication with dense matrices. We analyze how current compilers, microprocessors, and common optimization techniques (such as loop tiling and data relocation) interact. A tunable assembly code generator is developed, built, and tested on a basic BLAS level-3 routine to side-step some of the performance issues of modern compilers. Our generator has been tested on both the Intel Pentium 4 and Intel Core 2 processors. For the Pentium 4, a 10.8% speed-up is achieved over ATLAS's rank2k, and a 17% speed-up is achieved over MKL's implementation for 4000-by-4032 matrices. On the Core 2 we optimize our code for 2000-by-2000 matrices and achieve 24% and 5% speed-ups over ATLAS and MKL, respectively, with our multi-threaded implementation. Decent speed-ups are also shown for other matrix sizes. Considering that our implementation is far from fully tuned, we consider these results very respectable.</p>
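<p>To illustrate the loop tiling mentioned in the abstract, the following is a minimal C sketch of a cache-blocked dense matrix multiply. It is not the thesis's tuned assembly generator or its rank2k kernel; the block size TILE is an assumed, hypothetical tuning parameter that would in practice be matched to the cache hierarchy.</p>
<pre><code>#include &lt;stddef.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;stdio.h&gt;

#define TILE 64  /* hypothetical block size; would be tuned per cache level */

/* Cache-blocked (tiled) C = C + A*B for dense row-major n-by-n matrices.
 * The outer loops walk over TILE-by-TILE blocks so that the working set
 * of the inner loops stays resident in cache between reuses. */
static void matmul_tiled(size_t n, const double *A, const double *B, double *C)
{
    for (size_t ii = 0; ii &lt; n; ii += TILE)
        for (size_t kk = 0; kk &lt; n; kk += TILE)
            for (size_t jj = 0; jj &lt; n; jj += TILE) {
                size_t i_end = ii + TILE &lt; n ? ii + TILE : n;
                size_t k_end = kk + TILE &lt; n ? kk + TILE : n;
                size_t j_end = jj + TILE &lt; n ? jj + TILE : n;
                for (size_t i = ii; i &lt; i_end; ++i)
                    for (size_t k = kk; k &lt; k_end; ++k) {
                        double a = A[i * n + k];
                        /* Unit-stride access over B and C rows */
                        for (size_t j = jj; j &lt; j_end; ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
            }
}

int main(void)
{
    size_t n = 512;
    double *A = calloc(n * n, sizeof *A);
    double *B = calloc(n * n, sizeof *B);
    double *C = calloc(n * n, sizeof *C);
    if (!A || !B || !C) return 1;
    for (size_t i = 0; i &lt; n * n; ++i) { A[i] = 1.0; B[i] = 2.0; }
    matmul_tiled(n, A, B, C);
    printf("C[0] = %f\n", C[0]);  /* expect 1024.0 for n = 512 */
    free(A); free(B); free(C);
    return 0;
}
</code></pre>
<p>A production kernel, as discussed in the thesis, would additionally choose instruction sequences and data layouts per microarchitecture; the sketch above only shows the blocking idea itself.</p>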
Identifier | oai:union.ndltd.org:UPSALLA/oai:DiVA.org:ntnu-9827 |
Date | January 2009 |
Creators | Jensen, Rune Erlend |
Publisher | Norwegian University of Science and Technology, Department of Computer and Information Science, Institutt for datateknikk og informasjonsvitenskap |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, text |