Field Programmable Gate Array (FPGA) devices are increasing in capacity at an exponential rate, and thus there is an increasingly strong demand to accelerate simulated annealing placement. Graphics Processing Units (GPUs) offer a unique opportunity to accelerate this simulated annealing placement on a manycore architecture using only commodity hardware. GPUs are optimized for applications which can tolerate single-thread latency and so GPUs can provide high throughput across many threads. However simulated annealing is not embarrassingly parallel and so single thread latency should be minimized to improve run time. Thus it is questionable whether GPUs can achieve any speedup over a sequential implementation. In this thesis, a novel subset-based simulated annealing placement framework is proposed, which specifically targets the GPU architecture. A highly optimized framework is implemented which, on average, achieves an order of magnitude speedup with less than 1% degradation for wirelength and no loss in quality for timing on realistic architectures.
Identifer | oai:union.ndltd.org:TORONTO/oai:tspace.library.utoronto.ca:1807/25456 |
Date | 17 December 2010 |
Creators | Choong, Alexander |
Contributors | Zhu, Jianwen |
Source Sets | University of Toronto |
Language | en_ca |
Detected Language | English |
Type | Thesis |
Page generated in 0.0018 seconds