Due to their high potential performance and reduced energy and power consumption, field-programmable gate arrays (FPGAs) are widely used as accelerators for today’s computationally intensive applications. These applications use advanced algorithms more sophisticated than ever before. The high design complexity along with fast development process challenges the traditional FPGA design methodology using hardware description languages. High-level synthesis accelerates design implementation by raising the level of design abstraction beyond register transfer level. This dissertation work develops a highly energy-efficient FPGA high-level synthesis tool, ArchSyn, using an application-specific coarse-grain architecture as an intermediate synthesis step.
ArchSyn provides rapid and energy-efficient compilation of dataflow graphs (DFGs) on FPGAs by scheduling the dataflow operations on an array of directly connected simple configurable processing elements (CPEs). Each CPE in the array performs primitive compute operations according to a small local sequencer at each cycle. Data are communicated via multi-hop routing within the direct interconnect network. The scheduler schedules each compute operation of the DFG obtained from the high-level design to execute on a particular hardware CPE at a particular cycle. It also determines the communication schedule of the intermediate data among the producing and consuming CPEs, optionally buffering them with distributed memory along the path. As such, the lengthy process of synthesizing a full custom hardware design on FPGA is reduced to a scheduling and mapping process. By restricting the fine-grain programmability into a coarse grain processor network scheduling problem, the compilation time can be improved substantially, thereby improving the overall productivity of the designer.
Furthermore, taking advantage of the programmability of FPGAs, the effect of the array interconnect architecture on the energy-efficiency of the resulting system is studied. By altering the array configuration, the data communication scheme among the CPEs must also be changed. This has a net effect on both the energy consumption
spent on data movement as well as on the overall compute performance. It is shown that by using array topology that is customized to the input DFG, up to 28% improvement in energy-efficiency could be achieved. An exploratory framework based on a genetic algorithm was developed that allows us to obtain such application-specific connection network. Such degree of customization is possible only with the programmability of FPGAs. Moreover, such topology adaptation can be achieved rapidly as only routings between a fixed set of pre-placed CPEs are required.
Implementations using ArchSyn and an existing FPGA compilation tool xPilot were compared. ArchSyn gave a 2X better energy consumption and a 11X better energy-delay product for computation with very regular and simple data dependency. For computation with irregular data dependency, the energy consumption and energy-delay product improvement was 9.6X and 199X. / published_or_final_version / Electrical and Electronic Engineering / Doctoral / Doctor of Philosophy
Identifer | oai:union.ndltd.org:HKU/oai:hub.hku.hk:10722/181525 |
Date | January 2012 |
Creators | Lin, Yu, Colin., 林郁. |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Source Sets | Hong Kong University Theses |
Language | English |
Detected Language | English |
Type | PG_Thesis |
Source | http://hub.hku.hk/bib/B49799599 |
Rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works., Creative Commons: Attribution 3.0 Hong Kong License |
Relation | HKU Theses Online (HKUTO) |
Page generated in 0.0103 seconds