Return to search

A High-performance, Reconfigurable Architecture for Restricted Boltzmann Machines

Despite the popularity and success of neural networks in research, the number of resulting commercial or industrial applications have been limited. A primary cause of this lack of adoption is due to the fact that neural networks are usually implemented as software running on general-purpose processors. Hence, a hardware implementation that can take advantage of the inherent parallelism in neural networks is desired.

This thesis investigates how the Restricted Boltzmann machine, a popular type of neural network, can be effectively mapped to a high-performance hardware architecture on FPGA platforms. The proposed, modular framework is designed to reduce the time complexity of the computations through heavily customized hardware engines. The framework is tested on a platform of four Xilinx Virtex II-Pro XC2VP70 FPGAs running at 100MHz through a variety of different configurations. The maximum performance was obtained by instantiating a Restricted Boltzmann Machine of 256x256 nodes distributed across four FPGAs, which results in a computational speed of 3.13 billion connection-updates-per-second and a speed-up of 145-fold over an optimized C program running on a 2.8GHz Intel processor.

Identiferoai:union.ndltd.org:TORONTO/oai:tspace.library.utoronto.ca:1807/18805
Date15 February 2010
CreatorsLy, Daniel Le
ContributorsChow, Paul
Source SetsUniversity of Toronto
Languageen_ca
Detected LanguageEnglish
TypeThesis

Page generated in 0.002 seconds