Increased computation using parallel FPGA architectures.
Two ways to improve algorithm performance in hardware are increasing the speed of each operation and performing multiple operations simultaneously. However, the speed-up achieved by the latter depends not only on system constraints but also on design decisions. When multiple FPGAs are the implementation target, creating an optimal configuration requires the designer to be aware of many potential issues. A neural network inversion case study is presented to give future FPGA algorithm designers insight into the problems that can arise in parallel FPGA implementations. Initial work implements a large neural network on a single FPGA and finds its inversion via Particle Swarm Optimization. The algorithm is then partitioned and executed in parallel across multiple FPGAs using several strategies on various hardware and software architectures. Finally, the issues that arose during these implementations are discussed, along with some generalized guidelines.
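To make the core algorithm concrete, the following is a minimal software sketch of neural network inversion via Particle Swarm Optimization: a swarm searches the input space of a fixed, already-trained network for an input that reproduces a desired output. The tiny network, its random weights, and all PSO constants here are illustrative assumptions, not the architecture or parameters used in the paper's FPGA implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small trained network (one hidden layer, fixed random
# weights) standing in for the paper's large neural network.
W1 = rng.normal(size=(2, 8))
b1 = rng.normal(size=8)
W2 = rng.normal(size=(8, 1))
b2 = rng.normal(size=1)

def forward(x):
    """Feed-forward pass; x has shape (n_particles, 2)."""
    h = np.tanh(x @ W1 + b1)
    return np.tanh(h @ W2 + b2)

# Output we want to invert: generated from a known input so a solution exists.
target = forward(np.array([[0.3, -0.7]]))

def fitness(x):
    # Squared error between the network's output and the target output.
    return np.sum((forward(x) - target) ** 2, axis=1)

# Standard PSO over the network's input space.
n, dim, iters = 30, 2, 200
w, c1, c2 = 0.7, 1.5, 1.5          # inertia and acceleration constants
pos = rng.uniform(-1, 1, size=(n, dim))
vel = np.zeros((n, dim))
pbest, pbest_f = pos.copy(), fitness(pos)
gbest = pbest[np.argmin(pbest_f)]

for _ in range(iters):
    r1, r2 = rng.random((n, dim)), rng.random((n, dim))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    f = fitness(pos)
    improved = f < pbest_f
    pbest[improved], pbest_f[improved] = pos[improved], f[improved]
    gbest = pbest[np.argmin(pbest_f)]

residual = float(fitness(gbest[None, :])[0])
print(residual)  # error of the best inverted input found
```

Because each particle's fitness evaluation is an independent forward pass through the network, this inner loop is a natural candidate for the kind of parallel, multi-FPGA partitioning the paper explores.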