In this paper we present an FPGA-based dataflow architecture that both computes parallel algorithms efficiently using dedicated FPGA resources and scales well to multi-FPGA designs while the overall communication bandwidth increases. The basic idea is reconfiguration. In contrast to partially reconfiguring the FPGAs themselves, our approach connects the computational units via a dynamically variable topology. This topology consists of dedicated switches, each individually controlled by a simple shift register. Hence, the computational result is a function of the currently configured interconnection pattern, which can be updated within a single clock cycle. The scalability of this architecture is demonstrated on a high-performance parallel FFT.
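The idea of a switch controlled by a shift register can be illustrated with a small behavioural model. This is only a sketch under assumptions not stated in the abstract: the switch is a 2x2 crossbar, its control bit is the head of a circular shift register that advances by one position per clock cycle, and the computational unit is a plain subtractor. The names `ShiftRegister`, `switch2x2`, and `run` are hypothetical and do not come from the paper.

```python
from collections import deque


class ShiftRegister:
    """Circular shift register; the head bit is the active control bit (assumed model)."""

    def __init__(self, bits):
        self.bits = deque(bits)

    def head(self):
        return self.bits[0]

    def clock(self):
        # One clock cycle: shift left by one, wrapping the old head to the tail.
        self.bits.rotate(-1)


def switch2x2(a, b, cross):
    """Hypothetical 2x2 switch: pass inputs straight through, or swap them."""
    return (b, a) if cross else (a, b)


def run(a, b, pattern, cycles):
    """Feed (a, b) through the switch into a subtractor for a number of cycles."""
    reg = ShiftRegister(pattern)
    results = []
    for _ in range(cycles):
        x, y = switch2x2(a, b, reg.head())
        results.append(x - y)  # assumed computational unit: a subtractor
        reg.clock()            # the interconnection pattern changes in one cycle
    return results


print(run(5, 3, [0, 1, 1, 0], 4))  # prints [2, -2, -2, 2]
```

With the same inputs, the result alternates between `2` and `-2` depending only on the control bit at the head of the register, which is the sense in which the computed result is a function of the configured interconnection pattern.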
Sven-Ole Voigt, Malte Baesler, Stephanie Teufel
Sven-Ole Voigt, Stephanie Teufel
Gilles Sassatelli, Lionel Torres, Pascal Benoit, Gaston Cambon, Michel Robert, Jérôme Galy