Future high-performance computing systems will need to combine multiple specialized accelerators in a single heterogeneous system to overcome the power-density limits on CPU performance. To program such heterogeneous systems without maintaining multiple code bases, OpenMP device offloading constructs can be used to execute compute-intensive regions on different kinds of accelerators. In this work, we present a proof-of-concept implementation of OpenMP offloading for FPGA-based hardware accelerators. Our implementation integrates seamlessly with the existing LLVM offloading infrastructure and enables the user to move computations to a custom FPGA accelerator simply by adding OpenMP offloading directives to the input program.
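To illustrate what adding OpenMP offloading directives to an input program can look like, the following is a minimal sketch in C using only standard OpenMP 4.5 constructs (`target` and `map`). The function name `vector_add` and the choice of a vector addition kernel are assumptions made for illustration; they are not taken from the work described here, and the exact directives and compiler flags required by the FPGA flow may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example kernel (not from the original work):
 * element-wise addition of two vectors, c[i] = a[i] + b[i]. */
void vector_add(const float *a, const float *b, float *c, int n)
{
    assert(a != NULL && b != NULL && c != NULL);

    /* Standard OpenMP offloading: the loop is marked as a target region.
     * map(to: ...) copies the inputs to the device before execution;
     * map(from: ...) copies the result back to the host afterwards.
     * If the compiler has no OpenMP support, the pragma is ignored;
     * if no device is available at run time, the region typically
     * falls back to the host. Either way the loop computes the same
     * result. */
    #pragma omp target map(to: a[0:n], b[0:n]) map(from: c[0:n])
    for (int i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

The sequential loop body is unchanged; only the directive line is added. This is the property that lets a single code base serve the CPU and, under the assumptions above, an accelerator.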
D.M. Camarena Cabrera, Xavier Martorell, Georgi Gaydadjiev, Eduard Ayguadé, Daniel Jiménez-González
Marius Knaust, Florian Mayer, Thomas Steinke
A. Carrillo Álvarez, I. Ugarte, Víctor Fernández, Pablo Sánchez
Baodi Shan, Mauricio Araya-Polo, Johannes Doerfert, Barbara Chapman