2018 Volume 15 Issue 10 Pages 20180286
Large-scale floating-point matrix multiplication is widely used in many scientific and engineering applications. Most existing work focuses on designing a linear-array architecture to accelerate matrix multiplication on FPGAs. This paper extends that architecture by proposing a scalable and highly configurable multi-array architecture. In addition, we present a work-stealing scheme that balances the workload partitioned among the multiple linear arrays. Furthermore, an analytical model is developed to determine the optimal design parameters for matrix multiplication acceleration. Experiments on real-life convolutional neural networks (CNNs) show that the proposed approach achieves the optimal extension of the linear-array architecture.
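The work-stealing idea mentioned above can be illustrated in software. The sketch below is a minimal, hypothetical simulation (the abstract gives no implementation details): matrix-multiplication tiles with known per-tile costs are first split evenly across the linear arrays; an array that drains its own queue then steals a tile from the tail of the fullest queue, which evens out the finish times when tile costs are skewed. All function and variable names here are assumptions for illustration, not the authors' API.

```python
from collections import deque

def work_stealing_schedule(num_arrays, tiles, costs):
    """Simulate work stealing among linear arrays (illustrative sketch).

    tiles  -- list of tile ids to multiply
    costs  -- costs[t] is the (simulated) cycle cost of tile t
    Returns (assignment per array, finish time per array).
    """
    # Static round-robin partition as the starting point.
    queues = [deque() for _ in range(num_arrays)]
    for i, t in enumerate(tiles):
        queues[i % num_arrays].append(t)

    finish = [0] * num_arrays          # accumulated busy time per array
    done = [[] for _ in range(num_arrays)]
    remaining = len(tiles)

    while remaining:
        # The least-loaded array is the next one free to take a tile.
        a = min(range(num_arrays), key=lambda i: finish[i])
        if queues[a]:
            t = queues[a].popleft()            # take from own queue head
        else:
            # Own queue is empty: steal from the tail of the fullest queue.
            victim = max(range(num_arrays), key=lambda i: len(queues[i]))
            if not queues[victim]:
                break                          # nothing left anywhere
            t = queues[victim].pop()
        finish[a] += costs[t]
        done[a].append(t)
        remaining -= 1

    return done, finish
```

For example, with 8 tiles of costs `[5, 1, 1, 1, 1, 1, 1, 1]` on 2 arrays, a static round-robin split finishes in 8 time units (one array gets the expensive tile plus three others), whereas the stealing schedule above balances both arrays to a makespan of 6.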