To implement recommendation inference efficiently in the context of a real deployment, we design and implement an FPGA cluster optimizing the performance of ...
In this paper, we design and implement an FPGA cluster for recommendation inference to achieve high performance on both the embedding lookups and the FC layer ...
This work designs and implements an FPGA cluster optimizing the performance of both stages of recommendation inference, and takes advantage of the ...
This repository corresponds to the FPL'21 paper: Distributed Recommendation Inference on FPGA Clusters. The FPGAs are interconnected using a 100 Gbps network switch.
... Vector search is an essential component in retrieval-augmented language models and recommender systems. For recommender systems, previous work has already ...
Given the growth in data inputs and application complexity, it is often the case that a single hardware accelerator is not enough to solve a given problem.
Dec 18, 2023 · In this paper we propose ACCL+, an open-source versatile FPGA-based collective communication library. Portable across different platforms and supporting UDP, ...
Jul 12, 2024 · We aim to demonstrate that ACCL+ can facilitate distributing DLRM inference across FPGAs to accommodate larger embedding layers, as in many ...
Distributed recommendation inference on FPGA clusters. Y Zhu, Z He, W Jiang, K Zeng, J Zhou, G Alonso. 2021 31st International Conference on Field-Programmable ...