DOI: 10.1145/2435264.2435298
research-article

Area-efficient near-associative memories on FPGAs

Published: 11 February 2013 Publication History

Abstract

Associative memories can map sparsely used keys to values with low latency but can incur heavy area overheads. The lack of customized hardware for associative memories in today's mainstream FPGAs exacerbates the overhead cost of building these memories using the fixed address match BRAMs. In this paper, we develop a new, FPGA-friendly, memory architecture based on a multiple hash scheme that is able to achieve near-associative performance (less than 5% of evictions due to conflicts) without the area overheads of a fully associative memory on FPGAs. Using the proposed architecture as a 64KB L1 data cache, we show that it is able to achieve near-associative miss-rates while consuming 6-7× less FPGA memory resources for a set of benchmark programs from the SPEC2006 suite than fully associative memories generated by the Xilinx Coregen tool. Benefits increase with match width, allowing area reduction up to 100×. At the same time, the new architecture has lower latency than the fully associative memory -- 3.7 ns for a 1024-entry flat version or 6.1 ns for an area-efficient version compared to 8.8 ns for a fully associative memory for a 64b key.
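
As a rough illustration of the multiple-hash idea named in the abstract, the following Python sketch models a generic multi-bank, multiple-choice hash table in software. It is not the architecture proposed in the paper. The class name `MultiHashMemory`, its parameters, the salted BLAKE2b hash, and the round-robin eviction policy are all assumptions made for this example. It only shows why giving each key several candidate slots reduces evictions caused by conflicts rather than by the memory being full.

```python
import hashlib


class MultiHashMemory:
    """Software model of a k-bank, multi-hash key/value store (hypothetical)."""

    def __init__(self, banks=4, slots_per_bank=256):
        self.banks = banks
        self.slots = slots_per_bank
        # Each bank is a direct-mapped table holding (key, value) or None.
        self.table = [[None] * slots_per_bank for _ in range(banks)]
        self.conflict_evictions = 0
        self._victim = 0  # round-robin pointer: which candidate to evict next

    def _index(self, bank, key):
        # Assumed hash: BLAKE2b salted with the bank number, so every bank
        # maps the same key to a (most likely) different slot.
        digest = hashlib.blake2b(
            repr(key).encode(),
            digest_size=8,
            salt=bank.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.slots

    def lookup(self, key):
        # Hardware could probe all banks in parallel; software loops instead.
        for b in range(self.banks):
            entry = self.table[b][self._index(b, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def insert(self, key, value):
        candidates = [(b, self._index(b, key)) for b in range(self.banks)]
        # Update in place if the key is already stored.
        for b, i in candidates:
            entry = self.table[b][i]
            if entry is not None and entry[0] == key:
                self.table[b][i] = (key, value)
                return
        # Otherwise take the first empty candidate slot.
        for b, i in candidates:
            if self.table[b][i] is None:
                self.table[b][i] = (key, value)
                return
        # Every candidate is occupied: evict one, even though the memory as a
        # whole may still have free slots. This is a conflict eviction.
        b, i = candidates[self._victim]
        self._victim = (self._victim + 1) % self.banks
        self.table[b][i] = (key, value)
        self.conflict_evictions += 1


mem = MultiHashMemory(banks=4, slots_per_bank=256)
for k in range(512):  # fill the 1024-slot model to half capacity
    mem.insert(k * 0x9E3779B1, k)
print(mem.conflict_evictions, "conflict evictions out of 512 inserts")
```

Setting `banks=1` turns the model into a plain direct-mapped table, which gives a simple baseline for comparing conflict-eviction counts under the same key stream.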

Supplementary Material

ZIP File (fpga156.zip)
Contains the Appendix to the paper and source code distribution for the designs described in the paper

Published In

FPGA '13: Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
February 2013
294 pages
ISBN:9781450318877
DOI:10.1145/2435264
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. BRAM
  2. CAM
  3. FPGA
  4. associative memory
  5. cache
  6. hashing

Qualifiers

  • Research-article

Conference

FPGA '13

Acceptance Rates

Overall Acceptance Rate 125 of 627 submissions, 20%

Cited By

  • (2021) SCALPEL: Exploring the Limits of Tag-enforced Compartmentalization. ACM Journal on Emerging Technologies in Computing Systems 18:1 (1-28). DOI: 10.1145/3461673. Online publication date: 29-Sep-2021
  • (2017) FPGA implementation of logarithmic versions of Baum-Welch and Viterbi algorithms for reduced precision hidden Markov models. Bulletin of the Polish Academy of Sciences Technical Sciences 65:6 (935-947). DOI: 10.1515/bpasts-2017-0101. Online publication date: 1-Dec-2017
  • (2017) SMEFF: A scalable memory extension fabric for FPGA. 2017 International Conference on Field Programmable Technology (ICFPT) (40-47). DOI: 10.1109/FPT.2017.8280119. Online publication date: Dec-2017
  • (2017) Comparison and evaluation of cache parameters for softcores on FPGAs. 2017 International Conference on FPGA Reconfiguration for General-Purpose Computing (FPGA4GPC) (19-24). DOI: 10.1109/FPGA4GPC.2017.8008961. Online publication date: May-2017
  • (2016) A verified information-flow architecture. Journal of Computer Security 24:6 (689-734). DOI: 10.3233/JCS-15784. Online publication date: 1-Dec-2016
  • (2016) Compact Associative Memory for AER Spike Decoding in FPGA-Based Evolvable SNN Emulation. Artificial Neural Networks and Machine Learning – ICANN 2016 (399-407). DOI: 10.1007/978-3-319-44778-0_47. Online publication date: 13-Aug-2016
  • (2015) Architectural Support for Software-Defined Metadata Processing. ACM SIGARCH Computer Architecture News 43:1 (487-502). DOI: 10.1145/2786763.2694383. Online publication date: 14-Mar-2015
  • (2015) Architectural Support for Software-Defined Metadata Processing. ACM SIGPLAN Notices 50:4 (487-502). DOI: 10.1145/2775054.2694383. Online publication date: 14-Mar-2015
  • (2015) Architectural Support for Software-Defined Metadata Processing. Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems (487-502). DOI: 10.1145/2694344.2694383. Online publication date: 14-Mar-2015
  • (2015) Area-Efficient Near-Associative Memories on FPGAs. ACM Transactions on Reconfigurable Technology and Systems 7:4 (1-22). DOI: 10.1145/2629471. Online publication date: 23-Jan-2015
