
Area-efficient near-associative memories on FPGAs

Authors:

André DeHon

FPGA '13: Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays

Pages 191 - 200

https://doi.org/10.1145/2435264.2435298

Published: 11 February 2013

Abstract

Associative memories can map sparsely used keys to values with low latency but can incur heavy area overheads. The lack of customized hardware for associative memories in today's mainstream FPGAs exacerbates the overhead cost of building these memories using the fixed address match BRAMs. In this paper, we develop a new, FPGA-friendly, memory architecture based on a multiple hash scheme that is able to achieve near-associative performance (less than 5% of evictions due to conflicts) without the area overheads of a fully associative memory on FPGAs. Using the proposed architecture as a 64KB L1 data cache, we show that it is able to achieve near-associative miss-rates while consuming 6-7× less FPGA memory resources for a set of benchmark programs from the SPEC2006 suite than fully associative memories generated by the Xilinx Coregen tool. Benefits increase with match width, allowing area reduction up to 100×. At the same time, the new architecture has lower latency than the fully associative memory -- 3.7 ns for a 1024-entry flat version or 6.1 ns for an area-efficient version compared to 8.8 ns for a fully associative memory for a 64b key.
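To make the idea of a multiple hash scheme concrete, the following Python sketch models a generic d-choice hashed memory in software. It is not the architecture from the paper: the class name `MultiHashMemory`, the salted BLAKE2b hash, the table sizes, and the "evict from the first table" policy are all assumptions chosen for illustration. It only shows how several independently hashed, direct-mapped tables can stand in for one associative lookup, and how evictions caused by conflicts can be counted.

```python
import hashlib


class MultiHashMemory:
    """Toy d-choice hashed key-value store (illustrative, not the paper's design)."""

    def __init__(self, num_tables=4, slots_per_table=256):
        self.num_tables = num_tables
        self.slots_per_table = slots_per_table
        # Each table holds (key, value) pairs or None, like a direct-mapped RAM.
        self.tables = [[None] * slots_per_table for _ in range(num_tables)]
        self.conflict_evictions = 0

    def _index(self, table_id, key):
        # Hypothetical hash function: salt the key with the table number.
        data = f"{table_id}:{key}".encode()
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.slots_per_table

    def lookup(self, key):
        # Probe one slot per table; hardware could issue these probes in parallel.
        for t in range(self.num_tables):
            entry = self.tables[t][self._index(t, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def insert(self, key, value):
        slots = [(t, self._index(t, key)) for t in range(self.num_tables)]
        # Update in place if the key is already stored.
        for t, i in slots:
            entry = self.tables[t][i]
            if entry is not None and entry[0] == key:
                self.tables[t][i] = (key, value)
                return
        # Otherwise use the first empty candidate slot.
        for t, i in slots:
            if self.tables[t][i] is None:
                self.tables[t][i] = (key, value)
                return
        # Every candidate slot is occupied: overwrite one and count the conflict.
        t, i = slots[0]
        self.tables[t][i] = (key, value)
        self.conflict_evictions += 1


mem = MultiHashMemory(num_tables=4, slots_per_table=256)
for k in range(512):
    mem.insert(k * 7919, k)
print("conflict evictions:", mem.conflict_evictions)
```

With more tables, a key has more candidate slots, so fewer inserts end in a conflict eviction; the fraction of such evictions is the kind of metric the abstract refers to when it quotes "less than 5% of evictions due to conflicts".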

Supplementary Material

ZIP File (fpga156.zip, 198.27 KB)

Contains the Appendix to the paper and the source code distribution for the designs described in the paper.



Index Terms

Area-efficient near-associative memories on FPGAs
1. Hardware
  1. Very large scale integration design
    1. Application-specific VLSI designs
2. Information systems
  1. Information storage systems
    1. Record storage systems
      1. Record storage alternatives
        Hashed file organization


Published In

FPGA '13: Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays

February 2013

294 pages

ISBN: 9781450318877

DOI: 10.1145/2435264

General Chair: Brad Hutchings, Brigham Young University, USA

Program Chair: Vaughn Betz, University of Toronto, Canada

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGDA: ACM Special Interest Group on Design Automation

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article

Conference

FPGA '13

Sponsor:

SIGDA

FPGA '13: The 2013 ACM/SIGDA International Symposium on Field Programmable Gate Arrays

February 11 - 13, 2013

Monterey, California, USA

Acceptance Rates

Overall Acceptance Rate 125 of 627 submissions, 20%

Bibliometrics

Total Citations: 16

Total Downloads: 275

Downloads (last 12 months): 4

Downloads (last 6 weeks): 0

Reflects downloads up to 16 Oct 2024

Cited By

Roessler N, DeHon A (2021). SCALPEL: Exploring the Limits of Tag-enforced Compartmentalization. ACM Journal on Emerging Technologies in Computing Systems 18:1 (1-28). Online publication date: 29-Sep-2021.
https://dl.acm.org/doi/10.1145/3461673
Pietras M, Klęsk P (2017). FPGA implementation of logarithmic versions of Baum-Welch and Viterbi algorithms for reduced precision hidden Markov models. Bulletin of the Polish Academy of Sciences Technical Sciences 65:6 (935-947). Online publication date: 1-Dec-2017.
https://doi.org/10.1515/bpasts-2017-0101
Li W, Zhao Y, Liu Y, Chen M (2017). SMEFF: A scalable memory extension fabric for FPGA. 2017 International Conference on Field Programmable Technology (ICFPT) (40-47). Online publication date: Dec-2017.
https://doi.org/10.1109/FPT.2017.8280119
Eckert M, Meyer D, Klauer B, Haase J (2017). Comparison and evaluation of cache parameters for softcores on FPGAs. 2017 International Conference on FPGA Reconfiguration for General-Purpose Computing (FPGA4GPC) (19-24). Online publication date: May-2017.
https://doi.org/10.1109/FPGA4GPC.2017.8008961
Azevedo de Amorim A, Collins N, DeHon A, Demange D, Hriţcu C, Pichardie D, Pierce B, Pollack R, Tolmach A (2016). A verified information-flow architecture. Journal of Computer Security 24:6 (689-734). Online publication date: 1-Dec-2016.
https://doi.org/10.3233/JCS-15784
Zapata M, Madrenas J (2016). Compact Associative Memory for AER Spike Decoding in FPGA-Based Evolvable SNN Emulation. Artificial Neural Networks and Machine Learning – ICANN 2016 (399-407). Online publication date: 13-Aug-2016.
https://doi.org/10.1007/978-3-319-44778-0_47
Dhawan U, Hritcu C, Rubin R, Vasilakis N, Chiricescu S, Smith J, Knight T, Pierce B, DeHon A (2015). Architectural Support for Software-Defined Metadata Processing. ACM SIGARCH Computer Architecture News 43:1 (487-502). Online publication date: 14-Mar-2015.
https://dl.acm.org/doi/10.1145/2786763.2694383
Dhawan U, Hritcu C, Rubin R, Vasilakis N, Chiricescu S, Smith J, Knight T, Pierce B, DeHon A (2015). Architectural Support for Software-Defined Metadata Processing. ACM SIGPLAN Notices 50:4 (487-502). Online publication date: 14-Mar-2015.
https://dl.acm.org/doi/10.1145/2775054.2694383
Dhawan U, Hritcu C, Rubin R, Vasilakis N, Chiricescu S, Smith J, Knight T, Pierce B, DeHon A, Ozturk O, Ebcioglu K, Dwarkadas S (2015). Architectural Support for Software-Defined Metadata Processing. Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems (487-502). Online publication date: 14-Mar-2015.
https://dl.acm.org/doi/10.1145/2694344.2694383
Dhawan U, DeHon A (2015). Area-Efficient Near-Associative Memories on FPGAs. ACM Transactions on Reconfigurable Technology and Systems 7:4 (1-22). Online publication date: 23-Jan-2015.
https://dl.acm.org/doi/10.1145/2629471
