DOI: 10.5555/1870926.1870952
research-article

An efficient distributed memory interface for many-core platform with 3D stacked DRAM

Published: 08 March 2010

Abstract

Historically, processor performance has increased at a much faster rate than that of main memory, and upcoming NoC-based many-core architectures are further tightening the memory bottleneck. 3D integration based on TSV technology may provide a solution, as it enables stacking of multiple memory layers, with orders-of-magnitude increase in memory interface bandwidth, speed and energy efficiency. To fully exploit this potential, the architectural interface to vertically stacked memory must be streamlined. In this paper we present an efficient and flexible distributed memory interface for 3D-stacked DRAM. Our interface ensures ultra-low-latency access to the memory modules on top of each processing element (vertically local memory neighborhoods). Communication with these local modules does not travel through the NoC and takes full advantage of the lower latency of the vertical interconnect, thus significantly speeding up the common case. The interface still supports a convenient global address space abstraction, in which remote accesses incur higher latency because they traverse the slower horizontal interconnect. Experimental results demonstrate significant bandwidth improvement that ranges from 1.44x to 7.40x as compared to the JEDEC standard, with peaks of 4.53 GB/s for direct memory access and 850 MB/s for remote access through the NoC.
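The split between vertically local and remote accesses under one global address space can be illustrated with a minimal sketch. It is not the paper's implementation: the 4x4 mesh, the slice size, the contiguous slice-per-PE address layout, the XY routing assumption, and the names `Route` and `decode` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# All sizes below are illustrative assumptions, not values from the paper.
MESH_COLS = 4                     # hypothetical 4x4 mesh of processing elements
MESH_ROWS = 4
SLICE_BYTES = 64 * 1024 * 1024    # hypothetical DRAM slice stacked on each PE


@dataclass(frozen=True)
class Route:
    owner: int      # index of the PE whose stacked slice holds the address
    offset: int     # byte offset inside that slice
    local: bool     # True: vertical link only, the NoC is bypassed
    noc_hops: int   # horizontal hops for a remote access (0 when local)


def decode(addr: int, requester: int) -> Route:
    """Map a global address to a slice and pick the vertical or NoC path."""
    total = MESH_COLS * MESH_ROWS * SLICE_BYTES
    if not 0 <= addr < total:
        raise ValueError("address outside the global address space")
    # Assumed layout: each PE owns one contiguous slice of the address space.
    owner, offset = divmod(addr, SLICE_BYTES)
    if owner == requester:
        return Route(owner, offset, True, 0)
    # Remote access: assume dimension-ordered (XY) routing, so the hop
    # count is the Manhattan distance between the two mesh positions.
    rx, ry = requester % MESH_COLS, requester // MESH_COLS
    ox, oy = owner % MESH_COLS, owner // MESH_COLS
    return Route(owner, offset, False, abs(rx - ox) + abs(ry - oy))
```

Under these assumptions, `decode(0, requester=0)` resolves to the requester's own stacked slice and never enters the NoC, while an address owned by another PE is routed horizontally and reports how many mesh hops it would take.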

Information

Published In

DATE '10: Proceedings of the Conference on Design, Automation and Test in Europe
March 2010
1868 pages
ISBN:9783981080162

Sponsors

  • EDAA: European Design Automation Association
  • ECSI
  • EDAC: Electronic Design Automation Consortium
  • SIGDA: ACM Special Interest Group on Design Automation
  • The IEEE Computer Society TTTC
  • The IEEE Computer Society DATC
  • The Russian Academy of Sciences

Publisher

European Design and Automation Association

Leuven, Belgium

Qualifiers

  • Research-article

Conference

DATE '10
Sponsor:
  • EDAA
  • EDAC
  • SIGDA
  • The Russian Academy of Sciences
DATE '10: Design, Automation and Test in Europe
March 8 - 12, 2010
Dresden, Germany

Acceptance Rates

Overall Acceptance Rate 518 of 1,794 submissions, 29%

Cited By

  • (2018) A load balancing technique for memory channels. Proceedings of the International Symposium on Memory Systems, pp. 55-66. DOI: 10.1145/3240302.3240306. Online publication date: 1-Oct-2018
  • (2015) A Real-Time Multichannel Memory Controller and Optimal Mapping of Memory Clients to Memory Channels. ACM Transactions on Embedded Computing Systems 14:2, pp. 1-27. DOI: 10.1145/2661635. Online publication date: 17-Feb-2015
  • (2014) Direct distributed memory access for CMPs. Journal of Parallel and Distributed Computing 74:2, pp. 2109-2122. DOI: 10.1016/j.jpdc.2013.11.004. Online publication date: 1-Feb-2014
  • (2013) Architecture and optimal configuration of a real-time multi-channel memory controller. Proceedings of the Conference on Design, Automation and Test in Europe, pp. 1307-1312. DOI: 10.5555/2485288.2485602. Online publication date: 18-Mar-2013
  • (2013) TLM modelling of 3D stacked wide I/O DRAM subsystems. Proceedings of the 2013 Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools, pp. 1-6. DOI: 10.1145/2432516.2432521. Online publication date: 21-Jan-2013
  • (2012) An energy efficient DRAM subsystem for 3D integrated SoCs. Proceedings of the Conference on Design, Automation and Test in Europe, pp. 1138-1141. DOI: 10.5555/2492708.2492989. Online publication date: 12-Mar-2012
  • (2012) Distributed memory interface synthesis for network-on-chips with 3D-stacked DRAMs. Proceedings of the International Conference on Computer-Aided Design, pp. 458-465. DOI: 10.1145/2429384.2429479. Online publication date: 5-Nov-2012
  • (2012) Asymmetric DRAM synthesis for heterogeneous chip multiprocessors in 3D-stacked architecture. Proceedings of the International Conference on Computer-Aided Design, pp. 73-80. DOI: 10.1145/2429384.2429399. Online publication date: 5-Nov-2012
  • (2012) A distributed interleaving scheme for efficient access to WideIO DRAM memory. Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis, pp. 103-112. DOI: 10.1145/2380445.2380467. Online publication date: 7-Oct-2012
  • (2011) Cluster-based topologies for 3D stacked architectures. Proceedings of the 8th ACM International Conference on Computing Frontiers, pp. 1-3. DOI: 10.1145/2016604.2016621. Online publication date: 3-May-2011
