DOI: 10.1145/1088149.1088167

Lightweight reference affinity analysis

Published: 20 June 2005

Abstract

Previous studies have shown that array regrouping and structure splitting significantly improve data locality. The most effective technique relies on profiling every access to every data element. The high overhead impedes its adoption in a general compiler. In this paper, we show that for array regrouping in scientific programs, the overhead is not needed since the same benefit can be obtained by pure program analysis.

We present an interprocedural analysis technique for array regrouping. For each global array, the analysis summarizes the access pattern by access-frequency vectors and then groups arrays with similar vectors. The analysis is context sensitive, so it tracks the exact array access. For each loop or function call, it uses two methods to estimate the frequency of the execution. The first is symbolic analysis in the compiler. The second is lightweight profiling of the code. The same interprocedural analysis is used to accumulate the overall execution frequency by considering the calling context. We implemented a prototype of both the compiler and the profiling analysis in the IBM® compiler, evaluated array regrouping on the entire set of SPEC CPU2000 FORTRAN benchmarks, and compared different analysis methods. The pure compiler-based array regrouping improves the performance for the majority of programs, leaving little room for improvement by code or data profiling.
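To make the idea of grouping arrays by access-frequency vectors concrete, the following is a minimal sketch, not the analysis from the paper. It assumes each array already has a vector of estimated access counts, one entry per loop, and it groups arrays greedily when a relative Manhattan distance falls under a threshold. The distance measure, the threshold value, the function names, and the sample data are all assumptions made for illustration.

```python
def relative_distance(a, b):
    """Manhattan distance between two frequency vectors, scaled by the
    larger total access count (an assumed similarity measure)."""
    scale = max(sum(a), sum(b))
    if scale == 0:
        return 0.0  # two arrays that are never accessed look identical
    return sum(abs(x - y) for x, y in zip(a, b)) / scale


def group_arrays(freq, threshold=0.1):
    """Greedily place each array into the first group in which every
    member's vector is within `threshold`; otherwise start a new group."""
    groups = []
    for name in sorted(freq):
        for group in groups:
            if all(relative_distance(freq[name], freq[m]) <= threshold
                   for m in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical access-frequency vectors: one entry per loop in the program.
freq = {
    "a": [1000, 0, 50],
    "b": [1000, 0, 50],
    "c": [0, 500, 0],
    "d": [990, 0, 60],
}

groups = group_arrays(freq)
print(groups)
```

In this sketch, arrays `a`, `b`, and `d` are accessed at nearly the same rate in the same loops and end up in one group, which would make them candidates for interleaving into a single array, while `c` stays on its own.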




    Published In

    ICS '05: Proceedings of the 19th annual international conference on Supercomputing
    June 2005
    414 pages
    ISBN:1595931678
    DOI:10.1145/1088149


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. affinity
    2. compiler
    3. data interleaving
    4. data regrouping
    5. frequency
    6. memory optimization

    Qualifiers

    • Article

    Conference

    ICS05: International Conference on Supercomputing 2005
    June 20 - 22, 2005
    Cambridge, Massachusetts

    Acceptance Rates

    Overall Acceptance Rate 629 of 2,180 submissions, 29%


