DOI: 10.1109/ICSE.2017.44
research-article

Learning syntactic program transformations from examples

Published: 20 May 2017

Abstract

Automatic program transformation tools can help programmers with refactoring tasks, and can help Computer Science students in the form of tutoring systems that suggest repairs to programming assignments. However, manually creating catalogs of transformations is complex and time-consuming. In this paper, we present Refazer, a technique for automatically learning program transformations. Refazer builds on the observation that code edits performed by developers can be used as input-output examples for learning program transformations. Example edits may share the same structure but involve different variables and subexpressions, which must be generalized in a transformation at the right level of abstraction. To learn transformations, Refazer leverages state-of-the-art programming-by-example methodology using the following key components: (a) a novel domain-specific language (DSL) for describing program transformations, (b) domain-specific deductive algorithms for efficiently synthesizing transformations in the DSL, and (c) functions for ranking the synthesized transformations.
We instantiate and evaluate Refazer in two domains. First, given examples of code edits used by students to fix incorrect programming assignment submissions, we learn program transformations that can fix other students' submissions with similar faults. In our evaluation conducted on 4 programming tasks performed by 720 students, our technique helped to fix incorrect submissions for 87% of the students. In the second domain, we use repetitive code edits applied by developers to the same project to synthesize a program transformation that applies these edits to other locations in the code. In our evaluation conducted on 56 scenarios of repetitive edits taken from three large C# open-source projects, Refazer learns the intended program transformation in 84% of the cases using only 2.9 examples on average.
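
The idea that example edits act as input-output examples, and that their differing subexpressions must be generalized, can be illustrated with a toy sketch. The code below is not Refazer's DSL, its deductive synthesis algorithms, or its ranking functions. It swaps in plain anti-unification over tuple-encoded syntax trees, and every name in it (`Hole`, `anti_unify`, `learn_rule`, `rewrite`, the tuple encoding itself) is a hypothetical choice made for illustration only.

```python
from itertools import count


class Hole:
    """Placeholder for a subtree that differs between the example edits."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"?{self.name}"


def anti_unify(a, b, table, fresh):
    """Return the most specific tree that generalizes both a and b."""
    if a == b:
        return a
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        return tuple(anti_unify(x, y, table, fresh) for x, y in zip(a, b))
    # The same pair of differing subtrees always maps to the same hole,
    # which links holes in the "before" pattern to the "after" template.
    if (a, b) not in table:
        table[(a, b)] = Hole(f"h{next(fresh)}")
    return table[(a, b)]


def learn_rule(examples):
    """Generalize (before, after) pairs into a (pattern, template) rule."""
    fresh = count()
    pattern, template = examples[0]
    for before, after in examples[1:]:
        table = {}  # shared by both sides so that holes correspond
        pattern = anti_unify(pattern, before, table, fresh)
        template = anti_unify(template, after, table, fresh)
    return pattern, template


def match(pattern, tree, env):
    """Match tree against pattern, recording hole bindings in env."""
    if isinstance(pattern, Hole):
        if pattern.name in env:
            return env[pattern.name] == tree
        env[pattern.name] = tree
        return True
    if isinstance(pattern, tuple):
        return (isinstance(tree, tuple) and len(pattern) == len(tree)
                and all(match(p, t, env) for p, t in zip(pattern, tree)))
    return pattern == tree


def instantiate(template, env):
    """Fill the holes of template with the subtrees bound in env."""
    if isinstance(template, Hole):
        return env[template.name]  # assumes the hole was bound by the pattern
    if isinstance(template, tuple):
        return tuple(instantiate(t, env) for t in template)
    return template


def rewrite(tree, pattern, template):
    """Apply the rule at the outermost matching positions of tree."""
    env = {}
    if match(pattern, tree, env):
        return instantiate(template, env)
    if isinstance(tree, tuple):
        return tuple(rewrite(t, pattern, template) for t in tree)
    return tree


# Two example edits with the same structure but different subexpressions:
#   x == None    ->  x is None
#   y.z == None  ->  y.z is None
NONE = ("const", None)
examples = [
    (("cmp", "==", ("name", "x"), NONE),
     ("cmp", "is", ("name", "x"), NONE)),
    (("cmp", "==", ("attr", ("name", "y"), "z"), NONE),
     ("cmp", "is", ("attr", ("name", "y"), "z"), NONE)),
]
pattern, template = learn_rule(examples)

# Apply the learned rule at another location: if foo(a) == None: return 0
target = ("if",
          ("cmp", "==", ("call", "foo", ("name", "a")), NONE),
          ("return", ("num", 0)))
result = rewrite(target, pattern, template)
print(pattern, "->", template)
print(result)
```

With a single example the learned rule stays fully concrete, and with two the differing operand becomes a hole. This mirrors, in a very reduced form, why the abstract stresses generalizing "at the right level of abstraction": too few examples leave the rule over-specific, while a hole that is too broad would match unintended code.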


Published In

ICSE '17: Proceedings of the 39th International Conference on Software Engineering
May 2017
816 pages
ISBN: 9781538638682

Publisher

IEEE Press

Author Tags

  1. program synthesis
  2. program transformation
  3. refactoring
  4. tutoring systems

Conference

ICSE '17
Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%
