SerumRNN: Step by Step Audio VST Effect Programming

  • Conference paper
Artificial Intelligence in Music, Sound, Art and Design (EvoMUSART 2021)

Abstract

Learning to program an audio production VST synthesizer is a time-consuming process that usually proceeds through inefficient trial and error and is mastered only after years of experience. As an educational and creative tool for sound designers, we propose SerumRNN: a system that provides step-by-step instructions for applying audio effects to change a user’s input audio towards a desired sound. We apply our system to Xfer Records Serum, currently one of the most popular and complex VST synthesizers used by the audio production community. Our results indicate that SerumRNN is consistently able to provide useful feedback for a variety of different audio effects and synthesizer presets. We demonstrate the benefits of using an iterative system and show that SerumRNN learns to prioritize effects and can discover more efficient effect order sequences than a variety of baselines.

Author information

Corresponding author

Correspondence to Christopher Mitcheltree.

A Appendix

See Tables 7, 8 and 9.

Table 7. Effect selection models prediction accuracy (Advanced Shapes and Basic Shapes preset groups).
Table 8. Evaluation metrics for SerumRNN (Advanced Shapes preset group).
Table 9. Evaluation metrics for SerumRNN (Basic Shapes preset group).

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Mitcheltree, C., Koike, H. (2021). SerumRNN: Step by Step Audio VST Effect Programming. In: Romero, J., Martins, T., Rodríguez-Fernández, N. (eds) Artificial Intelligence in Music, Sound, Art and Design. EvoMUSART 2021. Lecture Notes in Computer Science, vol 12693. Springer, Cham. https://doi.org/10.1007/978-3-030-72914-1_15

  • DOI: https://doi.org/10.1007/978-3-030-72914-1_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-72913-4

  • Online ISBN: 978-3-030-72914-1

  • eBook Packages: Computer Science, Computer Science (R0)
