SerumRNN: Step by Step Audio VST Effect Programming

Mitcheltree, Christopher; Koike, Hideki

doi:10.1007/978-3-030-72914-1_15

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12693)

Included in the following conference series:

International Conference on Computational Intelligence in Music, Sound, Art and Design (Part of EvoStar)

Abstract

Learning to program an audio production VST synthesizer is a time-consuming process, usually obtained through inefficient trial and error and only mastered after years of experience. As an educational and creative tool for sound designers, we propose SerumRNN: a system that provides step-by-step instructions for applying audio effects to change a user’s input audio towards a desired sound. We apply our system to Xfer Records Serum: currently one of the most popular and complex VST synthesizers used by the audio production community. Our results indicate that SerumRNN is consistently able to provide useful feedback for a variety of different audio effects and synthesizer presets. We demonstrate the benefits of using an iterative system and show that SerumRNN learns to prioritize effects and can discover more efficient effect order sequences than a variety of baselines.
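
To make the idea of step-by-step effect instructions concrete, the following is a minimal sketch of an iterative loop that, at each step, picks the one remaining effect that moves the current audio closest to the target. It is an illustrative greedy search under stated assumptions, not the SerumRNN model: the "effects" are toy functions on a short list of samples rather than Serum effects, and the names `greedy_effect_steps`, `distance`, `gain`, `clip` and `invert` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Signal = List[float]
Effect = Callable[[Signal], Signal]


def distance(a: Signal, b: Signal) -> float:
    """Mean absolute error between two equal-length signals (toy metric)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def greedy_effect_steps(
    source: Signal,
    target: Signal,
    effects: Dict[str, Effect],
    max_steps: int = 5,
) -> Tuple[List[Tuple[str, float]], Signal]:
    """Apply at most one effect per step, each effect at most once.

    Returns the ordered list of (effect name, distance after applying it)
    and the final signal. Stops when no remaining effect reduces the distance.
    """
    current = list(source)
    remaining = dict(effects)
    steps: List[Tuple[str, float]] = []
    for _ in range(max_steps):
        best_name, best_out = None, None
        best_dist = distance(current, target)
        for name, fx in remaining.items():
            out = fx(current)
            d = distance(out, target)
            if d < best_dist:  # keep only strict improvements
                best_name, best_out, best_dist = name, out, d
        if best_name is None:
            break  # no effect helps any further
        steps.append((best_name, best_dist))
        current = best_out
        del remaining[best_name]
    return steps, current


# Toy stand-ins for audio effects (assumptions, not Serum's effects).
def gain(x: Signal) -> Signal:
    return [1.5 * v for v in x]


def clip(x: Signal) -> Signal:
    return [max(-1.0, min(1.0, v)) for v in x]


def invert(x: Signal) -> Signal:
    return [-v for v in x]


effects = {"gain": gain, "clip": clip, "invert": invert}
source = [0.2, 0.6, -0.9]
target = clip(gain(source))  # the "desired sound" in this toy setup

steps, final = greedy_effect_steps(source, target, effects)
print([name for name, _ in steps])  # prints ['gain', 'clip']
```

In this toy case the loop recovers the order in which the target was built and leaves `invert` unused, which loosely mirrors the abstract's point about prioritizing effects. A greedy rule like this only accepts steps that immediately reduce the distance, so it can get stuck where an effect helps only after another has been applied; it is best read as a simple baseline-style illustration.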

Author information

Authors and Affiliations

Tokyo Institute of Technology, Tokyo, 152-8550, Japan
Christopher Mitcheltree & Hideki Koike
Qosmo Inc., Tokyo, 153-0051, Japan
Christopher Mitcheltree

Corresponding author

Correspondence to Christopher Mitcheltree.

Editor information

Editors and Affiliations

University of A Coruña, A Coruña, Spain
Juan Romero
University of Coimbra, Coimbra, Portugal
Tiago Martins
University of A Coruña, A Coruña, Spain
Nereida Rodríguez-Fernández

A Appendix

See Tables 7, 8 and 9.

Table 7. Effect selection models prediction accuracy (Advanced Shapes and Basic Shapes preset groups).

Table 8. Evaluation metrics for SerumRNN (Advanced Shapes preset group).

Table 9. Evaluation metrics for SerumRNN (Basic Shapes preset group).

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Mitcheltree, C., Koike, H. (2021). SerumRNN: Step by Step Audio VST Effect Programming. In: Romero, J., Martins, T., Rodríguez-Fernández, N. (eds) Artificial Intelligence in Music, Sound, Art and Design. EvoMUSART 2021. Lecture Notes in Computer Science, vol 12693. Springer, Cham. https://doi.org/10.1007/978-3-030-72914-1_15

DOI: https://doi.org/10.1007/978-3-030-72914-1_15
Published: 02 April 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72913-4
Online ISBN: 978-3-030-72914-1
eBook Packages: Computer Science, Computer Science (R0)
