Today one of our PhD student researchers, Marco Martinez Ramirez, successfully defended his PhD. The form of these exams, or vivas, varies from country to country, and even from institution to institution, as we discussed previously. Here, it's pretty gruelling: behind closed doors, with two expert examiners probing every aspect of the PhD. And it was made even more challenging this time because it was held entirely online due to the pandemic.
Marco’s PhD was on ‘Deep learning for audio effects modeling.’
Audio effects modeling is the process of emulating an audio effect unit; it seeks to recreate the sound, behaviour and main perceptual features of an analog reference device. Both digital and analog audio effect units transform characteristics of the sound source. These transformations can be linear or nonlinear, time-invariant or time-varying, and with short-term or long-term memory. The most typical audio effect transformations are based on dynamics, such as compression; tone, such as distortion; frequency, such as equalization; and time, such as artificial reverberation or modulation-based audio effects.
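These transformation categories can be illustrated in miniature. The sketch below (not from the thesis, just simple textbook examples) shows a nonlinear memoryless effect (tanh soft clipping, a crude distortion), a linear time-invariant effect with short-term memory (a one-pole lowpass filter), and a linear time-varying effect (tremolo, a modulation-based effect):

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr
x = 0.8 * np.sin(2 * np.pi * 220 * t)  # a 220 Hz test tone

# Nonlinear, memoryless (distortion-style): tanh soft clipping.
drive = 5.0
y_dist = np.tanh(drive * x) / np.tanh(drive)

# Linear, time-invariant, short-term memory (frequency-style):
# one-pole lowpass, y[n] = a*x[n] + (1 - a)*y[n-1].
a = 0.1
y_lp = np.empty_like(x)
state = 0.0
for n in range(len(x)):
    state = a * x[n] + (1 - a) * state
    y_lp[n] = state

# Linear, time-varying (modulation-style): tremolo via a 4 Hz LFO.
lfo = 0.5 * (1.0 + np.sin(2 * np.pi * 4 * t))
y_trem = x * lfo
```

Real analog units combine several of these behaviours at once, which is what makes them hard to model.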
Simulation of audio processors is normally done by designing mathematical models of these systems. It's very difficult because it seeks to accurately model all components within the effect unit, which usually contains mechanical elements together with nonlinear and time-varying analog electronics. Most audio effect models are either simplified or optimized for a specific circuit or effect, and cannot be efficiently translated to other effects.
Marco’s thesis explored deep learning architectures for audio processing in the context of audio effects modeling. He investigated deep neural networks as black-box modeling strategies, i.e. using only input-output measurements of the reference device. He proposed several DSP-informed deep learning models to emulate each type of audio effect transformation.
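The black-box idea can be sketched in a few lines: treat the reference device as opaque, collect input-output measurements, and fit a neural network to reproduce the mapping. Below is a deliberately tiny illustration, assuming a hypothetical memoryless soft clipper as the "unknown" device and a one-hidden-layer network trained by plain gradient descent; the actual architectures in the thesis are far more elaborate (convolutional, recurrent and DSP-informed):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "reference device": observable only via input-output pairs.
def reference_effect(x):
    return np.tanh(3.0 * x)

# Input-output measurements of the unknown device.
x = rng.uniform(-1.0, 1.0, size=(2048, 1))
y = reference_effect(x)

# A tiny one-hidden-layer network, trained by full-batch gradient descent.
W1 = rng.normal(0.0, 0.5, (1, 16)); b1 = np.zeros(16)
W2 = rng.normal(0.0, 0.5, (16, 1)); b2 = np.zeros(1)
lr = 0.1

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

_, y0 = forward(x)
loss0 = np.mean((y0 - y) ** 2)  # error before training

for step in range(3000):
    h, y_hat = forward(x)
    err = y_hat - y
    g_out = 2.0 * err / len(x)             # d(MSE)/d(y_hat)
    g_h = (g_out @ W2.T) * (1.0 - h ** 2)  # backprop through tanh
    W2 -= lr * (h.T @ g_out); b2 -= lr * g_out.sum(0)
    W1 -= lr * (x.T @ g_h);   b1 -= lr * g_h.sum(0)

_, y_final = forward(x)
loss = np.mean((y_final - y) ** 2)
print(f"MSE before: {loss0:.4f}, after: {loss:.4f}")
```

Nothing here uses knowledge of the device's internals; the network learns the transformation purely from the measured pairs, which is what makes the approach portable across very different effects.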
Marco then explored the performance of these models when modeling various analog audio effects, and analyzed how the given tasks are accomplished and what the models are actually learning. He investigated virtual analog models of nonlinear effects, such as a tube preamplifier; nonlinear effects with memory, such as a transistor-based limiter; and electromechanical nonlinear time-varying effects, such as a Leslie speaker cabinet and plate and spring reverberators.
Marco showed that the proposed deep learning architectures improve on the state of the art in black-box modeling of audio effects, and his thesis outlines directions for future work.
His research also led to a new start-up company, TONZ, which builds on his machine learning techniques to provide new audio processing interactions for the next generation of musicians and music makers.
Here’s a list of some of Marco’s papers that relate to his PhD research while a member of the Intelligent Sound Engineering team.
- M. A. Martinez Ramirez, E. Benetos, and J. D. Reiss, ‘Time-varying and nonlinear audio processing using deep neural networks,’ patent application number PCT/GB2020/051150, 2020.
- M. A. Martinez Ramirez, E. Benetos, and J. D. Reiss, ‘Deep learning for black-box modeling of audio effects,’ Applied Sciences, 10 (2), 638, 2020.
- M. A. Martinez Ramirez, E. Benetos, and J. D. Reiss, ‘Modeling plate and spring reverberation using a DSP-informed deep neural network,’ IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2020
- M. A. Martinez Ramirez, E. Benetos, and J. D. Reiss, ‘A general-purpose deep learning approach to model time-varying audio effects,’ Digital Audio Effects Conference (DAFx), 2019.
- M. A. Martinez Ramirez and J. D. Reiss, ‘Modeling Nonlinear Audio Effects with End-to-end Deep Neural Networks,’ IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019
- M. A. Martinez Ramirez and J. D. Reiss, ‘End-to-end equalization with convolutional neural networks,’ Digital Audio Effects (DAFx), Aveiro, Portugal, Sept. 4–8 2018.
- M. A. Martinez Ramirez and J. D. Reiss, ‘Stem Audio Mixing as a Content-Based Transformation of Audio Features,’ IEEE 19th International Workshop on Multimedia Signal Processing, Luton, UK, Oct. 16-18, 2017.
- M. A. Martinez Ramirez and J. D. Reiss, ‘Analysis and Prediction of the Audio Feature Space when Mixing Raw Recordings into Individual Stems,’ 143rd AES Convention, New York, Oct. 18-21, 2017.
- M. A. Martinez Ramirez and J. D. Reiss, ‘Deep Learning and Intelligent Audio Mixing,’ 3rd Workshop on Intelligent Music Production, Salford, UK, 15 September 2017.
- M. A. Martinez and J. D. Reiss, ‘Intelligent audio mixing using deep learning,’ DMRN+11: Digital Music Research Network Workshop, Dec. 2016.
Congratulations again, Marco!