Faders, an online collaborative digital audio workstation

Since you’re reading this blog, you probably saw the recent announcement about Nemisindo (https://nemisindo.com), our new start-up offering sound design services based on procedural audio technology. This entry is about another start-up in the audio space arising from academic research: Faders (https://faders.io). We aren’t involved in Faders, but we know the team behind it very well and have worked with them on other projects.

Faders is a spin-out from Birmingham City University: an online, collaborative, intelligent digital audio workstation (DAW) that’s free to use and packed with smart features.

Faders is built around the idea of removing all barriers in audio production, without hiding the complexity or dumbing down the interface. The result is a user-friendly, powerful, and smart DAW that helps you by labelling tracks, doing rough mixes, trimming silence from recordings, … plus a collaborative web-based platform that can accommodate cutting-edge processors which don’t have to fit the single-track “plugin” mould of traditional DAWs.

Synth and FX plugins are based on the open-source JSAP framework (https://github.com/nickjillings/JSAP), which we contributed to a little [1,2] and have talked about before. A Freesound (https://freesound.org/) browser is integrated. Some of the tech in this platform is based on about a decade of academic research.

They’re looking for people to try it out and give feedback, and they have a forum at https://community.faders.io

They are especially keen to work with educators, and will be releasing an education mode where one can share files, follow up on student projects, and even teach remotely with video, all inside Faders. They’re also exploring all kinds of partnerships, from software instruments and plugin development to cross-promotion.

[1] N. Jillings, Y. Wang, R. Stables and J. D. Reiss, ‘Intelligent audio plugin framework for the Web Audio API,’ Web Audio Conference, London, 2017

[2] N. Jillings, Y. Wang, J. D. Reiss and R. Stables, “JSAP: A Plugin Standard for the Web Audio API with Intelligent Functionality,” 141st Audio Engineering Society Convention, Los Angeles, USA, 2016.


Invitation to online listening study

We would like to invite you to participate in our study titled “Investigation of frequency-specific loudness discomfort levels, in listeners with migraine-related hypersensitivity to sound”.

Please note: You do not have to be a migraine sufferer to participate in this study, although if you are, please make sure to specify that when asked during the study (for more on eligibility criteria, check the list below).

Our study consists of a brief questionnaire, followed by a simple listening test. This study is targeted towards listeners with and without migraine headaches and in order to participate you have to meet all of the following criteria:

1) Be 18 years old or older

2) Not have any history or diagnosis of hearing loss

3) Have access to a quiet room to take the test

4) Have access to a computer with an internet connection

5) Have access to a pair of functioning headphones

The total duration of the study is approximately 25 minutes. Your participation is voluntary but valuable: it could provide useful insight into the auditory manifestations of migraine, and help identify possible differences between participants with and without migraines, facilitating further research on sound adaptations for migraine sufferers.

To access the study please follow the link below:

https://golisten.ucd.ie/task/hearing-test/5ff5b8ee0a6da21ed8df2fc7

If you have any questions or would like to share your feedback on this study, please email a.mourgela@qmul.ac.uk or joshua.reiss@qmul.ac.uk.

AES President-Elect-Elect!

“Anyone who is capable of getting themselves made President should on no account be allowed to do the job.” ― Douglas Adams, The Hitchhiker’s Guide to the Galaxy

So I’m sure you’ve all been waiting for this presidential election to end. No, not that one. I’m referring to the Audio Engineering Society (AES)’s recent elections for their Board of Directors and Board of Governors.

And I’m very pleased and honored that I (that’s Josh Reiss, the main author of this blog) have been elected as President.

It’s actually three positions: in 2021 I’ll be President-Elect, in 2022 President, and in 2023 Past-President. Another way to look at it is that the AES always has three presidents: one planning for the future, one getting things done, and one imparting their experience and knowledge.

For those who don’t know, the AES is the largest professional society in audio engineering and related fields. It has over 12,000 members and is the only professional society devoted exclusively to audio technology. It was founded in 1948 and has grown into an international organisation that unites audio engineers, creative artists, scientists and students worldwide by promoting advances in audio and disseminating new knowledge and research.

My thanks to everyone who voted, to the AES in general, and to everyone who has said congratulations. And a big congratulations to all the other elected officers.

AES 148 – A Digital Vienna

Written jointly by Aggela Mourgela and JT Colonel

#VirtualVienna

The AES hosted its 148th Convention virtually this year. Despite the circumstances we find ourselves in due to Covid-19, the convention put up an excellent program filled with informative talks, tutorials, and demonstrations. Below is a round-up of our favourite presentations, which run the gamut from incredibly technical talks on finite arithmetic systems to highly creative demonstrations of an augmented reality installation.

Tuesday

The first session on Tuesday morning, Active Sensing and Slow Listening, was held by Thomas Lund and Susan E. Rogers, discussing the principles of active sensing and slow listening as well as their role in pro audio product development. Lund kicked the session off by introducing the theory behind sound cognition and discussing the afferent and efferent functions of the brain with regards to sound perception. The session was then picked up by Rogers, who described the auditory pathway and its bidirectionality in more detail, presenting the parts of the brain engaged in sonic cognition. Rogers touched on proprioception (the awareness of our bodies) and interoception (the awareness of our feelings), as well as the role of expectation when studying our responses to sound. To conclude, both presenters pointed out that we should not treat listening as passive or uni-dynamic; both external and internal factors influence the way we hear.

Diagram showing the development of the tympanic ear across different geologic eras, discussed in the Active Sensing and Slow Listening session.

Later in the day, Jamie Angus presented on Audio Signal Processing in the Real World: Dealing with the Effects of Finite Precision. At the center of the talk was a fundamental question: how does finite precision affect audio processing? Angus went into full detail on different finite precision arithmetics, e.g. fractional and floating-point, and derived how the noise introduced by these systems impacts filter design.
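
To get a feel for the kind of effect Angus described, here is a small Python sketch (my own toy example, not from the talk) that quantises the coefficients of a low-cutoff IIR filter to a 16-bit step size and compares the resulting magnitude response with the full-precision design:

```python
import numpy as np
from scipy import signal

# Design a 4th-order Butterworth low-pass with a very low cutoff;
# such narrow-band filters are especially sensitive to coefficient precision.
b, a = signal.butter(4, 0.02)

# Quantise coefficients to a 1/2**15 step size
# (a crude stand-in for 16-bit fixed-point storage).
bits = 15
bq = np.round(b * 2**bits) / 2**bits
aq = np.round(a * 2**bits) / 2**bits

# Compare magnitude responses of the ideal and quantised filters.
w, h = signal.freqz(b, a, worN=2048)
_, hq = signal.freqz(bq, aq, worN=2048)

err_db = 20 * np.log10(np.abs(h) + 1e-12) - 20 * np.log10(np.abs(hq) + 1e-12)
print(f"Max magnitude response deviation: {np.max(np.abs(err_db)):.2f} dB")
```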

The 3rd MATLAB Student Design Competition was hosted by Gabriele Bunkheila. Using the example of a stereo width expander, Bunkheila demonstrated the process of turning a simple offline MATLAB script into a real-time audioPlugin class, using MATLAB’s built-in audio test benching app. He then proceeded to talk about C++ code generation, validation, and export of the code into a VST plugin format for use in a conventional digital audio workstation. Bunkheila also demonstrated simple GUI generation using MATLAB’s audioPluginInterface functionality.
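
The MATLAB plugin code itself isn’t reproduced here, but the stereo width expander used as the running example boils down to simple mid/side scaling. A minimal NumPy sketch of that underlying idea (my own illustration, not the competition code):

```python
import numpy as np

def stereo_width(left: np.ndarray, right: np.ndarray, width: float = 1.5):
    """Scale the side (L-R) signal to widen or narrow the stereo image.

    width = 1.0 leaves the image unchanged, 0.0 collapses to mono,
    values > 1.0 widen it.
    """
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right) * width
    return mid + side, mid - side

# Toy usage: widen a short block of stereo noise.
rng = np.random.default_rng(0)
l, r = rng.standard_normal(1024), rng.standard_normal(1024)
l_out, r_out = stereo_width(l, r, width=1.5)
print(l_out.shape, r_out.shape)
```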

Wednesday

On Wednesday, Thomas Lund and Hyunkook Lee discussed the shift from stereo to immersive multi-channel audio in their talk Goodbye Stereo. First, Lund discussed the basics of spatial perception, the limitations of stereo in audio recording and reproduction, frequency-related aspects of spatial audio, and the standards being implemented in immersive audio. Lee went on to discuss the psychoacoustic principles that apply to immersive audio, as well as the differences between stereo and 3D. He expanded on limitations arising from microphones, due to placement or internal characteristics, and proceeded to discuss microphone array configurations that his research group is working on. The presentation was followed by a set of truly impressive immersive recordings, made in various venues with different microphone configurations, and the audience was prompted to use headphones to experience them. Lee finished by introducing a database of 3D recordings, which will include room impulse responses, to be made available for spatial audio research.

In his talk The Secret Life of Low Frequencies, Bruce Black discussed the trials and tribulations of acoustically treating rooms while paying special attention to their low-frequency response. Black discussed the particle propagation and wave propagation models of sound transmission, and how each requires different treatments. He called specific attention to how the attenuation of a sound across low frequencies can change over the course of 200–400 ms within a room. Black went on to show how Helmholtz resonators can be strategically placed in a space to smooth these uneven attenuations.
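
For reference, the frequency a Helmholtz absorber is tuned to follows the textbook formula f = (c/2π)·√(S/(V·L_eff)), where S is the neck area, V the cavity volume and L_eff the neck length with an end correction. A small Python sketch (standard acoustics, not taken from Black’s talk):

```python
import math

def helmholtz_frequency(neck_area_m2: float, neck_length_m: float,
                        cavity_volume_m3: float, c: float = 343.0) -> float:
    """Resonant frequency of a Helmholtz resonator.

    Uses a common end correction of ~1.7 * neck radius (assumes a flanged,
    circular neck); real-world absorbers need measurement and tuning.
    """
    radius = math.sqrt(neck_area_m2 / math.pi)
    effective_length = neck_length_m + 1.7 * radius
    return (c / (2 * math.pi)) * math.sqrt(
        neck_area_m2 / (cavity_volume_m3 * effective_length))

# Example: 5 cm diameter neck, 10 cm long, feeding a 50 litre cavity.
area = math.pi * 0.025**2
print(f"{helmholtz_frequency(area, 0.10, 0.050):.1f} Hz")
```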

Marisa Hoeschele gave a very interesting keynote lecture on Audio from a Biological Perspective. Hoeschele began by discussing the concept of addressing human sounds from the perspective of the “visiting alien”, where humans are studied as yet another species on the planet. She discussed observations on shared emotional information and how we can identify sonic attributes corresponding to stress level or excitement across species. She then proceeded to discuss the ways in which we can study musicality as an innate human characteristic, as well as commonalities across cultures. Hoeschele then discussed ways in which other animals can inform us about musicality, giving examples of experiments on animals’ ability to respond to musical attributes like octave equivalence, as well as searches for correlations with human behavior.

Thursday

On Thursday, Brian Gibbs gave a workshop on spatial audio mixing, using a demo mix of Queen’s Bohemian Rhapsody. He began his presentation with a short discussion of the basics of spatial audio, covering both recording spatial audio from scratch and spatialising audio recordings in the studio. Gibbs also talked about the AmbiX and FuMa formats, and discussed higher-order ambisonics and the MPEG-H format. He then proceeded to introduce the importance of loudness, giving a brief talk and demonstration of using LUFS in metering. Finally, he discussed the importance of being aware of the platform or format where your work is going to end up, emphasizing the different requirements of streaming services and devices. He ended the workshop with a listening session, where he presented Bohemian Rhapsody to his audience, alternating between mono, stereo and static 360 audio mixes.
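
On the LUFS point: if you want to try loudness metering yourself, here is a minimal Python sketch using the open-source pyloudnorm and soundfile packages (an assumed workflow of my own, not something shown in the workshop; the file name is a placeholder):

```python
import soundfile as sf       # pip install soundfile
import pyloudnorm as pyln    # pip install pyloudnorm

# Load a mix (any mono or stereo WAV file of your own).
data, rate = sf.read("my_mix.wav")

# Measure integrated loudness per ITU-R BS.1770.
meter = pyln.Meter(rate)
loudness = meter.integrated_loudness(data)
print(f"Integrated loudness: {loudness:.1f} LUFS")

# Normalise to a typical streaming target, e.g. -14 LUFS.
normalised = pyln.normalize.loudness(data, loudness, -14.0)
sf.write("my_mix_-14LUFS.wav", normalised, rate)
```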

Later, Thomas Aichinger presented Immersive Storytelling: Narrative Aspects in AR Audio Applications. Aichinger outlined the development and implementation of “SONIC TRACES,” an AR installation meant to run in Vienna to coincide with the convention, in which participants navigate an audio-based story set in the Heldenplatz. He described the difficulties his team overcame, such as GPS tracking of users in the plaza, incorporating six degrees of freedom in motion tracking, and tweaking signal attenuation to suit the narrative.

Thomas Aichinger’s rendering of the Heldenplatz in Unity, which is the game engine he and his team used to construct the AR experience

Friday

On the final day of the conference, Gabriele Bunkheila gave a keynote speech on Deep Learning for Audio Applications – Engineering Best Practices for Data. While deep learning applications are most frequently implemented in Python, Bunkheila made a compelling case for using MATLAB. He pointed out the discrepancies between deep learning approaches in academia and in industry: namely, that academia focuses primarily on theorizing and developing new models, whereas industry devotes much more time to dataset construction and scrubbing. Moreover, he argued that deep learning for audio should not necessarily follow the best practices laid out by the image recognition community. For example, when applying noise to an audio dataset, one ought to simulate the environment in which the deep learning system is expected to be deployed. So for audio applications it makes much more sense to obscure your signal with reverb or added speech rather than Gaussian noise.
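
To make that augmentation point concrete, here is a small Python sketch (my own example, not Bunkheila’s MATLAB workflow) that corrupts a clean training signal by convolving it with a room impulse response and mixing in interfering speech at a chosen SNR, rather than adding Gaussian noise:

```python
import numpy as np
from scipy.signal import fftconvolve

def augment(clean: np.ndarray, room_ir: np.ndarray,
            interferer: np.ndarray, snr_db: float = 10.0) -> np.ndarray:
    """Simulate a deployment environment: reverberate the clean signal,
    then add an interfering speech signal at the requested SNR."""
    reverbed = fftconvolve(clean, room_ir)[: len(clean)]

    # Scale the interferer so the reverberated signal sits snr_db above it.
    sig_power = np.mean(reverbed**2)
    noise_power = np.mean(interferer[: len(clean)]**2) + 1e-12
    gain = np.sqrt(sig_power / (noise_power * 10 ** (snr_db / 10)))
    return reverbed + gain * interferer[: len(clean)]

# Toy usage with synthetic signals (replace with real recordings and IRs).
rng = np.random.default_rng(1)
clean = rng.standard_normal(48000)     # stand-in for clean speech/audio
ir = np.exp(-np.linspace(0, 8, 24000)) * rng.standard_normal(24000)  # decaying "room"
speech = rng.standard_normal(48000)    # stand-in for interfering speech
augmented = augment(clean, ir, speech, snr_db=10.0)
print(augmented.shape)
```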

In Closing

Though this conference looked nothing like what anyone expected at the beginning of this year, the AES demonstrated its ability to adapt to new paradigms and technologies. Here’s to hoping that the AES will be able to resume in-person conferences soon. In the meantime, the AES will continue its strong tradition of providing a platform for groundbreaking audio technologies and educational opportunities virtually.

Funded PhD studentships available in Data-informed Audience-centric Media Engineering

So it’s been a while since I’ve written a blog post. Life, work and, of course, the Covid crisis have made my time limited. But hopefully I’ll write more frequently in future.

The good news is that there are fully funded PhD studentships which you or others you know might be interested in. They are all around the concept of Data-informed Audience-centric Media Engineering (DAME). See https://dame.qmul.ac.uk/ for details.

Three studentships are available. They are all fully-funded, for four years of study, based at Queen Mary University of London, and starting January 2021. Two of the proposed topics, ‘Media engineering for hearing-impaired audiences’ and ‘Intelligent systems for radio drama production’, are supported by the BBC and build on prior and ongoing work by my research team.

  • Media engineering for hearing-impaired audiences: This research proposes the exploration of ways in which media content can be automatically processed to deliver the content optimally for audiences with hearing loss. It builds on prior work by our group and the collaborator, BBC, in development of effective audio mixing techniques for broadcast audio enhancement [1,2,3]. It will form a deeper understanding of the effects of hearing loss on media content perception and enjoyment, as well as utilize this knowledge towards the development of intelligent audio production techniques and applications that could improve audio quality by providing efficient and customisable compensation. It aims to advance beyond current research [4], which does not yet fully take into account the artistic intent of the material, and requires an ‘ideal mix’ for normal hearing listeners. So a new approach that both removes constraints and is more focused on the meaning of the content is required. This approach will be derived from natural language processing and audio informatics, to prioritise sources and establish requirements for the preferred mix.
  • Intelligent systems for radio drama production: This research topic proposes methods for assisting a human creator in producing radio dramas. Radio drama consists of both literary aspects, such as plot, story characters and environments, and production aspects, such as speech, music, and sound effects. This project builds on recent, high-impact collaboration with the BBC [3, 5] to greatly advance the understanding of radio drama production, with the goal of devising and assessing intelligent technologies to aid in its creation. The project will first be concerned with investigating rules-based systems for generating production scripts from story outlines (a toy sketch of this idea follows below), and producing draft content from such scripts. It will consider existing workflows for content production and where such approaches rely on heavy manual labour. Evaluation will be with expert content producers, with the goal of creating new technologies that streamline workflows and facilitate the creative process.
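
As a toy illustration of the rules-based idea (purely hypothetical, and far simpler than anything the project would actually build), keywords in a story outline can be mapped to draft sound effect cues:

```python
import re

# Hypothetical keyword-to-cue rules; a real system would be far richer
# and validated against expert producers' decisions.
RULES = {
    r"\brain\b": "SFX: rain on window, low level, continuous",
    r"\bdoor\b": "SFX: interior door opens and closes",
    r"\b(phone|telephone)\b": "SFX: phone ringing, cut off on answer",
    r"\bnight\b": "ATMOS: quiet night-time room tone",
}

def draft_cues(outline: str) -> list[str]:
    """Scan a story outline and emit draft production cues, line by line."""
    cues = []
    for line in outline.splitlines():
        for pattern, cue in RULES.items():
            if re.search(pattern, line, flags=re.IGNORECASE):
                cues.append(f"{cue}   <- {line.strip()}")
    return cues

outline = """At night, rain beats against the kitchen window.
Anna hears the phone ring and rushes to the door."""
print("\n".join(draft_cues(outline)))
```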

If you or anyone you know is interested, please look at https://dame.qmul.ac.uk/ . Consider applying and feel free to ask me any questions.

[1] A. Mourgela, T. Agus and J. D. Reiss, “Perceptually Motivated Hearing Loss Simulation for Audio Mixing Reference,” 147th AES Convention, 2019.

[2] L. Ward et al., “Casualty Accessible and Enhanced (A&E) Audio: Trialling Object-Based Accessible TV Audio,” 147th Audio Engineering Society Convention, 2019.

[3] E. T. Chourdakis, L. Ward, M. Paradis and J. D. Reiss, “Modelling Experts’ Decisions on Assigning Narrative Importances of Objects in a Radio Drama Mix,” Digital Audio Effects Conference (DAFx), 2019.

[4] L. Ward and B. Shirley, “Personalization in Object-Based Audio for Accessibility: A Review of Advancements for Hearing Impaired Listeners,” Journal of the Audio Engineering Society, 67 (7/8), pp. 584–597, 2019.

[5] E. T. Chourdakis and J. D. Reiss, ‘From my pen to your ears: automatic production of radio plays from unstructured story text,’ 15th Sound and Music Computing Conference (SMC), Limassol, Cyprus, 4-7 July, 2018

Fellow of the Audio Engineering Society

The Audio Engineering Society’s Fellowship Award is given to ‘a member who has rendered conspicuous service or is recognized to have made a valuable contribution to the advancement in or dissemination of knowledge of audio engineering or in the promotion of its application in practice’.

Today at the 147th AES Convention, I was given the Fellowship Award for valuable contributions to, and for encouraging and guiding the next generation of researchers in, the development of audio and musical signal processing.

This is quite an honour, of which I’m very proud. And it puts me in some excellent company. A lot of greats have become Fellows of the AES (Manfred Schroeder, Vesa Välimäki, Poppy Crum, Bob Moog, Richard Heyser, Leslie Ann Jones, Gunther Thiele and Richard Small…), which also means I have a lot to live up to.

And thanks to the AES,

Josh Reiss

Nonlinear Audio Effects at ICASSP 2019

The Audio Engineering research team within the Centre for Digital Music is going to be present at ICASSP 2019.

Marco Martínez is presenting the paper ‘Modeling Nonlinear Audio Effects With End-to-end Deep Neural Networks‘, which can be found here.

Basically, nonlinear audio effects are widely used by musicians and sound engineers, yet most existing methods for nonlinear modeling are either simplified or optimized for one very specific circuit. In this work, we introduce a general-purpose deep learning architecture for generic black-box modeling of linear and nonlinear audio effects.

We show the model performing nonlinear modeling for distortion, overdrive, amplifier emulation, and combinations of linear and nonlinear audio effects.
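
For readers who want a feel for what ‘end-to-end’ means here, below is a deliberately tiny PyTorch sketch of black-box waveform-to-waveform modelling: a toy convolutional network trained to imitate a soft-clipping distortion. It is not the architecture from the paper, just an illustration of the general idea.

```python
import torch
import torch.nn as nn

# Toy black-box model: raw waveform in, raw waveform out.
model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=64, padding=32),
    nn.Tanh(),
    nn.Conv1d(16, 16, kernel_size=64, padding=32),
    nn.Tanh(),
    nn.Conv1d(16, 1, kernel_size=64, padding=32),
)

optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Stand-in "effect" to learn: a soft-clipping distortion applied to noise.
clean = torch.randn(8, 1, 4096)
distorted = torch.tanh(3.0 * clean)

for step in range(100):
    pred = model(clean)
    # Crop to a common length in case padding changes the output size slightly.
    n = min(pred.shape[-1], distorted.shape[-1])
    loss = loss_fn(pred[..., :n], distorted[..., :n])
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

print(f"final training loss: {loss.item():.5f}")
```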


You can listen to some audio samples here.

Details about the presentation:

Session: AASP-L6: Music Signal Analysis, Processing and Synthesis
Location: Meeting Room 1
Time: Thursday, May 16, 09:20 – 09:40 (Approximate)

Title: Modeling Nonlinear Audio Effects With End-to-end Deep Neural Networks
Authors: Marco A. Martinez Ramirez, Joshua D. Reiss

Congratulations Dr. Rod Selfridge!

This afternoon one of our PhD student researchers, Rod Selfridge, successfully defended his PhD. The form of these exams, or vivas, varies from country to country, and even institution to institution, as we discussed previously. Here, it’s pretty gruelling: behind closed doors, with two expert examiners probing every aspect of the PhD.

Rod’s PhD was on ‘Real-time sound synthesis of aeroacoustic sounds using physical models.’ Aeroacoustic sounds are those generated from turbulent fluid motion or aerodynamic forces, like wind whistling or the swoosh of a sword. But when researchers simulate such phenomena, they usually use computationally intensive approaches. If you need to analyse airplane noise, it might be okay to spend hours of computing time for a few seconds of sound, but you can’t use that approach in games or virtual reality. The alternative is procedural audio, which involves real-time and controllable sound generation. But that is usually not based on the actual physics that generated the sound; for complicated sounds, at best it is inspired by the physics.

Rod wondered if physical models could be implemented in a procedural audio context. For this, he took a fairly novel approach. Physical modelling often involves gridding up a space and looking at the interaction between each grid element, as in finite difference time domain methods. But there are equations explaining many aspects of aeroacoustics, so why not build them directly into the model? It’s like the difference between modelling a bouncing ball by building a dynamic model of the space in which it moves, versus just applying Newton’s laws of motion. Rod took the latter approach. Here’s a slick video summarising what the PhD is about.
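
To make that bouncing-ball analogy concrete, here’s a minimal Python sketch of the ‘just apply Newton’s laws’ approach (my own toy example, not from Rod’s thesis): a few lines of time stepping, rather than simulating the whole space the ball moves through.

```python
# Bouncing ball via Newton's laws: integrate the acceleration due to gravity
# directly, rather than simulating the space the ball moves through.
g = 9.81           # gravitational acceleration, m/s^2
restitution = 0.8  # fraction of speed kept on each bounce
dt = 0.001         # time step, s

y, v = 2.0, 0.0    # initial height (m) and velocity (m/s)
bounce_times = []

for step in range(20000):            # simulate 20 seconds
    v -= g * dt
    y += v * dt
    if y <= 0.0 and v < 0.0:         # hit the ground while moving down
        y = 0.0
        v = -restitution * v
        bounce_times.append(step * dt)

print("first few bounce times (s):", [round(t, 2) for t in bounce_times[:5]])
```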

It worked. He was able to build real-time, interactive physical models of propeller sounds, aeolian tones, cavity tones, edge tones, the aeolian harp and a bullroarer. He won the Silver Design Award and Best Student Paper Award at the 141st AES Convention, and the Best Paper Award at the Sound and Music Computing conference. And he produced some great demonstration videos of his work.

Rod also contributed a lot of great blog entries.

So congratulations to Dr. Rod Selfridge, and best of luck with his future endeavours. 🙂

This is the first blog entry I’ve written for a graduating PhD student. I really should do it for all of them; they’ve all been doing great stuff.

And finally, here’s a list of all Rod’s papers as a member of the Intelligent Sound Engineering team.

• R. Selfridge, D. Moffat, E. Avital and J. D. Reiss, ‘Creating Real-Time Aeroacoustic Sound Effects Using Physically Derived Models,’ Journal of the Audio Engineering Society, 66 (7/8), pp. 594–607, July/August 2018, DOI: https://doi.org/10.17743/jaes.2018.0033

• R. Selfridge, D. Moffat and J. D. Reiss, ‘Sound Synthesis of Objects Swinging through Air Using Physical Models,’ Applied Sciences, 7 (11), Nov. 2017, doi:10.3390/app7111177

• R. Selfridge, J. D. Reiss and E. Avital, ‘Physically Derived Synthesis Model of an Edge Tone,’ 144th Audio Engineering Society Convention, May 2018

• R. Selfridge, D. Moffat and J. D. Reiss, ‘Physically Derived Sound Synthesis Model of a Propeller,’ Audio Mostly, London, 2017

• R. Selfridge, D. Moffat and J. D. Reiss, ‘Physically Derived Synthesis Model of a Cavity Tone,’ Digital Audio Effects (DAFx) Conf., Edinburgh, September 5–9, 2017

• R. Selfridge, D. J. Moffat and J. D. Reiss, ‘Real-time physical model for synthesis of sword swing sounds,’ Best Paper Award, Sound and Music Computing (SMC), Helsinki, July 5–8, 2017

• R. Selfridge, D. J. Moffat, E. Avital and J. D. Reiss, ‘Real-time physical model of an Aeolian harp,’ 24th International Congress on Sound and Vibration (ICSV), London, July 23–27, 2017

• R. Selfridge, J. D. Reiss, E. Avital and X. Tang, ‘Physically derived synthesis model of aeolian tones,’ winner of the Best Student Paper Award, 141st Audio Engineering Society Convention, USA, 2016

• R. Selfridge and J. D. Reiss, ‘Interactive Mixing Using the Wii Controller,’ 130th Audio Engineering Society Convention, May 2011

Aeroacoustic Sound Effects – Journal Article

I am delighted to be able to announce that my article on Creating Real-Time Aeroacoustic Sound Effects Using Physically Informed Models is in this month’s Journal of the Audio Engineering Society. This is an invited article, following winning the best paper award at the Audio Engineering Society 141st Convention in LA. It is an open access article, so free for all to download!

The article extends the original paper by examining how the Aeolian tone synthesis models can be used to create a number of sound effects. The benefit of these models is that they produce plausible sound effects which operate in real time. Users are presented with a number of highly relevant parameters to control the effects, which can be mapped directly to 3D models within game engines.

The basics of the Aeolian tone were given in a previous blog post. To summarise, a tone is generated when air passes around an object and vortices are shed behind it. Fluid dynamic equations are available which allow a prediction of the tone frequency based on the physics of the interaction between the air and object. The Aeolian tone is modelled as a compact sound source.
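
The core of that prediction is the Strouhal relation between airspeed, object diameter and vortex-shedding frequency, f ≈ St·u/d, with St around 0.2 for a cylinder over a wide range of conditions. A minimal Python sketch of that relation (a simplification; the published models add much more detail):

```python
def aeolian_tone_hz(airspeed_ms: float, diameter_m: float,
                    strouhal: float = 0.2) -> float:
    """Predicted vortex-shedding (Aeolian tone) frequency for a cylinder.

    Uses the simple Strouhal relation f = St * u / d; St ~ 0.2 holds for a
    cylinder over a wide range of Reynolds numbers.
    """
    return strouhal * airspeed_ms / diameter_m

# Example: a 5 mm wire in a 10 m/s wind whistles at roughly 400 Hz.
print(f"{aeolian_tone_hz(10.0, 0.005):.0f} Hz")
```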

To model a sword or similar object a number of these compact sound sources are placed in a row. A previous blog post describes this in more detail. The majority of compact sound sources are placed at the tip as this is where the airspeed is greatest and the greatest sound is generated.

The behaviour of a sword being swung has to be modelled, and this behaviour is then used to control some of the parameters in the equations. It can be driven by a game engine, making fully integrated procedural audio models.

The sword model was extended to include objects like a baseball bat and golf club, as well as a broom handle. The compact sound source of a cavity tone was also added to replicate swords which have grooved profiles. Subjective evaluation gave excellent results, especially for thicker objects, which were perceived as being as plausible as pre-recorded samples.

The synthesis model could be extended to look at a range of sword cross sections as well as any influence of the material of the sword. It is envisaged that other sporting equipment which swing or fly through the air could be modelled using compact sound sources.

Propeller sounds are common in games and film, and are partially based on the sounds generated by the Aeolian tone and vortex shedding. As a blade passes through the air, vortices are shed at a specific frequency along its length. To model individual propeller blades, the profiles of a number of propellers were obtained, with specific span lengths (centre to tip) and chord lengths (leading edge to trailing edge).

Other major sound sources are the loading sounds generated by the torque and thrust. A procedure for modelling these sounds is outlined in the article. Missing from the propeller model are distortion sounds; these are more associated with rotors which turn in the horizontal plane.

An important sound when hearing a propeller-powered aircraft is the engine sound. The one used for this model was based on one of Andy Farnell’s from his book Designing Sound. Once complete, a user is able to select an aircraft from a pre-programmed bank and set the flight path. If linked to a game engine, the physical dimensions and flight paths can all be controlled procedurally.

Listening tests indicate that the synthesis model was as plausible as an alternative method but still not as plausible as pre-recorded samples. It is believed that results may have been more favourable if modelling electric-powered drones and aircraft which do not have the sound of a combustion engine.

The final model exploring the use of the Aeolian tone was that of an Aeolian harp. This is a musical instrument that is activated by wind blowing around its strings. The vortices shed behind a string can excite a mechanical vibration if they are around the frequency of one of the string’s natural harmonics, producing a distinctive sound.

The digital model allows a user to synthesise a harp of up to 13 strings. Tension, mass density, length and diameter can all be adjusted to replicate a wide variety of string materials and harp sizes. Users can also control a wind model, modified from one presented in Andy Farnell’s book Designing Sound, with control over the amount of gusts. Listening tests indicate that the sound is not as plausible as pre-recorded ones, but is as plausible as alternative synthesis methods.
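
A rough way to see when a harp string will ‘speak’ is to compare the vortex-shedding frequency from the Strouhal relation above with the string’s natural harmonics, f_n = (n/2L)·√(T/μ). A small Python sketch of that check (illustrative values only, not the article’s full model):

```python
import math

def string_harmonics(tension_n: float, mass_per_length: float,
                     length_m: float, n_harmonics: int = 13) -> list[float]:
    """Natural frequencies of an ideal string: f_n = (n / 2L) * sqrt(T / mu)."""
    f1 = math.sqrt(tension_n / mass_per_length) / (2.0 * length_m)
    return [n * f1 for n in range(1, n_harmonics + 1)]

def shedding_frequency(airspeed_ms: float, diameter_m: float,
                       strouhal: float = 0.2) -> float:
    """Vortex-shedding frequency behind the string (Strouhal relation)."""
    return strouhal * airspeed_ms / diameter_m

# Example string and wind (illustrative values, not from the article).
harmonics = string_harmonics(tension_n=60.0, mass_per_length=0.0008,
                             length_m=1.0)
f_shed = shedding_frequency(airspeed_ms=4.0, diameter_m=0.0008)

nearest = min(harmonics, key=lambda f: abs(f - f_shed))
print(f"shedding at {f_shed:.0f} Hz, nearest harmonic {nearest:.0f} Hz")
```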

The article describes the design processes in more detail as well as the fluid dynamic principles each was developed from. All models developed are open source and implemented in pure data. Links to these are in the paper as well as my previous publications. Demo videos can be found on YouTube.

FAST Industry Day, Thursday 25 October, 2 – 8 pm, Abbey Road Studios

FAST: Fusing Audio and Semantic Technologies for Intelligent Music Production and Consumption

Music’s changing fast: FAST is changing music. Showcasing the culmination of five years of digital music research, FAST (Fusing Audio and Semantic Technologies) is hosting an invite-only industry day at Abbey Road Studios on Thursday 25 October, 2 – 8 pm.

Presented by Professor Mark Sandler, Director of the Centre for Digital Music at Queen Mary University of London, the event will showcase to artists, journalists and industry professionals the next generation technologies that will shape the music industry – from production to consumption.

Projects on show include MusicLynx – a web AI app for journeys of discovery through the universe of music; Climb! – an intelligent game-based music composition and performance; FXive – a new start-up for sound effect synthesis; tools for interlinking composition and immersive experience, and many more. Plus, Grateful Live – a unique online fanzine for Grateful Dead fans to learn more about the band, the music and their concerts. All the research comes from the labs of 3 of the UK’s top universities:  Queen Mary’s Centre for Digital Music, Nottingham’s Mixed Reality Lab and Oxford’s e-Research Centre.

The programme will consist of an exciting afternoon and evening of talks and demonstrations, and an expert panel discussion.

Please contact Dr. Jasmina Bolfek-Radovani, FAST Programme Manager at fast-impact@qmul.ac.uk if you are interested in attending the event.

Capacity is limited and your place cannot be guaranteed.

For further details about the FAST project, visit: www.semanticaudio.ac.uk

You can also follow us on Twitter at @semanticaudio