What we did in 2018

2018 is coming to an end, and everyone is rushing to get their ‘Year in Review’ articles out. We’re no different in that regard. Only we’re going to do it in two parts: first, what we have been doing this year, and then a second blog entry reviewing all the great breakthroughs and interesting research results in audio engineering, psychoacoustics, sound synthesis and related fields.

But first, let’s talk about us. 🙂

I think we’ve all done some wonderful research this year, and the Audio Engineering team here can be proud of the results and progress.

Social Media:

First off, we’ve increased our social media presence tremendously:

• This blog, intelligentsoundengineering.wordpress.com/, has almost 22,000 views, with 1,711 followers, mostly through other social media.

• Our twitter account, twitter.com/IntelSoundEng, has 886 followers. Not huge, but growing and doing well for a research-focused feed.

• Our YouTube channel, www.youtube.com/user/IntelligentSoundEng, has over 20,000 views and 206 subscribers. Which reminds me, I’ve got some more videos to put up.

If you haven’t already, subscribe to the feeds and tell your friends 😉 .

Awards:

Last year’s three awards were exceptional. This year I won Queen Mary University of London’s Bruce Dickinson Entrepreneur of the Year award. Here’s a little video featuring all the shortlisted nominees (I start about 50 seconds in).

I gave the keynote talk at this year’s Digital Audio Effects Conference. And, not exactly an award but still a big deal, I gave my inaugural professorship lecture, titled ‘Do you hear what I hear? The science of everyday sounds’.

People:

This was the year everyone graduated!

David Moffat, Yonghao Wang, Dave Ronan, Josh Mycroft, and Rod Selfridge all successfully defended their PhDs. They did amazingly well and are all continuing to impress.

Parham Bahadoran and Tom Vassallo started exciting positions at AI Music, and Brecht de Man started with Semantic Audio. Expect great things from both those companies. There are lots of others who moved around, too many to mention.

Grants and projects:

We finished the Cross-adaptive processing for musical intervention project and the Autonomous Systems for Sound Integration and GeneratioN (ASSIGN) InnovateUK project. We’ve been working closely with industry on a variety of projects, especially with RPPtv, who are funding Emmanouil Chourdakis’s PhD and collaborated on InnovateUK projects. We are starting a very interesting ICASE Studentship with the BBC (more on that in another entry), and may soon start a studentship with Yamaha. We formed the spin-out company FXive, which hopefully will be able to launch a product soon.

Publications:

We had a great year for publications. I’ve listed all the ones I can think of below.

Journal articles, book chapter and conference papers

  1. Hu, W., Ma, T., Wang, Y., Xu, F., & Reiss, J. (2018). TDCS: a new scheduling framework for real-time multimedia OS. International Journal of Parallel, Emergent and Distributed Systems, 1-16.
  2. R. Selfridge, D. Moffat, E. Avital and J. D. Reiss, ‘Creating Real-Time Aeroacoustic Sound Effects Using Physically Derived Models,’ Journal of the Audio Engineering Society, 66 (7/8), pp. 594–607, July/August 2018, DOI: https://doi.org/10.17743/jaes.2018.0033
  3. J. D. Reiss, Ø. Brandtsegg, ‘Applications of cross-adaptive audio effects: automatic mixing, live performance and everything in between,’ Frontiers in Digital Humanities, 5 (17), 28 June 2018
  4. D. Moffat and J. D. Reiss, ‘Perceptual Evaluation of Synthesized Sound Effects,’ ACM Transactions on Applied Perception, 15 (2), April 2018
  5. Milo, Alessia, Nick Bryan-Kinns, and Joshua D. Reiss. “Graphical Research Tools for Acoustic Design Training: Capturing Perception in Architectural Settings” In Handbook of Research on Perception-Driven Approaches to Urban Assessment and Design, pp. 397-434. IGI Global, 2018.
  6. H. Peng and J. D. Reiss, ‘Why Can You Hear a Difference between Pouring Hot and Cold Water? An Investigation of Temperature Dependence in Psychoacoustics,’ 145th AES Convention, New York, Oct. 2018
  7. N. Jillings, B. De Man, R. Stables, J. D. Reiss, ‘Investigation into the Effects of Subjective Test Interface Choice on the Validity of Results.’ 145th AES Convention, New York, Oct. 2018
  8. P. Bahadoran, A. Benito, W. Buchanan and J. D. Reiss, “FXive: investigation and implementation of a sound effect synthesis service,” Amsterdam, International Broadcasting Convention (IBC), 2018
  9. M. A. Martinez Ramirez and J. D. Reiss, ‘End-to-end equalization with convolutional neural networks,’ Digital Audio Effects (DAFx), Aveiro, Portugal, Sept. 4–8 2018.
  10. D. Moffat and J. D. Reiss, “Objective Evaluations of Synthesised Environmental Sounds,” Digital Audio Effects (DAFx), Aveiro, Portugal, Sept. 4–8 2018
  11. W. J. Wilkinson, J. D. Reiss, D. Stowell, ‘A Generative Model for Natural Sounds Based on Latent Force Modelling,’ Arxiv pre-print version. International Conference on Latent Variable Analysis and Signal Separation, Guildford, UK, July 2018
  12. E. T. Chourdakis and J. D. Reiss, ‘From my pen to your ears: automatic production of radio plays from unstructured story text,’ 15th Sound and Music Computing Conference (SMC), Limassol, Cyprus, 4-7 July, 2018
  13. R. Selfridge, J. D. Reiss, E. Avital, Physically Derived Synthesis Model of an Edge Tone, Audio Engineering Society Convention 144, May 2018
  14. A. Pras, B. De Man, J. D Reiss, A Case Study of Cultural Influences on Mixing Practices, Audio Engineering Society Convention 144, May 2018
  15. J. Flynn, J. D. Reiss, Improving the Frequency Response Magnitude and Phase of Analogue-Matched Digital Filters, Audio Engineering Society Convention 144, May 2018
  16. P. Bahadoran, A. Benito, T. Vassallo, J. D. Reiss, FXive: A Web Platform for Procedural Sound Synthesis, Audio Engineering Society Convention 144, May 2018

 

See you in 2019!


Congratulations Dr. Rod Selfridge!

This afternoon one of our PhD student researchers, Rod Selfridge, successfully defended his PhD. The form of these exams, or vivas, varies from country to country, and even institution to institution, which we discussed previously. Here, it’s pretty gruelling: behind closed doors, with two expert examiners probing every aspect of the PhD.

Rod’s PhD was on ‘Real-time sound synthesis of aeroacoustic sounds using physical models.’ Aeroacoustic sounds are those generated from turbulent fluid motion or aerodynamic forces, like wind whistling or the swoosh of a sword. But when researchers simulate such phenomena, they usually use highly computational approaches. If you need to analyse airplane noise, it might be okay to spend hours of computing time for a few seconds of sound, but you can’t use that approach in games or virtual reality. The alternative is procedural audio, which involves real-time and controllable sound generation. But that is usually not based on the actual physics that generated the sound. For complicated sounds, at best it is inspired by the physics.

Rod wondered if physical models could be implemented in a procedural audio context. For this, he took a fairly novel approach. Physical modelling often involves gridding up a space and looking at the interaction between each grid element, such as in finite difference time domain methods. But there are equations explaining many aspects of aeroacoustics, so why not build them directly into the model? This is like the difference between modelling a bouncing ball by building a dynamic model of the space in which it could move, and just applying Newton’s laws of motion. Rod took the latter approach. Here’s a slick video summarising what the PhD is about.

It worked. He was able to apply real-time, interactive physical models of propeller sounds, aeolian tones, cavity tones, edge tones, the aeolian harp and a bullroarer. He won the Silver Design Award and Best Student Paper Award at the 141st AES Convention, and the Best Paper Award at the Sound and Music Computing conference. And he produced some great demonstration videos of his work.

Rod also contributed a lot of great blog entries.

So congratulations to Dr. Rod Selfridge, and best of luck with his future endeavours. 🙂

This is the first blog entry I’ve written for a graduating PhD student. I really should do it for all of them; they’ve all been doing great stuff.

And finally, here’s a list of all Rod’s papers as a member of the Intelligent Sound Engineering team.

·        R. Selfridge, D. Moffat, E. Avital and J. D. Reiss, ‘Creating Real-Time Aeroacoustic Sound Effects Using Physically Derived Models,’ Journal of the Audio Engineering Society, 66 (7/8), pp. 594–607, July/August 2018, DOI: https://doi.org/10.17743/jaes.2018.0033

·        R. Selfridge, D. Moffat and J. D. Reiss, ‘Sound Synthesis of Objects Swinging through Air Using Physical Models,’ Applied Sciences, v. 7 (11), Nov. 2017, Online version doi:10.3390/app7111177

·        R. Selfridge, J. D. Reiss, E. Avital, Physically Derived Synthesis Model of an Edge Tone, Audio Engineering Society Convention 144, May 2018

·        R. Selfridge, D. Moffat and J. D. Reiss, ‘Physically Derived Sound Synthesis Model of a Propeller,’ Audio Mostly, London, 2017

·        R. Selfridge, D. Moffat and J. D. Reiss, ‘Physically Derived Synthesis Model of a Cavity Tone,’ Digital Audio Effects (DAFx) Conf., Edinburgh, September 5–9, 2017

·        R. Selfridge, D. J. Moffat and J. D. Reiss, ‘Real-time physical model for synthesis of sword swing sounds,’ Best paper award, Sound and Music Computing (SMC), Helsinki, July 5-8, 2017.

·        R. Selfridge, D. J. Moffat, E. Avital, and J. D. Reiss, ‘Real-time physical model of an Aeolian harp,’ 24th International Congress on Sound and Vibration (ICSV), London, July 23-27, 2017.

·        R. Selfridge, J. D. Reiss, E. Avital, and X. Tang, “Physically derived synthesis model of aeolian tones,” winner of the Best Student Paper award, 141st Audio Engineering Society Convention USA, 2016.

·        R. Selfridge and J. D. Reiss, Interactive Mixing Using the Wii Controller, AES 130th Convention, May 2011.

Audiology and audio production PhD studentship available for UK residents

BBC R&D and Queen Mary University of London’s School of Electronic Engineering and Computer Science have an ICASE PhD studentship available for a talented researcher. It will involve researching the idea of intelligent mixing of broadcast audio content for hearing impaired audiences.

Perceptual Aspects of Broadcast Audio Mixing for Hearing Impaired Audiences

Project Description

This project will explore new approaches to audio production to address hearing loss, a growing concern with an aging population. The overall goal is to investigate, implement and validate original strategies for mixing broadcast content such that it can be delivered with improved perceptual quality for hearing impaired people.

Soundtracks for television and radio content typically have dialogue, sound effects and music mixed together with normal-hearing listeners in mind. But a hearing impairment may result in this final mix sounding muddy and cluttered. First, hearing aid strategies will be investigated, to establish their limitations and opportunities for improving upon them with object-based audio content. Then different mixing strategies will be implemented to counteract the hearing impairment. These strategies will be compared against each other in extensive listening tests, to establish preferred approaches to mixing broadcast audio content.

Requirements and details

This is a fully funded, 4-year studentship which includes tuition fees, a travel and consumables allowance, and a stipend covering living expenses.

Skills in signal processing, audio production and auditory models are preferred, though we encourage any interested and talented researchers to apply. A successful candidate will have an academic background in engineering, science or maths.

The student will be based in London. Time will be split between QMUL’s Audio Engineering team (the people behind this blog) in the Centre for Digital Music and BBC R&D South Lab, with a minimum of six months at each.

The preferred start date is January 2nd, 2019.
All potential candidates must meet UK residency requirements, e.g. normally EU citizen with long-term residence in the UK. Please check the regulations if you’re unsure.

If interested, please contact Prof. Josh Reiss at joshua.reiss@qmul.ac.uk .

Sneak preview of the research to be unveiled at the 145th Audio Engineering Society


We’ve made it a tradition on this blog to preview the technical program at the Audio Engineering Society Conventions, as we did with the 142nd, 143rd, and 144th AES Conventions. The 145th AES Convention is just around the corner, October 17 to 20 in New York. As before, the Audio Engineering research team behind this blog will be quite active at the convention.

These conventions have thousands of attendees, but aren’t so large that you get lost or overwhelmed. Away from the main exhibition hall is the Technical Program, which includes plenty of tutorials and presentations on cutting edge research.

So we’ve gathered together some information about a lot of the events that caught our eye as being unusual, exceptionally high quality, something we’re involved in or attending, or just worth mentioning. And this Convention will certainly live up to the hype. Plus, it’s a special one, the 70th anniversary of the founding of the AES.

By the way, I don’t think I mention a single loudspeaker paper below, but the Technical Program is full of them this time. You could have a full conference just on loudspeakers from them. If you want to become an expert on loudspeaker research, this is the place to be.

Anyway, let’s dive right in.

Wednesday, October 17th

We know different cultures listen to music differently, but do they listen to audio coding artifacts differently? Find out at 9:30 when Sascha Disch and co-authors present On the Influence of Cultural Differences on the Perception of Audio Coding Artifacts in Music.

ABX, AB, MUSHRA… so many choices for subjective evaluation and listening tests, so little time. Which one to use, which one gives the strongest results? Let’s put them all to the test while looking at the same question. This is what was done in Investigation into the Effects of Subjective Test Interface Choice on the Validity of Results, presented at 11:30. The results are strong, and surprising. Authors include former members of the team behind this blog, Nick Jillings and Brecht de Man, myself and frequent collaborator Ryan Stables.

From 10-11:30, Steve Fenton will be presenting the poster Automatic Mixing of Multitrack Material Using Modified Loudness Models. Automatic mixing is a really hot research area, one where we’ve made quite a few contributions. And a lot of it has involved loudness models for level balancing or fader settings. Someone really should do a review of all the papers focused on that, or better yet, a meta-analysis. Dr. Fenton and co-authors also have another poster in the same session, about a Real-Time System for the Measurement of Perceived Punch. Fenton’s PhD was about perception and modelling of punchiness in audio, and I suggested to him that the thesis should have just been titled ‘Punch!’

The researchers from Harman continue their analysis of headphone preference and quality with A Survey and Analysis of Consumer and Professional Headphones Based on Their Objective and Subjective Performances at 3:30. Harman obviously have a strong interest in this, but it’s rigorous, high-quality research, not promotion.

In the 3:00 to 4:30 poster session, Daniel Johnston presents a wonderful spatial audio application, SoundFields: A Mixed Reality Spatial Audio Game for Children with Autism Spectrum Disorder. I’m pretty sure this isn’t the quirky lo-fi singer/songwriter Daniel Johnston.

Thursday, October 18th

There’s something bizarre about the EBU R128 / ITU-R BS.1770 specification for loudness measurements. It doesn’t give the filter coefficients as a function of sample rate. So, for this and other reasons, even though the actual specification is just a few lines of code, you have to reverse engineer it if you’re doing it yourself, as was done here. At 10 am, Brecht de Man presents Evaluation of Implementations of the EBU R128 Loudness Measurement, which looks carefully at different implementations and provides full implementations in several programming languages.
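To make the "few lines of code" point concrete, here is a minimal sketch of an ungated, mono loudness measurement. It assumes a 48 kHz sample rate, because that is the only rate for which ITU-R BS.1770 tabulates the K-weighting filter coefficients; any other rate would need re-derived coefficients, which this sketch does not attempt. The function names are hypothetical, and the gating stage of the full integrated-loudness measurement is deliberately left out. This is not the implementation evaluated in the paper.

```python
import math

# K-weighting biquad coefficients as tabulated in ITU-R BS.1770 for 48 kHz only.
# Each entry is ((b0, b1, b2), (a0, a1, a2)).
SHELF = ((1.53512485958697, -2.69169618940638, 1.19839281085285),
         (1.0, -1.69065929318241, 0.73248077421585))
HIGHPASS = ((1.0, -2.0, 1.0),
            (1.0, -1.99004745483398, 0.99007225036621))


def biquad(samples, coeffs):
    """Apply one second-order IIR section (direct form I)."""
    (b0, b1, b2), (_, a1, a2) = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def ungated_loudness_mono(samples):
    """Ungated loudness in LKFS of a mono signal sampled at 48 kHz (assumed)."""
    weighted = biquad(biquad(samples, SHELF), HIGHPASS)
    mean_square = sum(v * v for v in weighted) / len(weighted)
    return -0.691 + 10.0 * math.log10(mean_square)


def sine(freq_hz, seconds, sample_rate=48000):
    """Full-scale test sine, used here only to check the measurement."""
    n = int(seconds * sample_rate)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
```

A handy sanity check is the one the recommendation itself describes: a full-scale 997 Hz sine in a single front channel should read close to -3.01 LKFS, so `ungated_loudness_mono(sine(997.0, 2.0))` should land near that value.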

Roughly one in six people in developed countries suffer some hearing impairment. If you think that seems too high, think how many wear glasses or contact lenses or have had eye surgery. And given the sound exposure, I’d expect the average to be higher among music producers. But we need good data on this. Thus, Laura Sinnott’s 3 pm presentation on Risk of Sound-Induced Hearing Disorders for Audio Post Production Engineers: A Preliminary Study is particularly relevant.

Some interesting posters in the 2:45 to 4:15 session. Maree Sheehan’s Audio Portraiture – The Sound of Identity, an Indigenous Artistic Enquiry uses 3D immersive and binaural sound to create audio portraits of Maori women. It’s a wonderful use of state of the art audio technologies for cultural and artistic study. Researchers from the University of Alcala in Madrid present an improved method to detect anger in speech in Precision Maximization in Anger Detection in Interactive Voice Response Systems.

Friday, October 19th

There are plenty of interesting papers this day, but only one I’m highlighting. By coincidence, it’s my own presentation of work with He Peng, on Why Can You Hear a Difference between Pouring Hot and Cold Water? An Investigation of Temperature Dependence in Psychoacoustics. This was inspired by the curious phenomenon and initial investigations described in a previous blog entry.

Saturday, October 20th

Get there early on Saturday to find out about audio branding from a designer’s perspective in the 9 am Creative Approach to Audio in Corporate Brand Experiences.

Object-based audio allows broadcasters to deliver separate channels for sound effects, music and dialog, which can then be remixed on the client-side. This has high potential for delivering better sound for the hearing-impaired, as described in Lauren Ward’s Accessible Object-Based Audio Using Hierarchical Narrative Importance Metadata at 9:45. I’ve heard this demonstrated by the way, and it sounds amazing.

A big challenge with spatial audio systems is the rendering of sounds that are close to the listener. Descriptions of such systems almost always begin with ‘assume the sound source is in the far field.’ In the 10:30 to 12:00 poster session, researchers from the Chinese Academy of Science present a real advance in this subject with Near-Field Compensated Higher-Order Ambisonics Using a Virtual Source Panning Method.

Rob Maher is one of the world’s leading audio forensics experts. At 1:30 in Audio Forensic Gunshot Analysis and Multilateration, he looks at how to answer the question ‘Who shot first?’ from audio recordings. As is often the case in audio forensics, I suspect this paper was motivated by real court cases.

When visual cues disagree with auditory cues, which ones do you believe? Or conversely, does low quality audio seem more realistic if strengthened by visual cues? These sorts of questions are investigated at 2 pm in the large international collaboration Influence of Visual Content on the Perceived Audio Quality in Virtual Reality. Audio Engineering Society Conventions are full of original research, but survey and review papers are certainly welcomed, especially ones like the thorough and insightful HRTF Individualization: A Survey, presented at 2:30.

Standard devices for measuring auditory brainstem response are typically designed to work only with clicks or tone bursts. A team of researchers from Gdansk developed A Device for Measuring Auditory Brainstem Responses to Audio, presented in the 2:30 to 4 pm poster session.

 

Hopefully, I can also give a wrap-up after the Convention, as we did here and here.

Aeroacoustic Sound Effects – Journal Article

I am delighted to be able to announce that my article on Creating Real-Time Aeroacoustic Sound Effects Using Physically Derived Models is in this month’s Journal of the Audio Engineering Society. This is an invited article following winning the best paper award at the Audio Engineering Society 141st Convention in LA. It is an open access article, so free for all to download!

The article extends the original paper by examining how the Aeolian tone synthesis models can be used to create a number of sound effects. The benefits of these models are that they produce plausible sound effects which operate in real-time. Users are presented with a number of highly relevant parameters to control the effects which can be mapped directly to 3D models within game engines.

The basics of the Aeolian tone were given in a previous blog post. To summarise, a tone is generated when air passes around an object and vortices are shed behind it. Fluid dynamic equations are available which allow a prediction of the tone frequency based on the physics of the interaction between the air and object. The Aeolian tone is modelled as a compact sound source.
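As a rough illustration of the kind of frequency prediction mentioned above, the textbook Strouhal relation f = St · u / d links the shedding frequency to airspeed u and object diameter d. The sketch below assumes a constant Strouhal number of about 0.2, a common approximation for a cylinder. The published models may well use a more detailed, Reynolds-number-dependent relation, and the function name is hypothetical.

```python
def aeolian_tone_frequency(airspeed_m_s, diameter_m, strouhal=0.2):
    """Approximate vortex-shedding (Aeolian tone) frequency in Hz: f = St * u / d.

    strouhal=0.2 is an assumed textbook value, not taken from the article.
    """
    if diameter_m <= 0:
        raise ValueError("diameter must be positive")
    return strouhal * airspeed_m_s / diameter_m
```

For example, air moving at 20 m/s past a 1 cm wide object gives `aeolian_tone_frequency(20.0, 0.01)`, roughly 400 Hz under these assumptions.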

To model a sword or similar object a number of these compact sound sources are placed in a row. A previous blog post describes this in more detail. The majority of compact sound sources are placed at the tip as this is where the airspeed is greatest and the greatest sound is generated.
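The row-of-sources idea can be sketched as follows. This is only an illustration under stated assumptions: the blade rotates rigidly about its hilt, each source's airspeed is angular speed times its distance from the pivot, the tone frequency follows the approximate Strouhal relation with St ≈ 0.2, and the quadratic spacing rule that crowds sources toward the tip is a hypothetical choice rather than the placement used in the article.

```python
def sword_sources(blade_length_m, blade_thickness_m, angular_speed_rad_s,
                  n_sources=8, strouhal=0.2):
    """Return (radius_m, airspeed_m_s, frequency_hz) for each compact source.

    Sources are spaced so that most of them sit near the tip, where the
    airspeed (and hence the radiated sound) is greatest.
    """
    if n_sources < 2:
        raise ValueError("need at least two sources")
    sources = []
    for i in range(n_sources):
        t = i / (n_sources - 1)                           # 0 at hilt, 1 at tip
        radius = blade_length_m * (1.0 - (1.0 - t) ** 2)  # hypothetical spacing, denser near tip
        airspeed = angular_speed_rad_s * radius           # rigid rotation about the hilt
        frequency = strouhal * airspeed / blade_thickness_m
        sources.append((radius, airspeed, frequency))
    return sources
```

In a game engine, the angular speed would come from the animation or motion controller each frame, so the per-source frequencies track the swing.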

The behaviour of a sword when being swung has to be modelled, and is then used to control some of the parameters in the equations. This behaviour can be controlled by a game engine, making fully integrated procedural audio models.

The sword model was extended to include objects like a baseball bat and golf club, as well as a broom handle. The compact sound source of a cavity tone was also added in to replicate swords which have grooved profiles. Subjective evaluation gave excellent results, especially for thicker objects, which were perceived to be as plausible as pre-recorded samples.

The synthesis model could be extended to look at a range of sword cross sections as well as any influence of the material of the sword. It is envisaged that other sporting equipment which swing or fly through the air could be modelled using compact sound sources.

A propeller sound is one which is common in games and film and is partially based on the sounds generated from the Aeolian tone and vortex shedding. As a blade passes through the air, vortices are shed at a specific frequency along the length. To model individual propeller blades, the profiles of a number of blades were obtained, with specific span lengths (centre to tip) and chord lengths (leading edge to trailing edge).

Another major sound source is the loading sounds generated by the torque and thrust. A procedure for modelling these sounds is outlined in the article. Missing from the propeller model are distortion sounds. These are more associated with rotors which turn in the horizontal plane.

An important sound when hearing a propeller powered aircraft is the engine sound. The one taken for this model was based on one of Andy Farnell’s from his book Designing Sound. Once complete a user is able to select an aircraft from a pre-programmed bank and set the flight path. If linked to a game engine the physical dimensions and flight paths can all be controlled procedurally.

Listening tests indicate that the synthesis model was as plausible as an alternative method but still not as plausible as pre-recorded samples. It is believed that results may have been more favourable if modelling electric-powered drones and aircraft which do not have the sound of a combustion engine.

The final model exploring the use of the Aeolian tone was that of an Aeolian Harp. This is a musical instrument that is activated by wind blowing around the strings. The vortices that are shed behind the string can activate a mechanical vibration if they are around the frequency of one of the string’s natural harmonics. This produces a distinctive sound.

The digital model allows a user to synthesise a harp of up to 13 strings. Tension, mass density, length and diameter can all be adjusted to replicate a wide variety of string material and harp size. Users can also control a wind model modified from one presented in Andy Farnell’s book Designing Sound, with control over the amount of gusts. Listening tests indicate that the sound is not as plausible as pre-recorded ones but is as plausible as alternative synthesis methods.
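To show how those string parameters interact with the wind, here is a small sketch using the ideal-string formula f_n = (n / 2L) · sqrt(T / μ) and the approximate Strouhal relation for the shedding frequency. It assumes "mass density" means volumetric density, so μ = ρ · π · d² / 4. The 5% matching tolerance, the Strouhal number of 0.2 and the function names are all illustrative assumptions, not values from the article.

```python
import math


def string_harmonics(tension_n, density_kg_m3, length_m, diameter_m, count=10):
    """Ideal-string harmonic frequencies in Hz: f_n = (n / 2L) * sqrt(T / mu)."""
    mu = density_kg_m3 * math.pi * diameter_m ** 2 / 4.0  # mass per unit length
    f1 = math.sqrt(tension_n / mu) / (2.0 * length_m)
    return [n * f1 for n in range(1, count + 1)]


def excited_harmonic(wind_speed_m_s, diameter_m, harmonics,
                     strouhal=0.2, tolerance=0.05):
    """Return the 1-based harmonic number the wind would excite, or None.

    A harmonic counts as excited when the vortex-shedding frequency lies
    within `tolerance` (a hypothetical fraction) of it.
    """
    shedding = strouhal * wind_speed_m_s / diameter_m
    nearest = min(harmonics, key=lambda f: abs(f - shedding))
    if abs(nearest - shedding) <= tolerance * shedding:
        return harmonics.index(nearest) + 1
    return None
```

As the wind speed changes, the shedding frequency sweeps across the harmonic series, which is one way to picture why a wind-driven string jumps between harmonics rather than gliding smoothly.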

The article describes the design processes in more detail as well as the fluid dynamic principles each was developed from. All models developed are open source and implemented in Pure Data. Links to these are in the paper as well as my previous publications. Demo videos can be found on YouTube.

FAST Industry Day, Thursday 25 October, 2 – 8 pm, Abbey Road Studios

FAST: Fusing Audio and Semantic Technologies for Intelligent Music Production and Consumption

Music’s changing fast: FAST is changing music. Showcasing the culmination of five years of digital music research, FAST (Fusing Audio and Semantic Technologies) is hosting an invite-only industry day at Abbey Road Studios on Thursday 25 October, 2 – 8 pm.

Presented by Professor Mark Sandler, Director of the Centre for Digital Music at Queen Mary University of London, the event will showcase to artists, journalists and industry professionals the next generation technologies that will shape the music industry – from production to consumption.

Projects on show include MusicLynx – a web AI app for journeys of discovery through the universe of music; Climb! – an intelligent game-based music composition and performance; FXive – a new start-up for sound effect synthesis; tools for interlinking composition and immersive experience, and many more. Plus, Grateful Live – a unique online fanzine for Grateful Dead fans to learn more about the band, the music and their concerts. All the research comes from the labs of 3 of the UK’s top universities:  Queen Mary’s Centre for Digital Music, Nottingham’s Mixed Reality Lab and Oxford’s e-Research Centre.

The programme will consist of an exciting afternoon and evening of talks and demonstrations, and an expert panel discussion.

Please contact Dr. Jasmina Bolfek-Radovani, FAST Programme Manager at fast-impact@qmul.ac.uk if you are interested in attending the event.

Capacity is limited and your place cannot be guaranteed.

For further details about the FAST project, visit: www.semanticaudio.ac.uk

You can also follow us on twitter @semanticaudio

Cross-adaptive audio effects: automatic mixing, live performance and everything in between

Our paper on Applications of cross-adaptive audio effects: automatic mixing, live performance and everything in between has just been published in Frontiers in Digital Humanities. It is a systematic review of cross-adaptive audio effects and their applications.

Cross-adaptive effects extend the boundaries of traditional audio effects by having many inputs and outputs, and deriving their behavior based on analysis of the signals and their interaction. This allows the audio effects to adapt to different material, seemingly being aware of what they do and listening to the signals. Here’s a block diagram showing how a cross-adaptive audio effect modifies a signal.

cross-adaptive architecture

Last year, we published a paper reviewing the history of automatic mixing, almost exactly ten years to the day from when automatic mixing was first extended beyond simple gain changes for speech applications. These automatic mixing applications rely on cross-adaptive effects, but the effects can do so much more.
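A toy example may help make the cross-adaptive idea concrete. In the sketch below, the gain applied to each track is derived from a feature (RMS level) extracted from all the tracks, so changing one input alters the processing of the others. It is a deliberately simple level-balancing rule with hypothetical function names, not the architecture or algorithm from the paper.

```python
import math


def rms(track):
    """Root-mean-square level of one track (a list of samples)."""
    return math.sqrt(sum(x * x for x in track) / len(track))


def cross_adaptive_gains(tracks, floor=1e-9):
    """Gains that bring every track to the average RMS level of all tracks.

    Each gain depends on the analysis of every input, not just its own.
    `floor` avoids division by zero for silent tracks.
    """
    levels = [max(rms(t), floor) for t in tracks]
    target = sum(levels) / len(levels)
    return [target / level for level in levels]


def apply_gains(tracks, gains):
    """Scale each track by its gain."""
    return [[g * x for x in t] for t, g in zip(tracks, gains)]
```

For two tracks with RMS levels of 1.0 and 0.5, the gains come out as 0.75 and 1.5, so both end up at 0.75. A plain adaptive effect would compute each gain from its own track alone; here the loud track is turned down because the other track is quiet.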

Here’s an example automatic mixing system from our YouTube channel, IntelligentSoundEng.

When a musician uses the signals of other performers directly to inform the timbral character of her own instrument, it enables a radical expansion of interaction during music making. Exploring this was the goal of the Cross-adaptive processing for musical intervention project, led by Oeyvind Brandtsegg, which we discussed in an earlier blog entry. Using cross-adaptive audio effects, musicians can exert control over the instruments and performance of other musicians, leading both to new competitive aspects and to new synergies.

Here’s a short video demonstrating this.

Despite various projects, research and applications involving cross-adaptive audio effects, there is still a fair amount of confusion surrounding the topic. There are multiple definitions, sometimes even by the same authors. So this paper gives a brief history of applications as well as a classification of effect types, and clarifies issues that have come up in earlier literature. It further defines the field, lays a formal framework, explores technical aspects and applications, and considers the future from artistic, perceptual, scientific and engineering perspectives.

Check it out!