Cool sound design and audio effects projects

Every year, I teach two classes (modules), Sound Design and Digital Audio Effects. In both classes, the final assignment asks students to create an original work involving audio programming and concepts taught in class. But the students also have a lot of free rein to experiment and explore their own ideas. Last year, I wrote a well-received blog entry about the projects.

The results are always great. Lots of really cool ideas, many of which could lead to a publication, or would be great to listen to regardless of the fact that they were assignments. Here are a few of the projects from this year.

From the Sound Design class:

  • A truly novel abstract sound synthesiser (amplitude and frequency modulation) whose parameters are controlled by pitch recognition and face recognition machine learning models, using the microphone and the webcam. Users could use their voice and move their face around to affect the sound (the basic modulation patch is sketched just after this list).
  • An impressive one had six sound models: rain, bouncing ball, sea waves, fire, wind and explosions. It also had a website where each synthesised sound could be compared against real recordings. We couldn’t always tell which was real and which was synthesised!
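As an aside on the first of those projects, here is a bare-bones sketch of the kind of amplitude and frequency modulation patch involved, using the Web Audio API. This is only an illustration: the fixed control values below stand in for whatever the pitch and face tracking models would output.

```javascript
// Bare-bones AM/FM patch (illustrative sketch only): in the student project,
// these fixed control values would instead be driven by the pitch and face
// tracking models.
const ctx = new AudioContext();

// Carrier oscillator.
const carrier = ctx.createOscillator();
carrier.frequency.value = 440;

// Frequency modulation: a modulator, scaled by a depth in Hz, is added to
// the carrier's frequency AudioParam.
const fmOsc = ctx.createOscillator();
const fmDepth = ctx.createGain();
fmOsc.frequency.value = 5;        // modulation rate in Hz
fmDepth.gain.value = 100;         // modulation depth in Hz
fmOsc.connect(fmDepth);
fmDepth.connect(carrier.frequency);

// Amplitude modulation: a slow oscillator wobbles the output gain.
const amGain = ctx.createGain();
const amOsc = ctx.createOscillator();
const amDepth = ctx.createGain();
amGain.gain.value = 0.5;          // centre gain
amDepth.gain.value = 0.5;         // tremolo depth
amOsc.frequency.value = 2;        // tremolo rate in Hz
amOsc.connect(amDepth);
amDepth.connect(amGain.gain);

carrier.connect(amGain);
amGain.connect(ctx.destination);
carrier.start(); fmOsc.start(); amOsc.start();
```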

[Image: SoundSelect]

  • An auditory model of a London Underground train, from the perspective of a passenger on a train, or waiting at a platform. It had a great animation.

[Animation: London Underground train]

  • Two projects involved creating interactive soundscapes auralising an image. One was based on a famous photo by the photographer Gregory Crewdson, encapsulating a dark side of suburban America through surreal, cinematic imagery. The other was a housing estate with no people visible, giving the impression of an eerie atmosphere in which background noises and small sounds are given prominence.

And from the Digital Audio Effects class:

  • A create-your-own distortion effect, where the user can interactively modify the waveshaping curve (a minimal waveshaper is sketched after this list).
  • An input-dependent modulation signal based on a physical mass-spring system.
  • A Swedish death metal guitar effect combining lots of effects for a very distinctive sound
  • A very creative all-in-one audio toy, ‘Ring delay’. This augmented ping-pong delay effect gives control over the panning of the delays, the equalization of the audio input and delays, and the output gain. Delays can be played backwards, and the output can be set out-of-phase. Finally, a ring modulator can modulate the audio input to create new sounds to be delayed.
  • Chordify, which transforms an incoming signal, ideally individual notes, into a chord of three different pitches.
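For anyone curious about the waveshaping mentioned in the first item above, here is a minimal sketch using the Web Audio API's WaveShaperNode. The tanh curve is just one common soft-clipping choice; in the student project the user edits the curve interactively.

```javascript
// Minimal waveshaping distortion sketch. The tanh curve is one common
// soft-clipping choice; in the student project the user edits the curve.
function makeDistortion(ctx, drive = 5, length = 1024) {
  const curve = new Float32Array(length);
  for (let i = 0; i < length; i++) {
    const x = (2 * i) / (length - 1) - 1;   // map table index to [-1, 1]
    curve[i] = Math.tanh(drive * x);        // soft-clipping transfer function
  }
  const shaper = ctx.createWaveShaper();
  shaper.curve = curve;
  shaper.oversample = '4x';                 // reduce aliasing from the nonlinearity
  return shaper;
}

// Usage: source.connect(makeDistortion(ctx)).connect(ctx.destination);
```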

[Image: Chordify]

  • An audio effects chain inspired by interning at a local radio station. The student helped the owner produce tracks using effects chain presets, but this producer’s understanding of compressors, EQ, distortion effects… was fairly limited. So the student recreated one of the effects chains as a plugin with only two adjustable parameters, each controlling multiple parameters inside.
  • Old Styler, a plug-in that applies a sort of ‘vintage’ effect so that the audio sounds like it is coming from an old radio or an old, black-and-white movie. Here’s how it sounds.
  • There were some advanced reverbs, including a VST implementation of a state-of-the-art reverberation algorithm known as a Scattering Delay Network (SDN), and a Church reverb incorporating some additional effects to get that ‘church sound’ just right.
  • A pretty amazing cave simulator, with both reverb and random water droplet sounds as part of the VST plug-in.

[Image: Cave Simulator]

  • A bit crusher, which also had noise, downsampling and filtering to allow lots of ways to degrade the signal.
  • A VST implementation of the Euclidean Algorithm for world rhythms as described by Godfried Toussaint in his paper The Euclidean Algorithm Generates Traditional Musical Rhythms (the pattern generation is sketched after this list).
  • A mid/side processor, with excellent analysis to verify that the student got the implementation just right (the basic encode/decode step is also sketched after this list).
  • A multi-functional distortion pedal. Guitarists often compose music in their bedroom and would benefit from an effect that helps fill out the song with a range of sounds traditionally belonging to other instruments. That’s what this plug-in did, using a lot of clever tricks to widen the soundstage of the guitar.
  • Related to the multi-functional distortion, two students created multiband distortion effects.
  • A Python project that separates a track into harmonic, percussive, and residual components which can be adjusted individually.
  • An effect that attempts to resynthesise any audio input with sine wave oscillators that take their frequencies from the well-tempered scale. This goes far beyond auto-tune, yet can be quite subtle.
  • A source separator plug-in based on Dan Barry’s ADRESS algorithm, described here and here. Along with Mikel Gainza, Dan Barry cofounded the company Sonic Ladder, which released the successful software Riffstation, based on their research.
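For the Euclidean rhythm project mentioned above, the pattern generation itself is compact. Here is one common formulation as an illustrative sketch: a Bresenham-style even distribution, which produces the Toussaint/Bjorklund patterns up to rotation.

```javascript
// Euclidean rhythm sketch: distribute `pulses` onsets as evenly as possible
// over `steps` positions. This Bresenham-style formulation yields the
// Toussaint/Bjorklund patterns up to rotation.
function euclideanRhythm(pulses, steps) {
  const pattern = [];
  for (let i = 0; i < steps; i++) {
    pattern.push((i * pulses) % steps < pulses ? 1 : 0);
  }
  return pattern;
}

console.log(euclideanRhythm(3, 8)); // [1,0,0,1,0,0,1,0] -> the Cuban tresillo
```

And for the mid/side processor, the encode and decode steps are only a couple of lines each; the hard part, as noted above, is verifying that levels and mono compatibility come out exactly right.

```javascript
// Mid/side encode and decode for one pair of samples.
const msEncode = (left, right) => [(left + right) / 2, (left - right) / 2]; // [mid, side]
const msDecode = (mid, side) => [mid + side, mid - side];                   // [left, right]
```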

There were many other interesting assignments, including several variations on tape emulation. But this selection really shows both the talent of the students and the possibilities to create new and interesting sounds.


What’s up with the Web Audio API?

Recently, we’ve been doing a lot of audio development for applications running in the browser, like the procedural audio and sound synthesis system FXive, or the Web Audio Evaluation Tool (WAET). The Web Audio API is part of HTML5, and it’s a high-level Application Programming Interface with a lot of built-in functions for processing and generating sound. The idea is that it’s what you need to have any audio application (audio effects, virtual instruments, editing and analysis tools…) running as JavaScript in a web browser.

It uses a dataflow model like LabVIEW and media-focused languages like Max/MSP, Pure Data and Reaktor. So you create oscillators, connect them to filters, combine them and then connect that to the output to play out the sound. But unlike the others, it’s not graphical, since you write it as JavaScript, like most code that runs client-side in a web browser.
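Here’s a minimal sketch of that dataflow style in practice (standard Web Audio API calls; the parameter values are arbitrary):

```javascript
// Minimal Web Audio dataflow sketch: oscillator -> filter -> output.
// Parameter values are arbitrary.
const ctx = new AudioContext();

const osc = ctx.createOscillator();
osc.type = 'sawtooth';
osc.frequency.value = 220;                // Hz

const filter = ctx.createBiquadFilter();  // the (resonant) built-in low pass
filter.type = 'lowpass';
filter.frequency.value = 800;             // cutoff in Hz

osc.connect(filter);
filter.connect(ctx.destination);          // play out of the speakers
osc.start();
```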

Sounds great, right? And it is. But there were a lot of strange choices that went into the API. They don’t make it unusable or anything like that, but they do sometimes leave you kicking in frustration and thinking the coding would be so much easier if only… Here are a few of them.

  • There’s no built-in noise signal generator. You can create sine waves, sawtooth waves, square waves… but not noise. Generating audio-rate random numbers is built in to pretty much every other audio development environment, and in almost every web audio application I’ve seen, the developers have redone it themselves, with ScriptProcessors, AudioWorklets, or buffered noise classes or methods (a minimal worklet version is sketched at the end of this section).
  • The low pass, high pass, low shelving and high shelving filters in the Web Audio API are not the standard first-order designs, as taught in signal processing and described in [1, 2] and lots of references within. The low pass and high pass are resonant second-order filters, and the shelving filters are the less common alternatives to the first-order designs. This is fine for a lot of cases where you are developing a new application with a bit of filtering, but it’s a major pain if you’re writing a web version of something written in MATLAB, Pure Data or lots and lots of other environments where the basic low pass and high pass filters are standard first-order designs (a first-order low pass built with the API’s own IIR filter node is sketched after this list).
  • The oscillators come with a detune property that represents detuning of the oscillation in hundredths of a semitone, or cents. I suppose it’s a nice feature if you are using cents on the interface and dealing with musical intervals. But it’s the same as changing the frequency parameter and doesn’t save a single line of code. There are other useful parameters which they didn’t give the ability to change, like phase, or the duty cycle of a square wave. https://github.com/Flarp/better-oscillator is an alternative implementation that addresses this.
  • The square, sawtooth and triangle waves are not what you think they are. Instead of, say, the triangle wave being a periodic ramp up and ramp down, each waveform is the sum of a few terms of the Fourier series that approximates it. This is nice if you want to avoid aliasing, but wrong for every other use. It took me a long time to figure this out when I tried modulating a signal by a square wave to turn it on and off. Again, https://github.com/Flarp/better-oscillator gives an alternative implementation with the actual waveforms.
  • The API also lets you create almost arbitrary infinite impulse response filters, via its IIR filter node. But you can’t change the coefficients once it’s created. So it’s useless for most web audio applications, which involve some control or interaction.
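To make the filter complaint concrete, here is a minimal sketch of a genuine first-order low pass built with the API’s IIR filter node, using the standard bilinear-transform design of the kind covered in [1, 2]. It also illustrates the last point above: the coefficients are fixed, so changing the cutoff means building and swapping in a new node.

```javascript
// A true first-order low pass built with createIIRFilter. Coefficients come
// from the bilinear transform of H(s) = wc / (s + wc), with prewarping.
// The catch, as noted above: IIRFilterNode coefficients are fixed at
// creation, so changing the cutoff means building and swapping in a new node.
function firstOrderLowpass(ctx, cutoffHz) {
  const K = Math.tan(Math.PI * cutoffHz / ctx.sampleRate);
  const norm = 1 / (K + 1);
  const feedforward = [K * norm, K * norm];   // b0, b1
  const feedback = [1, (K - 1) * norm];       // a0, a1
  return ctx.createIIRFilter(feedforward, feedback);
}

// Usage: source.connect(firstOrderLowpass(ctx, 1000)).connect(ctx.destination);
```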

Despite all that, it’s pretty amazing. And you can get around all these issues, since you can always write your own audio worklets for any audio processing and generation. But you shouldn’t have to.
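For example, here is roughly what the noise generator from the first bullet looks like as an AudioWorklet. This is a minimal sketch; the file name and processor name are just placeholders.

```javascript
// noise-processor.js (file and processor names are placeholders):
// a white noise generator as an AudioWorkletProcessor.
class WhiteNoiseProcessor extends AudioWorkletProcessor {
  process(inputs, outputs) {
    const output = outputs[0];
    for (const channel of output) {
      for (let i = 0; i < channel.length; i++) {
        channel[i] = Math.random() * 2 - 1;   // uniform white noise in [-1, 1)
      }
    }
    return true;                              // keep the node alive
  }
}
registerProcessor('white-noise', WhiteNoiseProcessor);
```

```javascript
// Main thread (inside an async function or a module):
const ctx = new AudioContext();
await ctx.audioWorklet.addModule('noise-processor.js');
const noise = new AudioWorkletNode(ctx, 'white-noise');
noise.connect(ctx.destination);
```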

We’ve published a few papers on the Web Audio API and what you can do with it, so please check them out if you are doing some R&D involving it.

 

[1] J. D. Reiss and A. P. McPherson, “Audio Effects: Theory, Implementation and Application“, CRC Press, 2014.

[2] V. Valimaki and J. D. Reiss, ‘All About Audio Equalization: Solutions and Frontiers,’ Applied Sciences, special issue on Audio Signal Processing, 6 (5), May 2016.

[3] P. Bahadoran, A. Benito, T. Vassallo, J. D. Reiss, FXive: A Web Platform for Procedural Sound Synthesis, Audio Engineering Society Convention 144, May 2018

[4] N. Jillings, Y. Wang, R. Stables and J. D. Reiss, ‘Intelligent audio plugin framework for the Web Audio API,’ Web Audio Conference, London, 2017

[5] N. Jillings, Y. Wang, J. D. Reiss and R. Stables, “JSAP: A Plugin Standard for the Web Audio API with Intelligent Functionality,” 141st Audio Engineering Society Convention, Los Angeles, USA, 2016.

[6] N. Jillings, D. Moffat, B. De Man, J. D. Reiss, R. Stables, ‘Web Audio Evaluation Tool: A framework for subjective assessment of audio,’ 2nd Web Audio Conf., Atlanta, 2016

[7] N. Jillings, B. De Man, D. Moffat and J. D. Reiss, ‘Web Audio Evaluation Tool: A Browser-Based Listening Test Environment,’ Sound and Music Computing (SMC), July 26 – Aug. 1, 2015

What we did in 2018

2018 is coming to an end, and everyone is rushing to get their ‘Year in Review’ articles out. We’re no different in that regard. Only we’re going to do it in two parts, first what we have been doing this year, and then a second blog entry reviewing all the great breakthroughs and interesting research results in audio engineering, psychoacoustics, sound synthesis and related fields.

But first, let’s talk about us. 🙂

I think we’ve all done some wonderful research this year, and the Audio Engineering team here can be proud of the results and progress.

Social Media:

First off, we’ve increased our social media presence tremendously,

• This blog, intelligentsoundengineering.wordpress.com/ has almost 22,000 views, with 1,711 followers, mostly through other social media.

• Our twitter account, twitter.com/IntelSoundEng has 886 followers. Not huge, but growing and doing well for a research-focused feed.

• Our Youtube channel, www.youtube.com/user/IntelligentSoundEng has over 20,000 views and 206 subscribers. Which reminds me, I’ve got some more videos to put up.

If you haven’t already, subscribe to the feeds and tell your friends 😉 .

Awards:

Last year’s three awards were exceptional. This year I won Queen Mary University of London’s Bruce Dickinson Entrepreneur of the Year award. Here’s a little video featuring all the shortlisted nominees (I start about 50 seconds in).

I gave the keynote talk at this year’s Digital Audio Effects Conference. And, not exactly an award but still a big deal, I gave my inaugural professorship lecture, titled ‘Do you hear what I hear? The science of everyday sounds’.

People:

This was the year everyone graduated!

David Moffat, Yonghao Wang, Dave Ronan, Josh Mycroft, and Rod Selfridge all successfully defended their PhDs. They did amazing work and are all continuing to impress.

Parham Bahadoran and Tom Vassallo started exciting positions at AI Music, and Brecht de Man started with Semantic Audio. Expect great things from both those companies. There are lots of others who moved around, too many to mention.

Grants and projects:

We finished the Cross-adaptive processing for musical intervention project and the Autonomous Systems for Sound Integration and GeneratioN (ASSIGN) InnovateUK project. We’ve been working closely with industry on a variety of projects, especially with RPPtv, who are funding Emmanouil Chourdakis’s PhD and collaborated on InnovateUK projects. We are starting a very interesting ICASE Studentship with the BBC (more on that in another entry), and may soon start a studentship with Yamaha. We formed the spin-out company FXive, which hopefully will be able to launch a product soon.

Publications:

We had a great year for publications. I’ve listed all the ones I can think of below.

Journal articles, book chapters and conference papers

  1. Hu, W., Ma, T., Wang, Y., Xu, F., & Reiss, J. (2018). TDCS: a new scheduling framework for real-time multimedia OS. International Journal of Parallel, Emergent and Distributed Systems, 1-16.
  2. R. Selfridge, D. Moffat, E. Avital and J. D. Reiss, ‘Creating Real-Time Aeroacoustic Sound Effects Using Physically Derived Models,’ Journal of the Audio Engineering Society, 66 (7/8), pp. 594–607, July/August 2018, DOI: https://doi.org/10.17743/jaes.2018.0033
  3. J. D. Reiss, Ø. Brandtsegg, ‘Applications of cross-adaptive audio effects: automatic mixing, live performance and everything in between,’ Frontiers in Digital Humanities, 5 (17), 28 June 2018
  4. D. Moffat and J. D. Reiss, ‘Perceptual Evaluation of Synthesized Sound Effects,’ ACM Transactions on Applied Perception, 15 (2), April 2018
  5. Milo, Alessia, Nick Bryan-Kinns, and Joshua D. Reiss. “Graphical Research Tools for Acoustic Design Training: Capturing Perception in Architectural Settings” In Handbook of Research on Perception-Driven Approaches to Urban Assessment and Design, pp. 397-434. IGI Global, 2018.
  6. H. Peng and J. D. Reiss, ‘Why Can You Hear a Difference between Pouring Hot and Cold Water? An Investigation of Temperature Dependence in Psychoacoustics,’ 145th AES Convention, New York, Oct. 2018
  7. N. Jillings, B. De Man, R. Stables, J. D. Reiss, ‘Investigation into the Effects of Subjective Test Interface Choice on the Validity of Results.’ 145th AES Convention, New York, Oct. 2018
  8. P. Bahadoran, A. Benito, W. Buchanan and J. D. Reiss, “FXive: investigation and implementation of a sound effect synthesis service,” Amsterdam, International Broadcasting Convention (IBC), 2018
  9. M. A. Martinez Ramirez and J. D. Reiss, ‘End-to-end equalization with convolutional neural networks,’ Digital Audio Effects (DAFx), Aveiro, Portugal, Sept. 4–8 2018.
  10. D. Moffat and J. D. Reiss, “Objective Evaluations of Synthesised Environmental Sounds,” Digital Audio Effects (DAFx), Aveiro, Portugal, Sept. 4–8 2018
  11. W. J. Wilkinson, J. D. Reiss, D. Stowell, ‘A Generative Model for Natural Sounds Based on Latent Force Modelling,’ Arxiv pre-print version. International Conference on Latent Variable Analysis and Signal Separation, Guildford, UK, July 2018
  12. E. T. Chourdakis and J. D. Reiss, ‘From my pen to your ears: automatic production of radio plays from unstructured story text,’ 15th Sound and Music Computing Conference (SMC), Limassol, Cyprus, 4-7 July, 2018
  13. R. Selfridge, J. D. Reiss, E. Avital, Physically Derived Synthesis Model of an Edge Tone, Audio Engineering Society Convention 144, May 2018
  14. A. Pras, B. De Man, J. D Reiss, A Case Study of Cultural Influences on Mixing Practices, Audio Engineering Society Convention 144, May 2018
  15. J. Flynn, J. D. Reiss, Improving the Frequency Response Magnitude and Phase of Analogue-Matched Digital Filters, Audio Engineering Society Convention 144, May 2018
  16. P. Bahadoran, A. Benito, T. Vassallo, J. D. Reiss, FXive: A Web Platform for Procedural Sound Synthesis, Audio Engineering Society Convention 144, May 2018

 

See you in 2019!

Congratulations Dr. Rod Selfridge!

This afternoon one of our PhD student researchers, Rod Selfridge, successfully defended his PhD. The form of these exams, or vivas, varies from country to country, and even institution to institution, which we discussed previously. Here, it’s pretty gruelling: behind closed doors, with two expert examiners probing every aspect of the PhD.

Rod’s PhD was on ‘Real-time sound synthesis of aeroacoustic sounds using physical models.’ Aeroacoustic sounds are those generated from turbulent fluid motion or aerodynamic forces, like wind whistling or the swoosh of a sword. But when researchers simulate such phenomena, they usually use highly computational approaches. If you need to analyse airplane noise, it might be okay to spend hours of computing time for a few seconds of sound, but you can’t use that approach in games or virtual reality. The alternative is procedural audio, which involves real-time and controllable sound generation. But that is usually not based on the actual physics that generated the sound. For complicated sounds, at best it is inspired by the physics.

Rod wondered if physical models could be implemented in a procedural audio context. For this, he took a fairly novel approach. Physical modelling often involves gridding up a space and looking at the interactions between grid elements, such as in finite difference time domain methods. But there are equations explaining many aspects of aeroacoustics, so why not build them directly into the model? This is like the choice between modelling a bouncing ball by building a dynamic model of the space in which it moves, or just applying Newton’s laws of motion; Rod took the latter approach. Here’s a slick video summarising what the PhD is about.
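To give a flavour of the ‘build the equations directly into the model’ idea, here is a tiny illustrative sketch of my own (not Rod’s actual model): the fundamental frequency of an Aeolian tone from a cylinder, such as a wire in the wind, is roughly f = St·u/d, where u is the airspeed, d the diameter, and the Strouhal number St is about 0.2. You can drive an oscillator straight from that relation.

```javascript
// Illustrative sketch only (not the model from the thesis): drive an
// oscillator at the fundamental Aeolian tone frequency f = St * u / d,
// with Strouhal number St of roughly 0.2 for a cylinder.
function aeolianToneHz(airspeedMs, diameterM, strouhal = 0.2) {
  return strouhal * airspeedMs / diameterM;
}

const ctx = new AudioContext();
const osc = ctx.createOscillator();
osc.frequency.value = aeolianToneHz(10, 0.005);  // 10 m/s wind, 5 mm wire -> 400 Hz
osc.connect(ctx.destination);
osc.start();
```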

The approach worked. He built real-time, interactive physical models of propeller sounds, aeolian tones, cavity tones, edge tones, the aeolian harp and a bullroarer. He won the Silver Design Award and Best Student Paper Award at the 141st AES Convention, and the Best Paper Award at the Sound and Music Computing conference. And he produced some great demonstration videos of his work.


Rod also contributed a lot of great blog entries.

So congratulations to Dr. Rod Selfridge, and best of luck with his future endeavours. 🙂

This is the first blog entry I’ve written for a graduating PhD student. I really should do it for all of them; they’ve all been doing great stuff.

And finally, here’s a list of all Rod’s papers as a member of the Intelligent Sound Engineering team.

·        R. Selfridge, D. Moffat, E. Avital and J. D. Reiss, ‘Creating Real-Time Aeroacoustic Sound Effects Using Physically Derived Models,’ Journal of the Audio Engineering Society, 66 (7/8), pp. 594–607, July/August 2018, DOI: https://doi.org/10.17743/jaes.2018.0033

·        R. Selfridge, D. Moffat and J. D. Reiss, ‘Sound Synthesis of Objects Swinging through Air Using Physical Models,’ Applied Sciences, v. 7 (11), Nov. 2017, Online version doi:10.3390/app7111177

·        R. Selfridge, J. D. Reiss, E. Avital, Physically Derived Synthesis Model of an Edge Tone, Audio Engineering Society Convention 144, May 2018

·        R. Selfridge, D. Moffat and J. D. Reiss, ‘Physically Derived Sound Synthesis Model of a Propeller,’ Audio Mostly, London, 2017

·        R. Selfridge, D. Moffat and J. D. Reiss, ‘Physically Derived Synthesis Model of a Cavity Tone,’ Digital Audio Effects (DAFx) Conf., Edinburgh, September 5–9, 2017

·        R. Selfridge, D. J. Moffat and J. D. Reiss, ‘Real-time physical model for synthesis of sword swing sounds,’ Best paper award, Sound and Music Computing (SMC), Helsinki, July 5-8, 2017.

·        R. Selfridge, D. J. Moffat, E. Avital, and J. D. Reiss, ‘Real-time physical model of an Aeolian harp,’ 24th International Congress on Sound and Vibration (ICSV), London, July 23-27, 2017.

·        R. Selfridge, J. D. Reiss, E. Avital, and X. Tang, “Physically derived synthesis model of aeolian tones,” winner of the Best Student Paper award, 141st Audio Engineering Society Convention USA, 2016.

·        R. Selfridge and J. D. Reiss, Interactive Mixing Using the Wii Controller, AES 130th Convention, May 2011.

Audiology and audio production PhD studentship available for UK residents

BBC R&D and Queen Mary University of London’s School of Electronic Engineering and Computer Science have an ICASE PhD studentship available for a talented researcher. It will involve researching the idea of intelligent mixing of broadcast audio content for hearing impaired audiences.

Perceptual Aspects of Broadcast Audio Mixing for Hearing Impaired Audiences

Project Description

This project will explore new approaches to audio production to address hearing loss, a growing concern with an aging population. The overall goal is to investigate, implement and validate original strategies for mixing broadcast content such that it can be delivered with improved perceptual quality for hearing impaired people.

Soundtracks for television and radio content typically have dialogue, sound effects and music mixed together with normal-hearing listeners in mind. But a hearing impairment may result in this final mix sounding muddy and cluttered. First, hearing aid strategies will be investigated, to establish their limitations and opportunities for improving upon them with object-based audio content. Then different mixing strategies will be implemented to counteract the hearing impairment. These strategies will be compared against each other in extensive listening tests, to establish preferred approaches to mixing broadcast audio content.

Requirements and details

This is a fully funded, 4-year studentship which includes tuition fees, a travel and consumables allowance, and a stipend covering living expenses.

Skills in signal processing, audio production and auditory models are preferred, though we encourage any interested and talented researchers to apply. A successful candidate will have an academic background in engineering, science or maths.

The student will be based in London. Time will be spent between QMUL’s Audio Engineering team (the people behind this blog) in the Centre for Digital Music and BBC R&D South Lab, with a minimum of six months at each.

The preferred start date is January 2nd, 2019.
All potential candidates must meet UK residency requirements, e.g. normally EU citizen with long-term residence in the UK. Please check the regulations if you’re unsure.

If interested, please contact Prof. Josh Reiss at joshua.reiss@qmul.ac.uk .

Sneak preview of the research to be unveiled at the 145th Audio Engineering Society Convention


We’ve made it a tradition on this blog to preview the technical program at the Audio Engineering Society Conventions, as we did with the 142nd, 143rd, and 144th AES Conventions. The 145th AES Convention is just around the corner, October 17 to 20 in New York. As before, the Audio Engineering research team behind this blog will be quite active at the convention.

These conventions have thousands of attendees, but aren’t so large that you get lost or overwhelmed. Away from the main exhibition hall is the Technical Program, which includes plenty of tutorials and presentations on cutting edge research.

So we’ve gathered together some information about a lot of the events that caught our eye as being unusual, exceptionally high quality, something we’re involved in or attending, or just worth mentioning. And this Convention will certainly live up to the hype. Plus, it’s a special one, the 70th anniversary of the founding of the AES.

By the way, I don’t think I mention a single loudspeaker paper below, but the Technical Program is full of them this time. You could have a full conference just on loudspeakers from them. If you want to become an expert on loudspeaker research, this is the place to be.

Anyway, let’s dive right in.

Wednesday, October 17th

We know different cultures listen to music differently, but do they listen to audio coding artifacts differently? Find out at 9:30 when Sascha Disch and co-authors present On the Influence of Cultural Differences on the Perception of Audio Coding Artifacts in Music.

ABX, AB, MUSHRA… so many choices for subjective evaluation and listening tests, so little time. Which one to use, which one gives the strongest results? Let’s put them all to the test while looking at the same question. This is what was done in Investigation into the Effects of Subjective Test Interface Choice on the Validity of Results, presented at 11:30. The results are strong, and surprising. Authors include former members of the team behind this blog, Nick Jillings and Brecht de Man, myself and frequent collaborator Ryan Stables.

From 10-11:30, Steve Fenton will be presenting the poster Automatic Mixing of Multitrack Material Using Modified Loudness Models. Automatic mixing is a really hot research area, one where we’ve made quite a few contributions. And a lot of it has involved loudness models for level balancing or fader settings. Someone really should do a review of all the papers focused on that, or better yet, a meta-analysis. Dr. Fenton and co-authors also have another poster in the same session, about a Real-Time System for the Measurement of Perceived Punch. Fenton’s PhD was about perception and modelling of punchiness in audio, and I suggested to him that the thesis should have just been titled ‘Punch!’

The researchers from Harman continue their analysis of headphone preference and quality with A Survey and Analysis of Consumer and Professional Headphones Based on Their Objective and Subjective Performances at 3:30. Harman obviously have a strong interest in this, but it’s rigorous, high-quality research, not promotion.

In the 3:00 to 4:30 poster session, Daniel Johnston presents a wonderful spatial audio application, SoundFields: A Mixed Reality Spatial Audio Game for Children with Autism Spectrum Disorder. I’m pretty sure this isn’t the quirky lo-fi singer/songwriter Daniel Johnston.

Thursday, October 18th

There’s something bizarre about the EBU R128 / ITU-R BS.1770 specification for loudness measurements. It doesn’t give the filter coefficients as a function of sample rate. So, for this and other reasons, even though the actual specification is just a few lines of code, you have to reverse engineer it if you’re doing it yourself, as was done here. At 10 am, Brecht de Man presents Evaluation of Implementations of the EBU R128 Loudness Measurement, which looks carefully at different implementations and provides full implementations in several programming languages.
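To make that concrete, here is a rough sketch of the two-stage gating used for the integrated loudness measurement, assuming the K-weighted, channel-weighted mean-square block powers have already been computed. This is my own illustrative summary of the BS.1770 procedure, not the code evaluated in the paper.

```javascript
// Rough sketch of the two-stage gating for integrated loudness (BS.1770),
// assuming `blockPowers` already holds the mean square of each K-weighted,
// channel-weighted 400 ms block. Not the code evaluated in the paper.
function blockLoudness(meanSquare) {
  return -0.691 + 10 * Math.log10(meanSquare);       // loudness of one block, LUFS
}

function integratedLoudness(blockPowers) {
  const mean = arr => arr.reduce((sum, x) => sum + x, 0) / arr.length;
  // Stage 1: absolute gate at -70 LUFS.
  const aboveAbsolute = blockPowers.filter(p => blockLoudness(p) > -70);
  // Stage 2: relative gate, 10 LU below the loudness of the surviving blocks.
  const relativeThreshold = blockLoudness(mean(aboveAbsolute)) - 10;
  const gated = aboveAbsolute.filter(p => blockLoudness(p) > relativeThreshold);
  return blockLoudness(mean(gated));
}
```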

Roughly one in six people in developed countries suffer some hearing impairment. If you think that seems too high, think how many wear glasses or contact lenses or had eye surgery. And given the sound exposure, I’d expect the average to be higher with music producers. But we need good data on this. Thus, Laura Sinnott’s 3 pm presentation on Risk of Sound-Induced Hearing Disorders for Audio Post Production Engineers: A Preliminary Study is particularly relevant.

Some interesting posters in the 2:45 to 4:15 session. Maree Sheehan’s Audio Portraiture – The Sound of Identity, an Indigenous Artistic Enquiry uses 3D immersive and binaural sound to create audio portraits of Maori women. It’s a wonderful use of state-of-the-art audio technologies for cultural and artistic study. Researchers from the University of Alcala in Madrid present an improved method to detect anger in speech in Precision Maximization in Anger Detection in Interactive Voice Response Systems.

Friday, October 19th

There are plenty of interesting papers this day, but only one I’m highlighting. By coincidence, it’s my own presentation of work with He Peng, on Why Can You Hear a Difference between Pouring Hot and Cold Water? An Investigation of Temperature Dependence in Psychoacoustics. This was inspired by the curious phenomenon and initial investigations described in a previous blog entry.

Saturday, October 20th

Get there early on Saturday to find out about audio branding from a designer’s perspective in the 9 am Creative Approach to Audio in Corporate Brand Experiences.

Object-based audio allows broadcasters to deliver separate channels for sound effects, music and dialog, which can then be remixed on the client-side. This has high potential for delivering better sound for the hearing-impaired, as described in Lauren Ward’s Accessible Object-Based Audio Using Hierarchical Narrative Importance Metadata at 9:45. I’ve heard this demonstrated by the way, and it sounds amazing.

A big challenge with spatial audio systems is the rendering of sounds that are close to the listener. Descriptions of such systems almost always begin with ‘assume the sound source is in the far field.’ In the 10:30 to 12:00 poster session, researchers from the Chinese Academy of Sciences present a real advance in this subject with Near-Field Compensated Higher-Order Ambisonics Using a Virtual Source Panning Method.

Rob Maher is one of the world’s leading audio forensics experts. At 1:30 in Audio Forensic Gunshot Analysis and Multilateration, he looks at how to answer the question ‘Who shot first?’ from audio recordings. As is often the case in audio forensics, I suspect this paper was motivated by real court cases.

When visual cues disagree with auditory cues, which ones do you believe? Or conversely, does low quality audio seem more realistic if strengthened by visual cues? These sorts of questions are investigated at 2 pm in the large international collaboration Influence of Visual Content on the Perceived Audio Quality in Virtual Reality. Audio Engineering Society Conventions are full of original research, but survey and review papers are certainly welcomed, especially ones like the thorough and insightful HRTF Individualization: A Survey, presented at 2:30.

Standard devices for measuring auditory brainstem response are typically designed to work only with clicks or tone bursts. A team of researchers from Gdansk developed A Device for Measuring Auditory Brainstem Responses to Audio, presented in the 2:30 to 4 pm poster session.

 

Hopefully, I can also give a wrap-up after the Convention, as we did here and here.

Cross-adaptive audio effects: automatic mixing, live performance and everything in between

Our paper on Applications of cross-adaptive audio effects: automatic mixing, live performance and everything in between has just been published in Frontiers in Digital Humanities. It is a systematic review of cross-adaptive audio effects and their applications.

Cross-adaptive effects extend the boundaries of traditional audio effects by having many inputs and outputs, and deriving their behavior based on analysis of the signals and their interaction. This allows the audio effects to adapt to different material, seemingly being aware of what they do and listening to the signals. Here’s a block diagram showing how a cross-adaptive audio effect modifies a signal.

[Figure: cross-adaptive architecture block diagram]
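As a toy illustration of the idea (my own minimal sketch, not a figure or algorithm from the paper): a side-chain style cross-adaptive gain, where the envelope of one signal drives the gain applied to another, as in ducking music under speech.

```javascript
// Toy cross-adaptive effect (illustrative only): duck signal `b` according
// to the envelope of signal `a`, e.g. lowering music while speech is present.
function crossAdaptiveDuck(a, b, { smoothing = 0.999, depth = 0.8 } = {}) {
  const out = new Float32Array(b.length);
  let env = 0;
  for (let i = 0; i < b.length; i++) {
    // One-pole envelope follower on the control signal `a`.
    env = smoothing * env + (1 - smoothing) * Math.abs(a[i]);
    // The gain applied to `b` decreases as `a` gets louder.
    out[i] = b[i] * (1 - depth * Math.min(1, 4 * env));
  }
  return out;
}
```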

Last year, we published a paper reviewing the history of automatic mixing, almost exactly ten years to the day from when automatic mixing was first extended beyond simple gain changes for speech applications. These automatic mixing applications rely on cross-adaptive effects, but the effects can do so much more.

Here’s an example automatic mixing system from our youtube channel, IntelligentSoundEng.

When a musician uses the signals of other performers directly to inform the timbral character of her own instrument, it enables a radical expansion of interaction during music making. Exploring this was the goal of the Cross-adaptive processing for musical intervention project, led by Oeyvind Brandtsegg, which we discussed in an earlier blog entry. Using cross-adaptive audio effects, musicians can exert control over the instruments and performances of other musicians, leading both to new competitive aspects and to new synergies.

Here’s a short video demonstrating this.

Despite various projects, research and applications involving cross-adaptive audio effects, there is still a fair amount of confusion surrounding the topic. There are multiple definitions, sometimes even by the same authors. So this paper gives a brief history of applications as well as a classification of effects types and clarifies issues that have come up in earlier literature. It further defines the field, lays a formal framework, explores technical aspects and applications, and considers the future from artistic, perceptual, scientific and engineering perspectives.

Check it out!