Cross-adaptive audio effects: automatic mixing, live performance and everything in between

Our paper on Applications of cross-adaptive audio effects: automatic mixing, live performance and everything in between has just been published in Frontiers in Digital Humanities. It is a systematic review of cross-adaptive audio effects and their applications.

Cross-adaptive effects extend the boundaries of traditional audio effects by having many inputs and outputs, and deriving their behavior based on analysis of the signals and their interaction. This allows the audio effects to adapt to different material, seemingly being aware of what they do and listening to the signals. Here’s a block diagram showing how a cross-adaptive audio effect modifies a signal.

cross-adaptive architecture

Last year, we published a paper reviewing the history of automatic mixing, almost exactly ten years to the day from when automatic mixing was first extended beyond simple gain changes for speech applications. These automatic mixing applications rely on cross-adaptive effects, but the effects can do so much more.

Here’s an example automatic mixing system from our youtube channel, IntelligentSoundEng.

When a musician uses the signals of other performers directly to inform the timbral character of her own instrument, it enables a radical expansion of interaction during music making. Exploring this was the goal of the Cross-adaptive processing for musical intervention project, led by Oeyvind Brandtsegg, which we discussed in an earlier blog entry. Using cross-adaptive audio effects, musicians can exert control over each the instruments and performance of other musicians, both leading to new competitive aspects and new synergies.

Here’s a short video demonstrating this.

Despite various projects, research and applications involving cross-adaptive audio effects, there is still a fair amount of confusion surrounding the topic. There are multiple definitions, sometimes even by the same authors. So this paper gives a brief history of applications as well as a classification of effects types and clarifies issues that have come up in earlier literature. It further defines the field, lays a formal framework, explores technical aspects and applications, and considers the future from artistic, perceptual, scientific and engineering perspectives.

Check it out!

Advertisements

Weird and wonderful research to be unveiled at the 144th Audio Engineering Society Convention

th

Last year, we previewed the142nd and 143rd AES Conventions, which we followed with a wrap-up discussions here and here. The next AES  convention is just around the corner, May 23 to 26 in Milan. As before, the Audio Engineering research team here aim to be quite active at the convention.

These conventions have thousands of attendees, but aren’t so large that you get lost or overwhelmed. Away from the main exhibition hall is the Technical Program, which includes plenty of tutorials and presentations on cutting edge research.

So we’ve gathered together some information about a lot of the events that caught our eye as being unusual, exceptionally high quality involved in, attending, or just worth mentioning. And this Convention will certainly live up to the hype.

Wednesday May 23rd

From 11:15 to 12:45 that day, there’s an interesting poster by a team of researchers from the University of Limerick titled Can Visual Priming Affect the Perceived Sound Quality of a Voice Signal in Voice over Internet Protocol (VoIP) Applications? This builds on work we discussed in a previous blog entry, where they did a perceptual study of DFA Faders, looking at how people’s perception of mixing changes when the sound engineer only pretends to make an adjustment.

As expected given the location, there’s lots of great work being presented by Italian researchers. The first one that caught my eye is the 2:30-4 poster on Active noise control for snoring reduction. Whether you’re a loud snorer, sleep next to someone who is a loud snorer or just interested in unusual applications of audio signal processing, this one is worth checking out.

Do you get annoyed sometimes when driving and the road surface changes to something really noisy? Surely someone should do a study and find out which roads are noisiest so that then we can put a bit of effort into better road design and better in-vehicle equalisation and noise reduction? Well, now its finally happened with this paper in the same session on Deep Neural Networks for Road Surface Roughness Classification from Acoustic Signals.

Thursday, May 24

If you were to spend only one day this year immersing yourself in frontier audio engineering research, this is the day to do it.

How do people mix music differently in different countries? And do people perceive the mixes differently based on their different cultural backgrounds? These are the sorts of questions our research team here have been asking. Find out more in this 9:30 presentation by Amandine Pras. She led this Case Study of Cultural Influences on Mixing Practices, in collaboration with Brecht De Man (now with Birmingham City University) and myself.

Rod Selfridge has been blazing new trails in sound synthesis and procedural audio. He won the Best Student Paper Award at AES 141st Convention and the Best Paper Award at Sound and Music Computing. He’ll give another great presentation at noon on Physically Derived Synthesis Model of an Edge Tone which was also discussed in a recent blog entry.

I love the title of this next paper, Miniaturized Noise Generation System—A Simulation of a Simulation, which will be presented at 2:30pm by researchers from Intel Technology in Gdansk, Poland. This idea of a meta-simulation is not as uncommon as you might think; we do digital emulation of old analogue synthesizers, and I’ve seen papers on numerical models of Foley rain sound generators.

A highlight for our team here is our 2:45 pm presentation, FXive: A Web Platform for Procedural Sound Synthesis. We’ll be unveiling a disruptive innovation for sound design, FXive.com, aimed at replacing reliance on sound effect libraries. Please come check it out, and get in touch with the presenters or any members of the team to find out more.

Immediately following this is a presentation which asks Can Algorithms Replace a Sound Engineer? This is a question the research team here have also investigated a lot, you could even say it was the main focus of our research for several years. The team behind this presentation are asking it in relation to Auto-EQ. I’m sure it will be interesting, and I hope they reference a few of our papers on the subject.

From 9-10:30, I will chair a Workshop on The State of the Art in Sound Synthesis and Procedural Audio, featuring the world’s experts on the subject. Outside of speech and possibly music, sound synthesis is still in its infancy, but its destined to change the world of sound design in the near future. Find out why.

12:15 — 13:45 is a workshop related to machine learning in audio (a subject that is sometimes called Machine Listening), Deep Learning for Audio Applications. Deep learning can be quite a technical subject, and there’s a lot of hype around it. So a Workshop on the subject is a good way to get a feel for it. See below for another machine listening related workshop on Friday.

The Heyser Lecture, named after Richard Heyser (we discussed some of his work in a previous entry), is a prestigious evening talk given by one of the eminent individuals in the field. This one will be presented by Malcolm Hawksford. , a man who has had major impact on research in audio engineering for decades.

Friday

The 9:30 — 11 poster session features some unusual but very interesting research. A talented team of researchers from Ancona will present A Preliminary Study of Sounds Emitted by Honey Bees in a Beehive.

Intense solar activity in March 2012 caused some amazing solar storms here on Earth. Researchers in Finland recorded them, and some very unusual results will be presented in the same session with the poster titled Analysis of Reports and Crackling Sounds with Associated Magnetic Field Disturbances Recorded during a Geomagnetic Storm on March 7, 2012 in Southern Finland.

You’ve been living in a cave if you haven’t noticed the recent proliferation of smart devices, especially in the audio field. But what makes them tick, is there a common framework and how are they tested? Find out more at 10:45 when researchers from Audio Precision will present The Anatomy, Physiology, and Diagnostics of Smart Audio Devices.

From 3 to 4:30, there’s a Workshop on Artificial Intelligence in Your Audio. It follows on from a highly successful workshop we did on the subject at the last Convention.

Saturday

A couple of weeks ago, John Flynn wrote an excellent blog entry describing his paper on Improving the Frequency Response Magnitude and Phase of Analogue-Matched Digital Filters. His work is a true advance on the state of the art, providing digital filters with closer matches to their analogue counterparts than any previous approaches. The full details will be unveiled in his presentation at 10:30.

If you haven’t seen Mariana Lopez presenting research, you’re missing out. Her enthusiasm for the subject is infectious, and she has a wonderful ability to convey the technical details, their deeper meanings and their importance to any audience. See her one hour tutorial on Hearing the Past: Using Acoustic Measurement Techniques and Computer Models to Study Heritage Sites, starting at 9:15.

The full program can be explored on the Convention Calendar or the Convention website. Come say hi to us if you’re there! Josh Reiss (author of this blog entry), John Flynn, Parham Bahadoran and Adan Benito from the Audio Engineering research team within the Centre for Digital Music, along with two recent graduates Brecht De Man and Rod Selfridge, will all be there.

Exciting research at the upcoming Audio Engineering Society Convention

aes143

About five months ago, we previewed the last European Audio Engineering Society Convention, which we followed with a wrap-up discussion. The next AES  convention is just around the corner, October 18 to 21st in New York. As before, the Audio Engineering research team here aim to be quite active at the convention.

These conventions are quite big, with thousands of attendees, but not so large that you get lost or overwhelmed. Away from the main exhibition hall is the Technical Program, which includes plenty of tutorials and presentations on cutting edge research.

So here, we’ve gathered together some information about a lot of the events that we will be involved in, attending, or we just thought were worth mentioning. And I’ve gotta say, the Technical Program looks amazing.

Wednesday

One of the first events of the Convention is the Diversity Town Hall, which introduces the AES Diversity and Inclusion Committee. I’m a firm supporter of this, and wrote a recent blog entry about female pioneers in audio engineering. The AES aims to be fully inclusive, open and encouraging to all, but that’s not yet fully reflected in its activities and membership. So expect to see some exciting initiatives in this area coming soon.

In the 10:45 to 12:15 poster session, Steve Fenton will present Alternative Weighting Filters for Multi-Track Program Loudness Measurement. We’ve published a couple of papers (Loudness Measurement of Multitrack Audio Content Using Modifications of ITU-R BS.1770, and Partial loudness in multitrack mixing) showing that well-known loudness measures don’t correlate very well with perception when used on individual tracks within a multitrack mix, so it would be interesting to see what Steve and his co-author Hyunkook Lee found out. Perhaps all this research will lead to better loudness models and measures.

At 2 pm, Cleopatra Pike will present a discussion and analysis of Direct and Indirect Listening Test Methods. I’m often sceptical when someone draws strong conclusions from indirect methods like measuring EEGs and reaction times, so I’m curious what this study found and what recommendations they propose.

The 2:15 to 3:45 poster session will feature the work with probably the coolest name, Influence of Audience Noises on the Classical Music Perception on the Example of Anti-cough Candies Unwrapping Noise. And yes, it looks like a rigorous study, using an anechoic chamber to record the sounds of sweets being unwrapped, and the signal analysis is coupled with a survey to identify the most distracting sounds. It reminds me of the DFA faders paper from the last convention.

At 4:30, researchers from Fraunhofer and the Technical University of Ilmenau present Training on the Acoustical Identification of the Listening Position in a Virtual Environment. In a recent paper in the Journal of the AES, we found that training resulted in a huge difference between participant results in a discrimination task, yet listening tests often employ untrained listeners. This suggests that maybe we can hear a lot more than what studies suggest, we just don’t know how to listen and what to listen for.

Thursday

If you were to spend only one day this year immersing yourself in frontier audio engineering research, this is the day to do it.

At 9 am, researchers from Harman will present part 1 of A Statistical Model that Predicts Listeners’ Preference Ratings of In-Ear Headphones. This was a massive study involving 30 headphone models and 71 listeners under carefully controlled conditions. Part 2, on Friday, focuses on development and validation of the model based on the listening tests. I’m looking forward to both, but puzzled as to why they weren’t put back-to-back in the schedule.

At 10 am, researchers from the Tokyo University of the Arts will present Frequency Bands Distribution for Virtual Source Widening in Binaural Synthesis, a technique which seems closely related to work we presented previously on Cross-adaptive Dynamic Spectral Panning.

From 10:45 to 12:15, our own Brecht De Man will be chairing and speaking in a Workshop on ‘New Developments in Listening Test Design.’ He’s quite a leader in this field, and has developed some great software that makes the set up, running and analysis of listening tests much simpler and still rigorous.

In the 11-12:30 poster session, Nick Jillings will present Automatic Masking Reduction in Balance Mixes Using Evolutionary Computing, which deals with a challenging problem in music production, and builds on the large amount of research we’ve done on Automatic Mixing.

At 11:45, researchers from McGill will present work on Simultaneous Audio Capture at Multiple Sample Rates and Formats. This helps address one of the challenges in perceptual evaluation of high resolution audio (and see the open access journal paper on this), ensuring that the same audio is used for different versions of the stimuli, with only variation in formats.

At 1:30, renowned audio researcher John Vanderkooy will present research on how a  loudspeaker can be used as the sensor for a high-performance infrasound microphone. In the same session at 2:30, researchers from Plextek will show how consumer headphones can be augmented to automatically perform hearing assessments. Should we expect a new audiometry product from them soon?

At 2 pm, our own Marco Martinez Ramirez will present Analysis and Prediction of the Audio Feature Space when Mixing Raw Recordings into Individual Stems, which applies machine learning to challenging music production problems. Immediately following this, Stephen Roessner discusses a Tempo Analysis of Billboard #1 Songs from 1955–2015, which builds partly on other work analysing hit songs to observe trends in music and production tastes.

At 3:45, there is a short talk on Evolving the Audio Equalizer. Audio equalization is a topic on which we’ve done quite a lot of research (see our review article, and a blog entry on the history of EQ). I’m not sure where the novelty is in the author’s approach though, since dynamic EQ has been around for a while, and there are plenty of harmonic processing tools.

At 4:15, there’s a presentation on Designing Sound and Creating Soundscapes for Still Images, an interesting and unusual bit of sound design.

Friday

Judging from the abstract, the short Tutorial on the Audibility of Loudspeaker Distortion at Bass Frequencies at 5:30 looks like it will be an excellent and easy to understand review, covering practice and theory, perception and metrics. In 15 minutes, I suppose it can only give a taster of what’s in the paper.

There’s a great session on perception from 1:30 to 4. At 2, perceptual evaluation expert Nick Zacharov gives a Comparison of Hedonic and Quality Rating Scales for Perceptual Evaluation. I think people often have a favorite evaluation method without knowing if its the best one for the test. We briefly looked at pairwise versus multistimuli tests in previous work, but it looks like Nick’s work is far more focused on comparing methodologies.

Immediately after that, researchers from the University of Surrey present Perceptual Evaluation of Source Separation for Remixing Music. Techniques for remixing audio via source separation is a hot topic, with lots of applications whenever the original unmixed sources are unavailable. This work will get to the heart of which approaches sound best.

The last talk in the session, at 3:30 is on The Bandwidth of Human Perception and its Implications for Pro Audio. Judging from the abstract, this is a big picture, almost philosophical discussion about what and how we hear, but with some definitive conclusions and proposals that could be useful for psychoacoustics researchers.

Saturday

Grateful Dead fans will want to check out Bridging Fan Communities and Facilitating Access to Music Archives through Semantic Audio Applications in the 9 to 10:30 poster session, which is all about an application providing wonderful new experiences for interacting with the huge archives of live Grateful Dead performances.

At 11 o’clock, Alessia Milo, a researcher in our team with a background in architecture, will discuss Soundwalk Exploration with a Textile Sonic Map. We discussed her work in a recent blog entry on Aural Fabric.

In the 2 to 3:30 poster session, I really hope there will be a live demonstration accompanying the paper on Acoustic Levitation.

At 3 o’clock, Gopal Mathur will present an Active Acoustic Meta Material Loudspeaker System. Metamaterials are receiving a lot of deserved attention, and such advances in materials are expected to lead to innovative and superior headphones and loudspeakers in the near future.

 

The full program can be explored on the Convention Calendar or the Convention website. Come say hi to us if you’re there! Josh Reiss (author of this blog entry), Brecht De Man, Marco Martinez and Alessia Milo from the Audio Engineering research team within the Centre for Digital Music  will all be there.
 

 

Ten Years of Automatic Mixing

tenyears

Automatic microphone mixers have been around since 1975. These are devices that lower the levels of microphones that are not in use, thus reducing background noise and preventing acoustic feedback. They’re great for things like conference settings, where there may be many microphones but only a few speakers should be heard at any time.

Over the next three decades, various designs appeared, but it didn’t really grow much from Dan Dugan’s original Dan Dugan’s original concept.

Enter Enrique Perez Gonzalez, a PhD student researcher and experienced sound engineer. On September 11th, 2007, exactly ten years ago from the publication of this blog post, he presented a paper “Automatic Mixing: Live Downmixing Stereo Panner.” With this work, he showed that it may be possible to automate not just fader levels in speech applications, but other tasks and for other applications. Over the course of his PhD research, he proposed methods for autonomous operation of many aspects of the music mixing process; stereo positioning, equalisation, time alignment, polarity correction, feedback prevention, selective masking minimization, etc. He also laid out a framework for further automatic mixing systems.

Enrique established a new field of research, and its been growing ever since. People have used machine learning techniques for automatic mixing, applied auditory neuroscience to the problem, and explored where the boundaries lie between the creative and technical aspects of mixing. Commercial products have arisen based on the concept. And yet all this is still only scratching the surface.

I had the privilege to supervise Enrique and have many anecdotes from that time. I remember Enrique and I going to a talk that Dan Dugan gave at an AES convention panel session and one of us asked Dan about automating other aspects of the mix besides mic levels. He had a puzzled look and basically said that he’d never considered it. It was also interesting to see the hostile reactions from some (but certainly not all) practitioners, which brings up lots of interesting questions about disruptive innovations and the threat of automation.

wimp3

Next week, Salford University will host the 3rd Workshop on Intelligent Music Production, which also builds on this early research. There, Brecht De Man will present the paper ‘Ten Years of Automatic Mixing’, describing the evolution of the field, the approaches taken, the gaps in our knowledge and what appears to be the most exciting new research directions. Enrique, who is now CTO of Solid State Logic, will also be a panellist at the Workshop.

Here’s a video of one of the early Automatic Mixing demonstrators.

And here’s a list of all the early Automatic Mixing papers.

  • E. Perez Gonzalez and J. D. Reiss, A real-time semi-autonomous audio panning system for music mixing, EURASIP Journal on Advances in Signal Processing, v2010, Article ID 436895, p. 1-10, 2010.
  • Perez-Gonzalez, E. and Reiss, J. D. (2011) Automatic Mixing, in DAFX: Digital Audio Effects, Second Edition (ed U. Zölzer), John Wiley & Sons, Ltd, Chichester, UK. doi: 10.1002/9781119991298. ch13, p. 523-550.
  • E. Perez Gonzalez and J. D. Reiss, “Automatic equalization of multi-channel audio using cross-adaptive methods”, Proceedings of the 127th AES Convention, New York, October 2009
  • E. Perez Gonzalez, J. D. Reiss “Automatic Gain and Fader Control For Live Mixing”, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, New York, October 18-21, 2009
  • E. Perez Gonzalez, J. D. Reiss “Determination and correction of individual channel time offsets for signals involved in an audio mixture”, 125th AES Convention, San Francisco, USA, October 2008
  • E. Perez Gonzalez, J. D. Reiss “An automatic maximum gain normalization technique with applications to audio mixing.”, 124th AES Convention, Amsterdam, Netherlands, May 2008
  • E. Perez Gonzalez, J. D. Reiss, “Improved control for selective minimization of masking using interchannel dependency effects”, 11th International Conference on Digital Audio Effects (DAFx), September 2008
  • E. Perez Gonzalez, J. D. Reiss, “Automatic Mixing: Live Downmixing Stereo Panner”, 10th International Conference on Digital Audio Effects (DAFx-07), Bordeaux, France, September 10-15, 2007

Cool stuff at the Audio Engineering Society Convention in Berlin

aesberlin17_IDS_headerThe next Audio Engineering Society convention is just around the corner, May 20-23 in Berlin. This is an event where we always have a big presence. After all, this blog is brought to you by the Audio Engineering research team within the Centre for Digital Music, so its a natural fit for a lot of what we do.

These conventions are quite big, with thousands of attendees, but not so big that you get lost or overwhelmed. The attendees fit loosely into five categories: the companies, the professionals and practitioners, students, enthusiasts, and the researchers. That last category is where we fit.

I thought I’d give you an idea of some of the highlights of the Convention. These are some of the events that we will be involved in or just attending, but of course, there’s plenty else going on.

On Saturday May 20th, 9:30-12:30, Dave Ronan from the team here will be presenting a poster on ‘Analysis of the Subgrouping Practices of Professional Mix Engineers.’ Subgrouping is a greatly understudied, but important part of the mixing process. Dave surveyed 10 award winning mix engineers to find out how and why they do subgrouping. He then subjected the results to detailed thematic analysis to uncover best practices and insights into the topic.

2:45-4:15 pm there is a workshop on ‘Perception of Temporal Response and Resolution in Time Domain.’ Last year we published an article in the Journal of the Audio Engineering Society  on ‘A meta-analysis of high resolution audio perceptual evaluation.’ There’s a blog entry about it too. The research showed very strong evidence that people can hear a difference between high resolution audio and standard, CD quality audio. But this brings up the question of why? Many people have suggested that the fine temporal resolution of oversampled audio might be perceived. I expect that this Workshop will shed some light on this as yet unresolved question.

Overlapping that workshop, there are some interesting posters from 3 to 6 pm. ‘Mathematical Model of the Acoustic Signal Generated by the Combustion Engine‘ is about synthesis of engine sounds, specifically for electric motorbikes. We are doing a lot of sound synthesis research here, and so are always on the lookout for new approaches and new models. ‘A Study on Audio Signal Processed by “Instant Mastering” Services‘ investigates the effects applied to ten songs by various online, automatic mastering platforms. One of those platforms, LandR, was a high tech spin-out from our research a few years ago, so we’ll be very interested in what they found.

For those willing to get up bright and early Sunday morning, there’s a 9 am panel on ‘Audio Education—What Does the Future Hold,’ where I will be one of the panellists. It should have some pretty lively discussion.

Then there’s some interesting posters from 9:30 to 12:30. We’ve done a lot of work on new interfaces for audio mixing, so will be quite interested in ‘The Mixing Glove and Leap Motion Controller: Exploratory Research and Development of Gesture Controllers for Audio Mixing.’ And returning to the subject of high resolution audio, there is ‘Discussion on Subjective Characteristics of High Resolution Audio,’ by Mitsunori Mizumachi. Mitsunori was kind enough to give me details about his data and experiments in hi-res audio, which I then used in the meta-analysis paper. He’ll also be looking at what factors affect high resolution audio perception.

From 10:45 to 12:15, our own Brecht De Man will be chairing and speaking in a Workshop on ‘New Developments in Listening Test Design.’ He’s quite a leader in this field, and has developed some great software that makes the set up, running and analysis of listening tests much simpler and still rigorous.

From 1 to 2 pm, there is the meeting of the Technical Committee on High Resolution Audio, of which I am co-chair along with Vicki Melchior. The Technical Committee aims for comprehensive understanding of high resolution audio technology in all its aspects. The meeting is open to all, so for those at the Convention, feel free to stop by.

Sunday evening at 6:30 is the Heyser lecture. This is quite prestigious, a big talk by one of the eminent people in the field. This one is given by Jorg Sennheiser of, well, Sennheiser Electronic.

Monday morning 10:45-12:15, there’s a tutorial on ‘Developing Novel Audio Algorithms and Plugins – Moving Quickly from Ideas to Real-time Prototypes,’ given by Mathworks, the company behind Matlab. They have a great new toolbox for audio plugin development, which should make life a bit simpler for all those students and researchers who know Matlab well and want to demo their work in an audio workstation.

Again in the mixing interface department, we look forward to hearing about ‘Formal Usability Evaluation of Audio Track Widget Graphical Representation for Two-Dimensional Stage Audio Mixing Interface‘ on Tuesday, 11-11:30. The authors gave us a taste of this work at the Workshop on Intelligent Music Production which our group hosted last September.

In the same session – which is all about ‘Recording and Live Sound‘ so very close to home – a new approach to acoustic feedback suppression is discussed in ‘Using a Speech Codec to Suppress Howling in Public Address Systems‘, 12-12:30. With several past projects on gain optimization for live sound, we are curious to hear (or not hear) the results!

The full program can be explored on the AES Convention planner or the Convention website. Come say hi to us if you’re there!