The post ‘Behind the spectacular sound of Dunkirk – with Richard King’ appeared first on A Sound Effect. It’s an interesting interview giving deep insights into sound design and soundscape creation for film. It first caught my attention because of the mention of Richard King. But it’s not the Richard King who is a Grammy award winning professor in sound recording at McGill University. It’s the other one, the Oscar award winning supervising sound editor at Warner Brothers Sound.
We collaborated with Prof. Richard King on a couple of papers. In the earlier paper, we conducted an experiment where eight songs were each mixed by eight different engineers, and we analysed audio features from the multitracks and mixes. This allowed us to test various assumed rules of mixing practice. In the follow-up, the mixes were all rated by experienced test subjects, and we used the ratings to investigate relationships between perceived mix quality and sonic features of the mixes.
B. De Man, M. Boerum, B. Leonard, R. King, G. Massenburg and J. D. Reiss, ‘Perceptual Evaluation of Music Mixing Practices,’ 138th Audio Engineering Society (AES) Convention, May 2015.
B. De Man, B. Leonard, R. King and J. D. Reiss, ‘An Analysis and Evaluation of Audio Features for Multitrack Music Mixtures,’ 15th Int. Society for Music Information Retrieval Conference (ISMIR-14), Taipei, Taiwan, Oct. 2014.
The Audio Engineering team (C4DM) was present at this year’s edition of Sónar+D in Barcelona. Sónar+D is an international conference, integrated into the Sónar festival, that focuses on the interdisciplinary interplay between creativity and technology.
The Sónar Innovation Challenge (SIC), co-organized by the MTG, is described as ‘an online and on-site platform for the creative minds that want to be one step ahead and experiment with the future of technology. It brings together innovative tech companies and creators, collaborating to solve challenges that will lead to disruptive prototypes showcased in Sónar+D.’
In this year’s challenge, Marco Martínez took part in the enhanced DJ assistant challenge by the Music Technology Group at Universitat Pompeu Fabra, which asked participants to create a user-friendly, visually appealing and musically motivated system that DJs can use to remix music collections in exciting new ways.
Thus, after nearly a month of online meetings, the challengers and mentors finally met at Sónar, and during four days of intensive brainstorming, programming and prototyping at more than 30°C, the team came up with ATOMIX:
Visualize, explore and manipulate atoms of sound from multitrack recordings, enhancing the creative possibilities for live artists and DJs.
Starting from multitrack recordings (stems), ATOMIX uses advanced algorithms and cutting-edge technologies in feature extraction, clustering, synthesis and visualisation. It segments a collection of stems into atoms of sound and groups them by timbre similarity. Then, through concatenative synthesis, ATOMIX lets you manipulate and exchange atoms of sound in real time with professional DAW controls, achieving a one-of-a-kind live music exploration.
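To make the ‘atoms of sound’ idea concrete, here is a minimal sketch of the pipeline (this is our own illustration, not the ATOMIX code, whose internals were not published): segment a signal into fixed-size frames, describe each frame by its spectral centroid, and group frames by timbre with a tiny k-means.

```python
import numpy as np

def segment_atoms(x, frame=1024):
    """Split a mono signal into fixed-size 'atoms' (frames)."""
    n = len(x) // frame
    return x[:n * frame].reshape(n, frame)

def atom_features(atoms, sr=44100):
    """One simple timbre feature per atom: the spectral centroid in Hz."""
    spec = np.abs(np.fft.rfft(atoms, axis=1))
    freqs = np.fft.rfftfreq(atoms.shape[1], 1.0 / sr)
    return ((spec * freqs).sum(axis=1) / (spec.sum(axis=1) + 1e-12))[:, None]

def kmeans(feats, k=2, iters=25, seed=0):
    """Minimal k-means, grouping atoms by feature similarity."""
    rng = np.random.default_rng(seed)
    centres = feats[rng.choice(len(feats), k, replace=False)]
    labels = np.zeros(len(feats), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((feats[:, None] - centres) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centres[j] = feats[labels == j].mean(0)
    return labels

# demo: a low sine then a high sine should fall into different timbre groups
sr = 44100
t = np.arange(sr) / sr
x = np.concatenate([np.sin(2 * np.pi * 220 * t), np.sin(2 * np.pi * 3520 * t)])
atoms = segment_atoms(x)
labels = kmeans(atom_features(atoms, sr), k=2)
```

A real system would use richer features (MFCCs, spectral flux) and onset-based segmentation, but the cluster-then-recombine structure is the same.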
The project is still at the prototype stage and we hope to hear news of its development very soon.
The 142nd AES Convention was held last month in the creative heart of Berlin. The four-day program, with more than 2000 attendees, covered several workshops, tutorials, technical tours and special events, all related to the latest trends and developments in audio research. But as much as scale, it’s the attention to detail that makes AES special: there is as much emphasis on the research side of audio as on panels of experts discussing a range of provocative and practical topics.
3D Audio: Recording and Reproduction, Binaural Listening and Audio for VR were the most popular topics among the workshops, tutorials, papers and engineering briefs. However, a significant portion of the program was also devoted to core audio topics such as digital filter design, live audio, loudspeaker design, recording, audio encoding, microphones, and music production techniques, just to name a few.
For this reason, here at the Audio Engineering research team within C4DM, we bring you what we believe were the highlights: the key talks and most relevant topics of the convention.
The future of mastering
What better way to start AES than with a workshop of mastering experts discussing the future of the field? Jonathan Wyner (iZotope) introduced the current challenges the discipline faces: the demographic, economic and target-formatting issues that are constantly evolving and changing due to advances in the music technology industry and among its consumers.
When discussing the future of mastering, the panel was reluctant to embrace a fully automated future, but pointed out that the main challenge for assistive tools is to understand artistic intentions and genre-based decisions without requiring the expert knowledge of the mastering engineer. The panel concluded that research efforts should go towards the development of an intelligent assistant, able to function as a smart preset that provides mastering engineers with a starting point.
Virtual analog modeling of dynamic range compression systems
This paper described a method to digitally model an analogue dynamic range compressor. Based on the analysis of processed and unprocessed audio waveforms, a generic model of dynamic range compression is proposed and its parameters are derived via iterative optimization techniques.
Audio samples were played, and the quality of the audio produced by the digital model was demonstrated. However, it should be noted that the parameters of the digital compressor cannot be changed; extending the model to allow this could be an interesting path for future work, as could the inclusion of other audio effects such as equalizers or delay lines.
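For readers unfamiliar with the structure being modelled, a generic feed-forward compressor (a static gain computer followed by attack/release smoothing) can be sketched as below. This is a common textbook structure, not the paper’s fitted model; the threshold, ratio and time-constant values are illustrative.

```python
import numpy as np

def compress(x, sr=44100, threshold_db=-20.0, ratio=4.0,
             attack_ms=5.0, release_ms=50.0):
    """Generic feed-forward dynamic range compressor (illustrative)."""
    eps = 1e-12
    level_db = 20 * np.log10(np.abs(x) + eps)
    # static gain computer: above threshold, output rises 1/ratio as fast
    over = np.maximum(level_db - threshold_db, 0.0)
    gain_db = -over * (1.0 - 1.0 / ratio)
    # one-pole attack/release smoothing of the gain signal
    a_att = np.exp(-1.0 / (sr * attack_ms / 1000.0))
    a_rel = np.exp(-1.0 / (sr * release_ms / 1000.0))
    smoothed = np.empty_like(gain_db)
    g = 0.0
    for n, target in enumerate(gain_db):
        a = a_att if target < g else a_rel  # gain falling = attack phase
        g = a * g + (1 - a) * target
        smoothed[n] = g
    return x * 10 ** (smoothed / 20.0)

# a loud sine is attenuated; a quiet one below threshold passes unchanged
sr = 44100
t = np.arange(sr // 10) / sr
loud = np.sin(2 * np.pi * 440 * t)          # peaks at 0 dBFS
quiet = 0.01 * np.sin(2 * np.pi * 440 * t)  # peaks at -40 dBFS
y_loud, y_quiet = compress(loud, sr), compress(quiet, sr)
```

The paper’s contribution is in fitting such a model’s parameters to a specific analogue unit from measured input/output waveforms.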
Evaluation of alternative audio mixing interfaces
In the paper ‘Formal Usability Evaluation of Audio Track Widget Graphical Representation for Two-Dimensional Stage Audio Mixing Interface’, an evaluation of different graphical track visualization styles is presented. Multitrack visualizations included text only, different colour conventions for circles containing text or icons related to the type of instrument, circles with opacity mapped to audio features, and a traditional channel strip mixing interface.
Efficiency was tested, and it was concluded that subjects preferred instrument icons as well as the traditional mixing interface. Taking into account the many works and proposals on alternative mixing interfaces (2D and 3D), there is still plenty of scope to explore how to build an intuitive, efficient and simple interface capable of replacing the well-known channel strip.
Perceptually motivated filter design with application to loudspeaker-room equalization
This tutorial was based on the engineering brief ‘Quantization Noise of Warped and Parallel Filters Using Floating Point Arithmetic’, which proposes warped parallel filters that aim to match the frequency resolution of the human ear. Using Matlab, the presenters explored various approaches for achieving this goal, including warped FIR and IIR, Kautz, and fixed-pole parallel filters. This provides a very useful tool for applications such as room EQ, physical modelling synthesis, and perhaps improving existing intelligent music production systems.
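To give a flavour of the idea, here is a sketch of placing pole frequencies uniformly on the Bark scale (using the standard Zwicker–Terhardt approximation), so that a fixed-pole parallel filter concentrates its resolution where the ear does. The brief’s actual warped and Kautz designs are more involved; this only shows the perceptual frequency spacing.

```python
import numpy as np

def hz_to_bark(f):
    """Zwicker–Terhardt approximation of the Bark critical-band scale."""
    return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)

def bark_spaced_frequencies(n_poles, f_lo=50.0, f_hi=18000.0):
    """Pole frequencies uniformly spaced on the Bark scale, so resolution
    is dense at low frequencies and sparse at high ones, like hearing."""
    z = np.linspace(hz_to_bark(f_lo), hz_to_bark(f_hi), n_poles)
    # invert numerically on a dense grid (the formula has no closed inverse)
    grid = np.linspace(f_lo, f_hi, 200000)
    return np.interp(z, hz_to_bark(grid), grid)

freqs = bark_spaced_frequencies(24)
```

With 24 poles, neighbouring pole frequencies are roughly one critical band apart, which is the resolution target the brief describes.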
Source Separation in Action: Demixing the Beatles at the Hollywood Bowl
Abbey Road’s James Clarke presented a great poster with the actual algorithm that was used for the remixed, remastered and expanded version of The Beatles’ album Live at the Hollywood Bowl. The method managed to isolate the crowd noise, making it possible to separate everything that Paul McCartney, John Lennon, Ringo Starr and George Harrison played live in 1964 into clean tracks.
The results speak for themselves (audio comparison). Based on a Non-negative Matrix Factorization (NMF) algorithm, this work provides a great research tool for source separation and the reverse-engineering of mixes.
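The core of NMF is easy to sketch: factor a magnitude spectrogram V into nonnegative spectral templates W and time activations H, here with the classic Lee–Seung multiplicative updates for the Euclidean cost. This toy example is ours for illustration; Clarke’s production system is far more elaborate.

```python
import numpy as np

def nmf(V, k, iters=500, seed=0):
    """Lee-Seung multiplicative updates minimising ||V - W H||^2,
    with V (freq x time) ~= W (freq x k) @ H (k x time), all nonnegative."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, k)) + 1e-3
    H = rng.random((k, T)) + 1e-3
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-12)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-12)
    return W, H

# toy 'spectrogram': two spectral templates active at different times
w1 = np.array([1.0, 0.0, 0.5, 0.0])
w2 = np.array([0.0, 1.0, 0.0, 0.5])
act = np.array([[1, 1, 0, 0, 1],
                [0, 0, 1, 1, 1]], dtype=float)
V = np.outer(w1, act[0]) + np.outer(w2, act[1])
W, H = nmf(V, k=2)

# a 'source' estimate keeps only one component's contribution
V0 = np.outer(W[:, 0], H[0])
```

In a separation system, each component’s magnitude estimate is turned into a mask on the mixture spectrogram and inverted back to audio.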
Other keynotes were also worth mentioning, and the rest of the paper proceedings are available in the AES E-library.
The next Audio Engineering Society convention is just around the corner, May 20-23 in Berlin. This is an event where we always have a big presence. After all, this blog is brought to you by the Audio Engineering research team within the Centre for Digital Music, so it’s a natural fit for a lot of what we do.
These conventions are quite big, with thousands of attendees, but not so big that you get lost or overwhelmed. The attendees fit loosely into five categories: the companies, the professionals and practitioners, students, enthusiasts, and the researchers. That last category is where we fit.
I thought I’d give you an idea of some of the highlights of the Convention. These are some of the events that we will be involved in or just attending, but of course, there’s plenty else going on.
On Saturday May 20th, 9:30-12:30, Dave Ronan from the team here will be presenting a poster on ‘Analysis of the Subgrouping Practices of Professional Mix Engineers.’ Subgrouping is a greatly understudied, but important part of the mixing process. Dave surveyed 10 award winning mix engineers to find out how and why they do subgrouping. He then subjected the results to detailed thematic analysis to uncover best practices and insights into the topic.
From 2:45 to 4:15 pm there is a workshop on ‘Perception of Temporal Response and Resolution in Time Domain.’ Last year we published an article in the Journal of the Audio Engineering Society on ‘A meta-analysis of high resolution audio perceptual evaluation.’ There’s a blog entry about it too. The research showed very strong evidence that people can hear a difference between high resolution audio and standard, CD quality audio. But this brings up the question of why. Many people have suggested that the fine temporal resolution of oversampled audio might be perceived. I expect that this workshop will shed some light on this as yet unresolved question.
Overlapping that workshop, there are some interesting posters from 3 to 6 pm. ‘Mathematical Model of the Acoustic Signal Generated by the Combustion Engine’ is about synthesis of engine sounds, specifically for electric motorbikes. We are doing a lot of sound synthesis research here, and so are always on the lookout for new approaches and new models. ‘A Study on Audio Signal Processed by “Instant Mastering” Services’ investigates the effects applied to ten songs by various online, automatic mastering platforms. One of those platforms, LandR, was a high tech spin-out from our research a few years ago, so we’ll be very interested in what they found.
For those willing to get up bright and early Sunday morning, there’s a 9 am panel on ‘Audio Education—What Does the Future Hold,’ where I will be one of the panellists. It should have some pretty lively discussion.
Then there are some interesting posters from 9:30 to 12:30. We’ve done a lot of work on new interfaces for audio mixing, so will be quite interested in ‘The Mixing Glove and Leap Motion Controller: Exploratory Research and Development of Gesture Controllers for Audio Mixing.’ And returning to the subject of high resolution audio, there is ‘Discussion on Subjective Characteristics of High Resolution Audio,’ by Mitsunori Mizumachi. Mitsunori was kind enough to give me details about his data and experiments in hi-res audio, which I then used in the meta-analysis paper. He’ll also be looking at what factors affect high resolution audio perception.
From 10:45 to 12:15, our own Brecht De Man will be chairing and speaking in a Workshop on ‘New Developments in Listening Test Design.’ He’s quite a leader in this field, and has developed some great software that makes the set up, running and analysis of listening tests much simpler and still rigorous.
From 1 to 2 pm, there is the meeting of the Technical Committee on High Resolution Audio, of which I am co-chair along with Vicki Melchior. The Technical Committee aims for comprehensive understanding of high resolution audio technology in all its aspects. The meeting is open to all, so for those at the Convention, feel free to stop by.
Sunday evening at 6:30 is the Heyser lecture. This is quite prestigious, a big talk by one of the eminent people in the field. This one is given by Jörg Sennheiser of, well, Sennheiser Electronic.
Monday morning 10:45-12:15, there’s a tutorial on ‘Developing Novel Audio Algorithms and Plugins – Moving Quickly from Ideas to Real-time Prototypes,’ given by Mathworks, the company behind Matlab. They have a great new toolbox for audio plugin development, which should make life a bit simpler for all those students and researchers who know Matlab well and want to demo their work in an audio workstation.
Again in the mixing interface department, we look forward to hearing about ‘Formal Usability Evaluation of Audio Track Widget Graphical Representation for Two-Dimensional Stage Audio Mixing Interface’ on Tuesday, 11-11:30. The authors gave us a taste of this work at the Workshop on Intelligent Music Production which our group hosted last September.
In the same session – which is all about ‘Recording and Live Sound‘ so very close to home – a new approach to acoustic feedback suppression is discussed in ‘Using a Speech Codec to Suppress Howling in Public Address Systems‘, 12-12:30. With several past projects on gain optimization for live sound, we are curious to hear (or not hear) the results!
We’re collaborating on a really interesting project called ‘Cross-adaptive processing as musical intervention,’ led by Professor Øyvind Brandtsegg of the Norwegian University of Science and Technology. Essentially, this project involves cross-adaptive audio effects, where the processing applied to one audio signal is dependent on analysis of other signals. We’ve used this concept quite a lot to build intelligent music production systems. But in this project, Øyvind and his collaborators are exploring creative uses of cross-adaptive audio effects in live performance. The effects applied to one source may change depending on what and how another performer plays, so a performer may change what they play to overtly influence everyone else’s sound, thus taking the interplay in a jam session to a whole new level.
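A minimal cross-adaptive effect can be sketched as side-chain-style ducking: the level of one performer’s signal controls the gain applied to another’s. This is our own illustrative example, not code from the project.

```python
import numpy as np

def rms_envelope(x, frame=512):
    """Frame-wise RMS level of a signal, used as an analysis control."""
    n = len(x) // frame
    return np.sqrt((x[:n * frame].reshape(n, frame) ** 2).mean(axis=1))

def cross_adaptive_duck(a, b, frame=512, depth=0.8):
    """Cross-adaptive gain: the louder signal b plays, the more signal a
    is attenuated, so one performer overtly shapes another's sound."""
    env = rms_envelope(b, frame)
    env = env / (env.max() + 1e-12)   # normalise the control signal
    gain = 1.0 - depth * env          # loud b -> low gain on a
    gain_per_sample = np.repeat(gain, frame)
    n = len(gain_per_sample)
    return a[:n] * gain_per_sample

# while b is silent, a passes through; while b plays, a is ducked
sr = 44100
t = np.arange(sr) / sr
a = np.sin(2 * np.pi * 330 * t)
b = np.concatenate([np.zeros(sr // 2), np.sin(2 * np.pi * 110 * t[: sr // 2])])
y = cross_adaptive_duck(a, b)
```

In the project’s performance setting, the analysis would be richer (spectral features, transient density) and could control any effect parameter, not just gain.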
One of the neat things that they’ve done to get this project off the ground is to create a blog, http://crossadaptive.hf.ntnu.no/ , which is a great way to get all the reports and reflections out there quickly and widely.
This got me thinking of a few other blogs that we should mention. First and foremost is Prof. Trevor Cox of the University of Salford’s wonderful blog, ‘The Sound Blog: Dispatches from Acoustic and Audio Engineering,’ available at https://acousticengineering.wordpress.com/ . This blog was one of the principal inspirations for our own blog here.
Another leading researcher’s interesting blog is https://marianajlopez.wordpress.com/ – Mariana is looking into aspects of sound design that I feel really don’t get enough attention from the academic community… yet. Hopefully, that will change soon.
A lot of the researchers in the Audio Engineering team have their own personal blogs, which discuss their research, their projects and various other things related to their career or just cool technologies.
http://brechtdeman.com/blog.html – Brecht De Man’s blog. He’s researching semantic and knowledge engineering approaches to music production systems (and a lot more).
https://auralcharacter.wordpress.com/ – Alessia Milo’s blog. She’s looking at (and listening to) soundscapes, and their importance in architecture.
http://davemoffat.com/wp/ – Dave Moffat is investigating evaluation of sound synthesis techniques, and how machine learning can be applied to synthesize a wide variety of sound effects.
https://rodselfridge.wordpress.com/ – Rod Selfridge is looking at real-time physical modelling techniques for procedural audio and sound synthesis.
More to come on all of them, I’m sure.
Let us know of any other blogs that we should mention, and we’ll update this entry or add new entries.
From 1976 through 1989, Dr. Andy Hildebrand worked for the oil industry, interpreting seismic data. By sending sound waves into the ground, he could detect the reflections and map potential drill sites. Dr. Hildebrand studied music composition at Rice University, and then developed audio processing tools based on his knowledge of seismic data analysis. He was a leading developer of a variety of plug-ins, including MDT (Multiband Dynamics Tool), JVP (Jupiter Voice Processor) and SST (Spectral Shaping Tool). At a dinner party, a guest challenged him to invent a tool that would help her sing in tune. Based on the phase vocoder, Hildebrand’s Antares Audio Technologies released Auto-Tune in late 1996.
Auto-Tune was intended to correct or disguise off-key vocals. It moves the pitch of a note to the nearest true semitone (the nearest musical interval in traditional, equal-temperament Western tonal music), thus allowing the vocal parts to be tuned. The original Auto-Tune had a speed parameter, which could be set between 0 and 400 milliseconds and determined how quickly the note moved to the target pitch. Engineers soon realised that by setting this ‘attack time’ very short, Auto-Tune could be used as an effect to distort vocals, making it sound as if the voice leaps from note to note in discrete steps. This gives the voice an artificial, synthesiser-like sound that can be appealing or irritating depending on taste. This unusual effect was the trademark sound of Cher’s 1998 hit song, ‘Believe.’
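The retuning step at the heart of this is simple to sketch: convert a detected frequency to a continuous MIDI note number, round to the nearest semitone, and convert back to a frequency. (Real pitch correction also needs pitch detection and artefact-free resynthesis; the speed parameter governs how gradually the pitch glides to this target, whereas the sketch below snaps instantly, like the ‘Believe’ setting.)

```python
import math

def snap_to_semitone(f_hz, a4=440.0):
    """Move a detected pitch to the nearest equal-temperament semitone,
    the basic retuning calculation behind pitch correction."""
    midi = 69 + 12 * math.log2(f_hz / a4)       # continuous MIDI note number
    return a4 * 2 ** ((round(midi) - 69) / 12)  # back to Hz at nearest note

# a slightly flat A4 (435 Hz) is pulled up to 440 Hz
corrected = snap_to_semitone(435.0)
```

With a nonzero attack time, the output pitch would instead move toward this target exponentially over the configured number of milliseconds, which is what preserves natural vibrato and portamento.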
Like many audio effects, engineers and performers found a creative use, quite different from the intended use. As Hildebrand said, “I never figured anyone in their right mind would want to do that.” Yet Auto-Tune and competing pitch correction technologies are now widely applied (in amateur and professional recordings, and across many genres) for both intended and unusual, artistic uses.
Dynamic range compression is used at almost every stage of the audio production chain. It is applied to minimise artifacts in recording (like variation in loudness as a vocalist moves towards or away from a microphone), to reduce masking and to bring different tracks into a comparable loudness range. Compression is also applied in mastering to make the recording sound ‘loud,’ since a loud recording will be more noticeable than a quiet one, and the listener will hear more of the full frequency range. This has resulted in a trend to more and more compression being applied, a ‘loudness war.’
Broadcasting also has its loudness wars. Dynamic range compression is applied in broadcasting to prevent drastic level changes from song to song, and to ensure compliance with standards regarding maximum broadcast levels. But competition for listeners between radio stations has resulted in a trend to very large amounts of compression being applied.
So a lot of recordings have been compressed to the point where dynamics are compromised, transients are squashed, clipping occurs and there can be significant distortion throughout. The end result is that many people think a lot of modern recordings sound terrible compared to what they could have been. And broadcast compression only adds to the problem.
Who is to blame? There is a belief among many that ‘loud sells records.’ This may not be true, but believing it encourages people to participate in the loudness war. And each individual may think that what they are doing is appropriate. Collectively, the musician who wants a loud recording, the record producer who wants a wall of sound, the engineers dealing with artifacts, the mastering engineers who prepare content for broadcast and the broadcasters themselves are all acting as soldiers in the loudness war.
The tide is turning
The loudness war may have reached its peak shortly after the start of the new millennium. Audiologists became concerned that the prolonged loudness of new albums might cause hearing damage. Musicians began highlighting the sound quality issue, and in 2006, Bob Dylan said, “… these modern records, they’re atrocious, they have sound all over them. There’s no definition of nothing, no vocal, no nothing, just like static. Even these songs probably sounded ten times better in the studio.” Also in 2006, a vice-president at a Sony Music subsidiary wrote an open letter decrying the loudness war, claiming that mastering engineers were being forced to make releases louder in order to get the attention of industry heads.
In 2008, Metallica released an album with tremendous compression, and hence clipping and lots of distortion. But a version without the overuse of compression was included in downloadable content for the game Guitar Hero III, and listeners all over noticed and complained about the difference. Also in 2008, Guns N’ Roses producers (including the band’s frontman Axl Rose) chose the version with minimal compression when offered three alternative mastered versions.
Recently, an annual Dynamic Range Day has been organised to raise awareness of the issue, and the nonprofit organization Turn Me Up! was created to promote recordings with more dynamic range.
The European Broadcasting Union addressed the broadcast loudness wars with EBU Recommendation R 128 and related documents that specify how loudness and loudness range can be measured in broadcast content, as well as recommending appropriate ranges for both.
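The measurement that R 128 builds on (ITU-R BS.1770) boils down to a mean-square level with a fixed offset, reported in LUFS. The sketch below shows just that core formula for a single channel; note that it omits the K-weighting pre-filter and the gating stages that the standard mandates, so it is only indicative of how programme loudness is computed.

```python
import numpy as np

def simple_loudness_lufs(x):
    """Loudness of one channel via the ITU-R BS.1770 core formula:
    L = -0.691 + 10*log10(mean square).
    SIMPLIFIED: the K-weighting filter and gating required by the
    standard (and relied on by EBU R 128) are omitted here."""
    return -0.691 + 10 * np.log10(np.mean(x ** 2) + 1e-12)

# a 0 dBFS sine has mean square 0.5, i.e. about -3.70 LUFS unweighted
t = np.arange(48000) / 48000.0
tone = np.sin(2 * np.pi * 997 * t)
loudness = simple_loudness_lufs(tone)
```

R 128 then recommends normalising programmes to a target of -23 LUFS integrated loudness, which removes the incentive to hyper-compress for broadcast.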
Together, all these developments may go a long way to establishing a truce in the loudness war.