Venturous Views on Virtual Vienna – a preview of AES 148

#VirtualVienna

We try to write a preview of the technical track for almost all recent Audio Engineering Society (AES) Conventions, see our entries on the 142nd, 143rd, 144th, 145th and 147th Conventions. But this 148th Convention is very different.

It is, of course, an online event. The Convention planning committee have put huge effort into putting it all online and making it a really engaging and exciting experience (and in massively reducing costs). There will be a mix of live-streams, break out sessions, interactive chat rooms and so on. But the technical papers will mostly be on-demand viewing, with Q&A and online dialog with the authors. This is great in the sense that you can view it and interact with authors any time, but it means that it’s easy to overlook really interesting work.

So we’ve gathered together some information about a lot of the presented research that caught our eye as being unusual, exceptionally high quality, or just worth mentioning. And every paper mentioned here will appear soon in the AES E-Library, by the way. Currently though, you can browse all the abstracts by searching the full papers and engineering briefs on the Convention website.

Deep learning and neural networks are all the rage in machine learning nowadays. A few contributions to the field will be presented by Eugenio Donati with ‘Prediction of hearing loss through application of Deep Neural Network’, Simon Plain with ‘Pruning of an Audio Enhancing Deep Generative Neural Network’, Giovanni Pepe’s presentation of ‘Generative Adversarial Networks for Audio Equalization: an evaluation study’, Yiwen Wang presenting ‘Direction of arrival estimation based on transfer function learning using autoencoder network’, and the author of this post, Josh Reiss, will present work done mainly by sound designer/researcher Guillermo Peters, ‘A deep learning approach to sound classification for film audio post-production’. Related to this, check out the Workshop on ‘Deep Learning for Audio Applications – Engineering Best Practices for Data’, run by Gabriele Bunkheila of MathWorks (Matlab), which will be live-streamed on Friday.

There’s enough work being presented on spatial audio that there could be a whole conference on the subject within the convention. A lot of that is in Keynotes, Workshops, Tutorials, and the Heyser Memorial Lecture by Francis Rumsey. But a few papers in the area really stood out for me. Toru Kamekawa investigated a big question with ‘Are full-range loudspeakers necessary for the top layer of 3D audio?’ Marcel Nophut’s ‘Multichannel Acoustic Echo Cancellation for Ambisonics-based Immersive Distributed Performances’ has me intrigued because I know a bit about echo cancellation and a bit about ambisonics, but have no idea how to do the former for the latter.

And I’m intrigued by ‘Creating virtual height loudspeakers using VHAP’, presented by Kacper Borzym. I’ve never heard of VHAP, but the original VBAP paper is the most highly cited paper in the Journal of the AES (1367 citations at the time of writing this).

How good are you at understanding speech from native speakers? How about when there’s a lot of noise in the background? Do you think you’re as good as a computer? Gain some insight into related research when viewing the presentation by Eugenio Donati on ‘Comparing speech identification under degraded acoustic conditions between native and non-native English speakers’.

There are a few papers exploring creative works, all of which look interesting and have great titles. David Poirier-Quinot will present ‘Emily’s World: behind the scenes of a binaural synthesis production’. Music technology has a fascinating history. Michael J. Murphy will explore the beginning of a revolution with ‘Reimagining Robb: The Sound of the World’s First Sample-based Electronic Musical Instrument circa 1927’. And if you’re into Scandinavian instrumental rock music (and who isn’t?), Zachary Bresler’s presentation of ‘Music and Space: A case of live immersive music performance with the Norwegian post-rock band Spurv’ is a must.

Frank Morse Robb, inventor of the first sample-based electronic musical instrument.

But sound creation comes first, and new technologies are emerging to do it. Damian T. Dziwis will present ‘Body-controlled sound field manipulation as a performance practice’. And particularly relevant given the worldwide isolation going on is ‘Quality of Musicians’ Experience in Network Music Performance: A Subjective Evaluation,’ presented by Konstantinos Tsioutas.

Portraiture looks at how to represent or capture the essence and rich details of a person. Maree Sheehan explores how this is achieved sonically, focusing on Maori women, in an intriguing presentation on ‘Audio portraiture sound design- the development and creation of audio portraiture within immersive and binaural audio environments.’

We talked about exciting research on metamaterials for headphones and loudspeakers when giving previews of previous AES Conventions, and there’s another development in this area presented by Sebastien Degraeve in ‘Metamaterial Absorber for Loudspeaker Enclosures’.

Paul Ferguson and colleagues look set to break some speed records, but any such feats require careful testing first, as in ‘Trans-Europe Express Audio: testing 1000 mile low-latency uncompressed audio between Edinburgh and Berlin using GPS-derived word clock’.

Our own research has focused a lot on intelligent music production, and especially automatic mixing. A novel contribution to the field, and a fresh perspective, is given in Nyssim Lefford’s presentation of ‘Mixing with Intelligent Mixing Systems: Evolving Practices and Lessons from Computer Assisted Design’.

Subjective evaluation, usually in the form of listening tests, is the primary form of testing audio engineering theory and technology. As Feynman said, ‘if it disagrees with experiment, it’s wrong!’

And thus, there are quite a few top-notch research presentations focused on experiments with listeners. Minh Voong looks at an interesting aspect of bone conduction with ‘Influence of individual HRTF preference on localization accuracy – a comparison between regular and bone conducting headphones’. Realistic reverb in games is incredibly challenging because characters are always moving, so Zoran Cvetkovic tackles this with ‘Perceptual Evaluation of Artificial Reverberation Methods for Computer Games.’ The abstract for Lawrence Pardoe’s ‘Investigating user interface preferences for controlling background-foreground balance on connected TVs’ suggests that there’s more than one answer to that preference question. That highlights the need for looking deep into any data, and not just considering the mean and standard deviation, since relying on pooled summary statistics alone can fall foul of Simpson’s Paradox. And finally, Peter Critchell will present ‘A new approach to predicting listener’s preference based on acoustical parameters,’ which addresses the need to accurately simulate and understand listening test results.
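To make the Simpson’s Paradox point concrete, here is a small sketch. The numbers are invented purely for illustration (they are not from any of the papers mentioned), and the two listener groups and interfaces ‘A’ and ‘B’ are hypothetical. Interface A scores better within each group, yet looks worse once the groups are pooled.

```python
# Hypothetical counts (invented for illustration):
# (listeners satisfied, listeners asked) per listener group and interface.
results = {
    "group 1": {"A": (81, 87), "B": (234, 270)},
    "group 2": {"A": (192, 263), "B": (55, 80)},
}

def rate(satisfied, asked):
    """Fraction of listeners who were satisfied."""
    return satisfied / asked

# Within every group, interface A has the higher satisfaction rate...
a_wins_each_group = all(
    rate(*group["A"]) > rate(*group["B"]) for group in results.values()
)

def pooled(option):
    """Satisfaction rate for one interface with all groups lumped together."""
    satisfied = sum(group[option][0] for group in results.values())
    asked = sum(group[option][1] for group in results.values())
    return satisfied / asked

# ...but pooling the groups reverses the ranking, because the groups
# were asked about each interface in very different proportions.
pooled_a, pooled_b = pooled("A"), pooled("B")

print(a_wins_each_group, round(pooled_a, 3), round(pooled_b, 3))
```

The reversal comes entirely from the unequal group sizes, which is why a single overall mean can hide what is going on.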

There are some talks about really rigorous signal processing approaches. Jens Ahren will present ‘Tutorial on Scaling of the Discrete Fourier Transform and the Implied Physical Units of the Spectra of Time-Discrete Signals.’ I’m excited about this because it may shed some light on a possible explanation for why we hear a difference between CD quality and very high sample rate audio formats.
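As a rough illustration of why DFT scaling matters (this is my own toy sketch, not taken from the tutorial), the snippet below computes an unscaled DFT of a cosine. The signal length, bin index and amplitude are arbitrary assumptions. The raw peak value depends on the number of samples, and only after scaling by 2/N does it match the tone’s amplitude.

```python
import cmath
import math

def dft(x):
    """Plain O(N^2) discrete Fourier transform, unscaled (no 1/N factor)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

N = 64           # number of samples (assumed)
k0 = 5           # the tone sits exactly on bin 5, so there is no leakage (assumed)
amplitude = 0.5  # peak amplitude of the test tone (assumed)

x = [amplitude * math.cos(2 * math.pi * k0 * t / N) for t in range(N)]
X = dft(x)

raw_peak = abs(X[k0])           # equals amplitude * N / 2, so it grows with N
scaled_peak = 2 * raw_peak / N  # single-sided amplitude scaling recovers 0.5

print(round(raw_peak, 6), round(scaled_peak, 6))
```

Other conventions (1/N, 1/sqrt(N), or scaling by the sample period) are just as valid, but each implies a different physical meaning for the numbers in the spectrum.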

The Constant-Q Transform represents a signal in the frequency domain, but with logarithmically spaced bins, so it is potentially very useful for audio. The last decade has seen a couple of breakthroughs that may make it far more practical. I was sitting next to Gino Velasco when he won the “best student paper” award for Velasco et al.’s “Constructing an invertible constant-Q transform with nonstationary Gabor frames.” Schörkhuber and Klapuri also made excellent contributions, mainly around implementing a fast version of the transform, culminating in a JAES paper, and the teams collaborated on a popular Matlab toolbox. Now there’s another advance with Felix Holzmüller presenting ‘Computational efficient real-time capable constant-Q spectrum analyzer’.
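For readers who haven’t met the Constant-Q Transform, the ‘logarithmically spaced bins’ idea can be sketched in a few lines. This only computes the bin centre frequencies and the Q factor, not the transform itself, and the function name and the choice of 55 Hz with 12 bins per octave are my own assumptions for illustration.

```python
import math

def constant_q_bins(f_min, bins_per_octave, n_bins):
    """Return (centre_frequencies, Q) for a constant-Q analysis grid."""
    # Each bin is a fixed ratio above the previous one: f_k = f_min * 2^(k / B)
    freqs = [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]
    # Taking the bandwidth of bin k as f_(k+1) - f_k, the ratio
    # Q = f_k / bandwidth is the same for every bin.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    return freqs, q

# Semitone spacing starting from A1 (55 Hz), covering four octaves.
freqs, q = constant_q_bins(f_min=55.0, bins_per_octave=12, n_bins=49)

print(freqs[0], freqs[12], freqs[48], round(q, 2))
```

With 12 bins per octave the bins line up with musical semitones and Q comes out at roughly 16.8, whereas an ordinary DFT has linearly spaced bins and so gives poor resolution at low pitches and wasteful resolution at high ones.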

The abstract for Dan Turner’s ‘Content matching for sound generating objects within a visual scene using a computer vision approach’ suggests that it has implications for selection of sound effect samples in immersive sound design. But I’m a big fan of procedural audio, and think this could have even higher potential for sound synthesis and generative audio systems.

And finally, there are some really interesting talks about innovative ways to conduct audio research based on practical challenges. Nils Meyer-Kahlen presents ‘DIY Modifications for Acoustically Transparent Headphones’. The abstract for Valerian Drack’s ‘A personal, 3D printable compact spherical loudspeaker array’ also mentions its use in a DIY approach. Joan La Roda’s own experience of festival shows led to his presentation of ‘Barrier Effect at Open-air Concerts, Part 1’. Another presentation with deep insights derived from personal experience is Fabio Kaiser’s ‘Working with room acoustics as a sound engineer using active acoustics.’ And the lecturers amongst us will be very interested in Sebastian Duran’s ‘Impact of room acoustics on perceived vocal fatigue of staff-members in Higher-education environments: a pilot study.’

Remember to check the AES E-Library, which will soon have the full papers for all the presentations mentioned here, with all authors listed and not just the presenters. And feel free to get in touch with us. Josh Reiss (author of this blog entry), J. T. Colonel, and Angeliki Mourgela from the Audio Engineering research team within the Centre for Digital Music will all be (virtually) there.

Radical and rigorous research at the upcoming Audio Engineering Society Convention

We previewed the 142nd, 143rd, 144th and 145th Audio Engineering Society (AES) Conventions, which we also followed with wrap-up discussions. Then we took a break, but now we’re back to preview the 147th AES Convention, October 16 to 19 in New York. As before, the Audio Engineering research team here aim to be quite active at the convention.

We’ve gathered together some information about a lot of the research-oriented events that caught our eye as being unusual, exceptionally high quality, ones we’re involved in or attending, or just worth mentioning. And this Convention will certainly live up to the hype.

Wednesday October 16th

When I first read the title of the paper ‘Evaluation of Multichannel Audio in Automobiles versus Mobile Phones’, presented at 10:30, I thought it was a comparison of multichannel automotive audio versus the tinny, quiet mono or barely stereo from a phone. But it’s actually comparing results of a listening test for stereo vs multichannel in a car, with results of a listening test for stereo vs multichannel for the same audio, but from a phone and rendered over headphones. And the results look quite interesting.

Deep neural networks are all the rage. We’ve been using DNNs to profile a wide variety of audio effects. Scott Hawley will be presenting some impressive related work at 9:30, ‘Profiling Audio Compressors with Deep Neural Networks.’

We previously presented work on digital filters that closely match their analog equivalents. We pointed out that such filters can have cut-off frequencies beyond Nyquist, but did not explore that aspect. ‘Digital Parametric Filters Beyond Nyquist Frequency‘, at 10 am, investigates this idea in depth.

I like a bit of high quality mathematical theory, and that’s what you get in Tamara Smyth’s 11:30 paper ‘On the Similarity between Feedback/Loopback Amplitude and Frequency Modulation‘, which shows a rather surprising (at least at first glance) equivalence between two types of feedback modulation.

There’s an interesting paper at 2pm, ‘What’s Old Is New Again: Using a Physical Scale Model Echo Chamber as a Real-Time Reverberator‘, where reverb is simulated not with impulse response recordings, or classic algorithms, but using scaled models of echo chambers.

At 4 o’clock, ‘A Comparison of Test Methodologies to Personalize Headphone Sound Quality‘ promises to offer great insights not just for headphones, but into subjective evaluation of audio in general.

There are so many deep learning papers, but the 3-4:30 poster ‘Modal Representations for Audio Deep Learning’ stands out from the pack. Deep learning for audio most often works with raw spectrogram data. But this work proposes learning modal filterbank coefficients directly, and they find it gives strong results for classification and generative tasks. Also in that session, ‘Analysis of the Sound Emitted by Honey Bees in a Beehive’ promises to be an interesting and unusual piece of work. We talked about their preliminary results in a previous entry, but now they’ve used some rigorous audio analysis to make deep and meaningful conclusions about bee behaviour.

Immerse yourself in the world of virtual and augmented reality audio technology today, with some amazing workshops, like Music Production in VR and AR, Interactive AR Audio Using Spark, Music Production in Immersive Formats, ISSP: Immersive Sound System Panning, and Real-time Mixing and Monitoring Best Practices for Virtual, Mixed, and Augmented Reality. See the Calendar for full details.

Thursday, October 17th

‘An Automated Approach to the Application of Reverberation’, at 9:30, is the first of several papers from our team, and essentially does something to algorithmic reverb similar to what “Parameter Automation in a Dynamic Range Compressor” did for a dynamic range compressor.

Why do public address (PA) systems for large venues sound so terrible? They actually have regulations for speech intelligibility. But this is only measured in empty stadiums. At 11 am, ‘The Effects of Spectators on the Speech Intelligibility Performance of Sound Systems in Stadia and Other Large Venues’ looks at the real world challenges when the venue is occupied.

There are two highlights of the 9-10:30 poster session. ‘Analyzing Loudness Aspects of 4.2 Million Musical Albums in Search of an Optimal Loudness Target for Music Streaming’ is interesting, not just for the results, applications and research questions, but also for the fact that it involved 4.2 million albums. Wow! And there’s a lot more to audio engineering research than one might think. How about using acoustic sensors to enhance autonomous driving systems? That is a core application of ‘Audio Data Augmentation for Road Objects Classification’.

Audio forensics is a fascinating world, where audio engineering is often applied in unusual but crucial situations. One such situation is explored at 2:15 in ‘Forensic Comparison of Simultaneous Recordings of Gunshots at a Crime Scene’, which involves looking at several high profile, real world examples.

Friday, October 18th

There are two papers looking at new interfaces for virtual reality and immersive audio mixing, ‘Physical Controllers vs. Hand-and-Gesture Tracking: Control Scheme Evaluation for VR Audio Mixing‘ at 10:30, and ‘Exploratory Research into the Suitability of Various 3D Input Devices for an Immersive Mixing Task‘ at 3:15.

At 9:15, J. T. Colonel from our group looks into the features that relate, or don’t relate, to preference for multitrack mixes in ‘Exploring Preference for Multitrack Mixes Using Statistical Analysis of MIR and Textual Features‘, with some interesting results that invalidate some previous research. But don’t let negative results discourage ambitious approaches to intelligent mixing systems, like Dave Moffat’s (also from here) ‘Machine Learning Multitrack Gain Mixing of Drums‘, which follows at 9:30.

Continuing this theme of mixing analysis and automation is the poster ‘A Case Study of Cultural Influences on Mixing Preference—Targeting Japanese Acoustic Major Students‘, shown from 3:30-5, which does a bit of meta-analysis by merging their data with that of other studies.

Just below, I mention the need for multitrack audio data sets. Closely related, and also much needed, is this work on ‘A Dataset of High-Quality Object-Based Productions‘, also in the 3:30-5 poster session.

Saturday, October 19th

We’re approaching a world where almost every surface can be a visual display. Imagine if every surface could be a loudspeaker too. Such is the potential of metamaterials, discussed in ‘Acoustic Metamaterial in Loudspeaker Systems Design‘ at 10:45.

Another session, 9 to 11:30, has lots of interesting presentations about music production best practices. At 9, Amandine Pras presents ‘Production Processes of Pop Music Arrangers in Bamako, Mali’. I doubt there will be many people at the convention who’ve thought about how production is done there, but I’m sure there will be lots of fascinating insights. This is followed at 9:30 by ‘Towards a Pedagogy of Multitrack Audio Resources for Sound Recording Education’. We’ve published a few papers on multitrack audio collections, sorely needed for researchers and educators, so it’s good to see more advances.

I always appreciate filling the gaps in my knowledge. And though I know a lot about sound enhancement, I’ve never dived into how it’s done and how effective it is in soundbars, now widely used in home entertainment. So I’m looking forward to the poster ‘A Qualitative Investigation of Soundbar Theory’, shown 10:30-12. From the title and abstract though, this feels like it might work better as an oral presentation. Also in that session, the poster ‘Sound Design and Reproduction Techniques for Co-Located Narrative VR Experiences’ deserves special mention, since it won the Convention’s Best Peer-Reviewed Paper Award, and promises to be an important contribution to the growing field of immersive audio.

It’s wonderful to see research make it into ‘product’, and ‘Casualty Accessible and Enhanced (A&E) Audio: Trialling Object-Based Accessible TV Audio’, presented at 3:45, is a great example. Here, new technology to enhance broadcast audio for those with hearing loss was trialled for a popular BBC drama, Casualty. This is of extra interest to me since one of the researchers here, Angeliki Mourgela, does related research, also in collaboration with BBC. And one of my neighbours is an actress who appears on that TV show.

I encourage the project students working with me to aim for publishable research. Jorge Zuniga’s ‘Realistic Procedural Sound Synthesis of Bird Song Using Particle Swarm Optimization’, presented at 2:30, is a stellar example. He created a machine learning system that uses bird sound recordings to find settings for a procedural audio model. It’s a great improvement over other methods, and opens up a whole field of machine learning applied to sound synthesis.
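For anyone unfamiliar with particle swarm optimization, here is a minimal sketch of the general idea. It is not the system from the paper: the two ‘synth parameters’, the target values and the cost function are all hypothetical stand-ins, where a real system would compare features of the synthesised sound against features of a recording.

```python
import random

def pso(cost, bounds, n_particles=20, n_iters=200, seed=0):
    """Minimal particle swarm optimiser; returns (best_position, best_cost)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in pos]            # best position seen by each particle
    p_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: p_cost[i])
    g_best, g_cost = p_best[g][:], p_cost[g]  # best position seen by the swarm
    w, c1, c2 = 0.7, 1.5, 1.5               # inertia, personal pull, swarm pull
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (p_best[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_best[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < p_cost[i]:
                p_best[i], p_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_best, g_cost = pos[i][:], c
    return g_best, g_cost

# Hypothetical target: a pitch in Hz and a modulation depth we want to match.
target = (440.0, 0.3)

def toy_cost(params):
    """Stand-in for 'how different does the synth sound from the recording'."""
    pitch_err = (params[0] - target[0]) / target[0]
    depth_err = params[1] - target[1]
    return pitch_err ** 2 + depth_err ** 2

best, best_cost = pso(toy_cost, bounds=[(100.0, 2000.0), (0.0, 1.0)])
print([round(v, 2) for v in best], best_cost)
```

The appeal for sound synthesis is that the cost function can be anything you can compute, with no gradients required, so the synthesis model can be treated as a black box.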

At 3 o’clock in the same session is another paper from our team, Angeliki Mourgela presenting ‘Perceptually Motivated Hearing Loss Simulation for Audio Mixing Reference‘. Roughly 1 in 6 people suffer from some form of hearing loss, yet amazingly, sound engineers don’t know what the content will sound like to them. Wouldn’t it be great if the engineer could quickly audition any content as it would sound to hearing impaired listeners? That’s the aim of this research.

About three years ago, I published a meta-analysis on perception of high resolution audio, which received considerable attention. But almost all prior studies dealt with music content, and there are good reasons to consider more controlled stimuli too (noise, tones, etc). The poster ‘Discrimination of High-Resolution Audio without Music‘ does just that. Similarly, perceptual aspects of dynamic range compression is an oft debated topic, for which we have performed listening tests, and this is rigorously investigated in ‘Just Noticeable Difference for Dynamic Range Compression via “Limiting” of a Stereophonic Mix‘. Both posters are in the 3-4:30 session.

The full program can be explored on the Convention Calendar or the Convention website. Come say hi to us if you’re there! Josh Reiss (author of this blog entry), J. T. Colonel, Angeliki Mourgela and Dave Moffat from the Audio Engineering research team within the Centre for Digital Music, will all be there.

What we did in 2018

2018 is coming to an end, and everyone is rushing to get their ‘Year in Review’ articles out. We’re no different in that regard. Only we’re going to do it in two parts, first what we have been doing this year, and then a second blog entry reviewing all the great breakthroughs and interesting research results in audio engineering, psychoacoustics, sound synthesis and related fields.

But first, let’s talk about us. 🙂

I think we’ve all done some wonderful research this year, and the Audio Engineering team here can be proud of the results and progress.

Social Media:

First off, we’ve increased our social media presence tremendously:

• This blog, intelligentsoundengineering.wordpress.com/ has almost 22,000 views, with 1,711 followers, mostly through other social media.

• Our twitter account, twitter.com/IntelSoundEng has 886 followers. Not huge, but growing and doing well for a research-focused feed.

• Our Youtube channel, www.youtube.com/user/IntelligentSoundEng has over 20,000 views and 206 subscribers. Which reminds me, I’ve got some more videos to put up.

If you haven’t already, subscribe to the feeds and tell your friends 😉 .

Awards:

Last year’s three awards were exceptional. This year I won Queen Mary University of London’s Bruce Dickinson Entrepreneur of the Year award. Here’s a little video featuring all the shortlisted nominees (I start about 50 seconds in).

I gave the keynote talk at this year’s Digital Audio Effects Conference. And, though not exactly an award, it was still a big deal: I gave my inaugural professorship lecture, titled Do you hear what I hear? The science of everyday sounds.

People:

This was the year everyone graduated!

David Moffat, Yonghao Wang, Dave Ronan, Josh Mycroft, and Rod Selfridge all successfully defended their PhDs. They did amazingly and are all continuing to impress.

Parham Bahadoran and Tom Vassallo started exciting positions at AI Music, and Brecht de Man started with Semantic Audio. Expect great things from both those companies. There’s lots of others who moved around- too many to mention.

Grants and projects:

We finished the Cross-adaptive processing for musical intervention project and the Autonomous Systems for Sound Integration and GeneratioN (ASSIGN) InnovateUK project. We’ve been working closely with industry on a variety of projects, especially with RPPtv, who are funding Emmanouil Chourdakis’s PhD and collaborated on InnovateUK projects. We are starting a very interesting ICASE Studentship with BBC (more on that in another entry), and may soon start a studentship with Yamaha. We formed the spin-out company FXive, which hopefully will be able to launch a product soon.

Publications:

We had a great year for publications. I’ve listed all the ones I can think of below.

Journal articles, book chapters and conference papers

  1. Hu, W., Ma, T., Wang, Y., Xu, F., & Reiss, J. (2018). TDCS: a new scheduling framework for real-time multimedia OS. International Journal of Parallel, Emergent and Distributed Systems, 1-16.
  2. R. Selfridge, D. Moffat, E. Avital and J. D. Reiss, ‘Creating Real-Time Aeroacoustic Sound Effects Using Physically Derived Models,’ Journal of the Audio Engineering Society, 66 (7/8), pp. 594–607, July/August 2018, DOI: https://doi.org/10.17743/jaes.2018.0033
  3. J. D. Reiss, Ø. Brandtsegg, ‘Applications of cross-adaptive audio effects: automatic mixing, live performance and everything in between,’ Frontiers in Digital Humanities, 5 (17), 28 June 2018
  4. D. Moffat and J. D. Reiss, ‘Perceptual Evaluation of Synthesized Sound Effects,’ ACM Transactions on Applied Perception, 15 (2), April 2018
  5. Milo, Alessia, Nick Bryan-Kinns, and Joshua D. Reiss. “Graphical Research Tools for Acoustic Design Training: Capturing Perception in Architectural Settings” In Handbook of Research on Perception-Driven Approaches to Urban Assessment and Design, pp. 397-434. IGI Global, 2018.
  6. H. Peng and J. D. Reiss, ‘Why Can You Hear a Difference between Pouring Hot and Cold Water? An Investigation of Temperature Dependence in Psychoacoustics,’ 145th AES Convention, New York, Oct. 2018
  7. N. Jillings, B. De Man, R. Stables, J. D. Reiss, ‘Investigation into the Effects of Subjective Test Interface Choice on the Validity of Results.’ 145th AES Convention, New York, Oct. 2018
  8. P. Bahadoran, A. Benito, W. Buchanan and J. D. Reiss, “FXive: investigation and implementation of a sound effect synthesis service,” Amsterdam, International Broadcasting Convention (IBC), 2018
  9. M. A. Martinez Ramirez and J. D. Reiss, ‘End-to-end equalization with convolutional neural networks,’ Digital Audio Effects (DAFx), Aveiro, Portugal, Sept. 4–8 2018.
  10. D. Moffat and J. D. Reiss, “Objective Evaluations of Synthesised Environmental Sounds,” Digital Audio Effects (DAFx), Aveiro, Portugal, Sept. 4–8 2018
  11. W. J. Wilkinson, J. D. Reiss, D. Stowell, ‘A Generative Model for Natural Sounds Based on Latent Force Modelling,’ International Conference on Latent Variable Analysis and Signal Separation, Guildford, UK, July 2018
  12. E. T. Chourdakis and J. D. Reiss, ‘From my pen to your ears: automatic production of radio plays from unstructured story text,’ 15th Sound and Music Computing Conference (SMC), Limassol, Cyprus, 4-7 July, 2018
  13. R. Selfridge, J. D. Reiss, E. Avital, Physically Derived Synthesis Model of an Edge Tone, Audio Engineering Society Convention 144, May 2018
  14. A. Pras, B. De Man, J. D Reiss, A Case Study of Cultural Influences on Mixing Practices, Audio Engineering Society Convention 144, May 2018
  15. J. Flynn, J. D. Reiss, Improving the Frequency Response Magnitude and Phase of Analogue-Matched Digital Filters, Audio Engineering Society Convention 144, May 2018
  16. P. Bahadoran, A. Benito, T. Vassallo, J. D. Reiss, FXive: A Web Platform for Procedural Sound Synthesis, Audio Engineering Society Convention 144, May 2018

 

See you in 2019!

Cultural Influences on Mixing Practices

TL;DR: we are presenting a paper at the upcoming AES Convention in Milan on differences in mixes by engineers from different backgrounds, and qualitative analysis of the mixer’s notes as well as the critical listening comments of others.


We recently reviewed research to be presented at the AES 144th Convention, with further blog entries on some of our own contributions, analog-matched EQ and physically derived synthesis of edge tones. Here’s one more preview.

The mixing of multitrack music has been a core research interest of this group for the past ten years. In particular, much of the research in this area relates to the automation or streamlining of various processes which traditionally require significant time and effort from the mix engineer. To do that successfully, however, we need to have an excellent understanding of the process of the mix engineer, and the impact of the various signal manipulations on the perception of the listener. Members of this group have worked on projects that sought to expand this understanding by surveying mix engineers, analysing existing mixes, conducting psychoacoustic tests to optimise specific signal processing parameters, and measuring the subjective response to different mixes of the same song. This knowledge has led to the creation of novel music production tools, but also just a better grasp of this exceedingly multidimensional and esoteric process.

At the upcoming Convention of the Audio Engineering Society in Milan, 23-26 May 2018, we will present a paper that builds on our previous work into analysis of mix creation and evaluation. Whereas previously the analysis of contrasting mixes was mostly quantitative in nature, this work focuses on the qualitative annotation of mixes and the documentation provided by the respective creators. Using these methods we investigated which mix principles and listening criteria the participants shared, and what the impact of available technology is (fully in the box vs outboard processing available).

We found that the task order, balancing practices, and choice of effects were unique to each engineer, though some common trends were identified: starting the mix with all faders at 0 dB, creating subgroups, and changing levels and effect parameters for different song sections, to name a few. Furthermore, all mixes were made ‘in the box’ (i.e. using only software) even when analogue equipment was available.

Furthermore, the large existing dataset we collected during the last few years (in particular as part of Brecht De Man’s PhD) allowed us to compare mixes from the subjects of this study – students of the Paris Conservatoire – to mixes by students from other institutions. To this end, we used one multitrack recording which has served as source material in several previous experiments. Quantitative analysis of level balancing practices showed no significant deviation between institutions – consistent with previous findings.

The paper is written by Amandine Pras, a collaborator from the University of Lethbridge who is, among other things, an expert on qualitative analysis of music production practices; Brecht De Man, previously a member of this group and now a Research Fellow with our collaborators at Birmingham City University; and Josh Reiss, head of this group. All will be present at the Convention. Do come say hi!


You can already read the paper here:

Amandine Pras, Brecht De Man and Joshua D. Reiss, “A Case Study of Cultural Influences on Mixing Practices,” AES Convention 144, May 2018.

Weird and wonderful research to be unveiled at the 144th Audio Engineering Society Convention

Last year, we previewed the 142nd and 143rd AES Conventions, which we followed with wrap-up discussions here and here. The next AES Convention is just around the corner, May 23 to 26 in Milan. As before, the Audio Engineering research team here aim to be quite active at the convention.

These conventions have thousands of attendees, but aren’t so large that you get lost or overwhelmed. Away from the main exhibition hall is the Technical Program, which includes plenty of tutorials and presentations on cutting edge research.

So we’ve gathered together some information about a lot of the events that caught our eye as being unusual, exceptionally high quality, ones we’re involved in or attending, or just worth mentioning. And this Convention will certainly live up to the hype.

Wednesday May 23rd

From 11:15 to 12:45 that day, there’s an interesting poster by a team of researchers from the University of Limerick titled Can Visual Priming Affect the Perceived Sound Quality of a Voice Signal in Voice over Internet Protocol (VoIP) Applications? This builds on work we discussed in a previous blog entry, where they did a perceptual study of DFA Faders, looking at how people’s perception of mixing changes when the sound engineer only pretends to make an adjustment.

As expected given the location, there’s lots of great work being presented by Italian researchers. The first one that caught my eye is the 2:30-4 poster on Active noise control for snoring reduction. Whether you’re a loud snorer, sleep next to someone who is a loud snorer or just interested in unusual applications of audio signal processing, this one is worth checking out.

Do you get annoyed sometimes when driving and the road surface changes to something really noisy? Surely someone should do a study and find out which roads are noisiest so that then we can put a bit of effort into better road design and better in-vehicle equalisation and noise reduction? Well, now it’s finally happened with this paper in the same session on Deep Neural Networks for Road Surface Roughness Classification from Acoustic Signals.

Thursday, May 24

If you were to spend only one day this year immersing yourself in frontier audio engineering research, this is the day to do it.

How do people mix music differently in different countries? And do people perceive the mixes differently based on their different cultural backgrounds? These are the sorts of questions our research team here have been asking. Find out more in this 9:30 presentation by Amandine Pras. She led this Case Study of Cultural Influences on Mixing Practices, in collaboration with Brecht De Man (now with Birmingham City University) and myself.

Rod Selfridge has been blazing new trails in sound synthesis and procedural audio. He won the Best Student Paper Award at the 141st AES Convention and the Best Paper Award at Sound and Music Computing. He’ll give another great presentation at noon on Physically Derived Synthesis Model of an Edge Tone, which was also discussed in a recent blog entry.

I love the title of this next paper, Miniaturized Noise Generation System—A Simulation of a Simulation, which will be presented at 2:30pm by researchers from Intel Technology in Gdansk, Poland. This idea of a meta-simulation is not as uncommon as you might think; we do digital emulation of old analogue synthesizers, and I’ve seen papers on numerical models of Foley rain sound generators.

A highlight for our team here is our 2:45 pm presentation, FXive: A Web Platform for Procedural Sound Synthesis. We’ll be unveiling a disruptive innovation for sound design, FXive.com, aimed at replacing reliance on sound effect libraries. Please come check it out, and get in touch with the presenters or any members of the team to find out more.

Immediately following this is a presentation which asks Can Algorithms Replace a Sound Engineer? This is a question the research team here have also investigated a lot; you could even say it was the main focus of our research for several years. The team behind this presentation are asking it in relation to Auto-EQ. I’m sure it will be interesting, and I hope they reference a few of our papers on the subject.

From 9-10:30, I will chair a Workshop on The State of the Art in Sound Synthesis and Procedural Audio, featuring the world’s experts on the subject. Outside of speech and possibly music, sound synthesis is still in its infancy, but it’s destined to change the world of sound design in the near future. Find out why.

12:15 — 13:45 is a workshop related to machine learning in audio (a subject that is sometimes called Machine Listening), Deep Learning for Audio Applications. Deep learning can be quite a technical subject, and there’s a lot of hype around it. So a Workshop on the subject is a good way to get a feel for it. See below for another machine listening related workshop on Friday.

The Heyser Lecture, named after Richard Heyser (we discussed some of his work in a previous entry), is a prestigious evening talk given by one of the eminent individuals in the field. This one will be presented by Malcolm Hawksford, a man who has had major impact on research in audio engineering for decades.

Friday

The 9:30 — 11 poster session features some unusual but very interesting research. A talented team of researchers from Ancona will present A Preliminary Study of Sounds Emitted by Honey Bees in a Beehive.

Intense solar activity in March 2012 caused some amazing solar storms here on Earth. Researchers in Finland recorded them, and some very unusual results will be presented in the same session with the poster titled Analysis of Reports and Crackling Sounds with Associated Magnetic Field Disturbances Recorded during a Geomagnetic Storm on March 7, 2012 in Southern Finland.

You’ve been living in a cave if you haven’t noticed the recent proliferation of smart devices, especially in the audio field. But what makes them tick, is there a common framework and how are they tested? Find out more at 10:45 when researchers from Audio Precision will present The Anatomy, Physiology, and Diagnostics of Smart Audio Devices.

From 3 to 4:30, there’s a Workshop on Artificial Intelligence in Your Audio. It follows on from a highly successful workshop we did on the subject at the last Convention.

Saturday

A couple of weeks ago, John Flynn wrote an excellent blog entry describing his paper on Improving the Frequency Response Magnitude and Phase of Analogue-Matched Digital Filters. His work is a true advance on the state of the art, providing digital filters with closer matches to their analogue counterparts than any previous approaches. The full details will be unveiled in his presentation at 10:30.

If you haven’t seen Mariana Lopez presenting research, you’re missing out. Her enthusiasm for the subject is infectious, and she has a wonderful ability to convey the technical details, their deeper meanings and their importance to any audience. See her one hour tutorial on Hearing the Past: Using Acoustic Measurement Techniques and Computer Models to Study Heritage Sites, starting at 9:15.

The full program can be explored on the Convention Calendar or the Convention website. Come say hi to us if you’re there! Josh Reiss (author of this blog entry), John Flynn, Parham Bahadoran and Adan Benito from the Audio Engineering research team within the Centre for Digital Music, along with two recent graduates Brecht De Man and Rod Selfridge, will all be there.

Audio Engineering Society E-library

I try to avoid too much promotion in this blog, but in this case I think it’s justified. I’m involved in advancing a resource from a non-profit professional organisation, the Audio Engineering Society. They do lots and lots of different things, promoting the science, education and practice of all things audio engineering related. Among other things, they’ve been publishing research in the area for almost 70 years, and institutions can get full access to all the content in a searchable library. In recent posts, I’ve written about some of the greatest papers ever published there, Part 1 and Part 2, and about one of my own contributions.

In an ideal world, this would all be Open Access. But publishing still costs money, so the AES support both gold Open Access (free to all, but authors pay Article Processing Charges) and the traditional model, where it’s free to publish but individuals or institutions subscribe, or articles can be purchased individually. AES members get free access. I could write many blog articles just about Open Access (should I?); it’s never as straightforward as it seems. At its best it is freely disseminating information for the benefit of all, but at its worst it’s like Pay to Play, a highly criticised practice in the music industry, and gives publishers an incentive to lower acceptance standards. But for now I’ll just point out that the AES does its absolute best to keep the costs down, regardless of publishing model, and the costs are generally much less than those of similar publishers.

Anyway, the AES realised that one of the most cost effective ways to get our content out to large communities is through institutional licenses or subscriptions. And we’re missing an opportunity here since we haven’t really promoted this option. And everybody benefits from it: wider dissemination of knowledge and research, more awareness of the AES, better access, etc. With this in mind, the AES issued the following press release, which I have copied verbatim. You can also find it as a tweet, blog entry or Facebook post.

AES E-Library Subscriptions Benefit Institutions and Organizations

— The Audio Engineering Society E-Library is the world’s largest collection of audio industry resources, and subscriptions provide access to extensive content for research, product development and education — 

New York, NY, March 22, 2018 — Does your research staff, faculty or students deserve access to the world’s most comprehensive collection of audio information? The continuously growing Audio Engineering Society (AES) E-Library contains over 16,000 fully searchable PDF files documenting the progression of audio research from 1953 to the present day. It includes every AES paper published from every AES convention and conference, as well as those published in the Journal of the Audio Engineering Society. From the phonograph to MP3s, from early concepts of digital audio through its fulfillment as the mainstay of audio production, distribution and reproduction, to leading-edge realization of spatial audio and audio for augmented and virtual reality, the E-Library provides a gateway to both the historical and the forward-looking foundational knowledge that sustains an entire industry.  

The AES E-Library has become the go-to online resource for anyone looking to gain instant access to the vast amount of information gathered by the Audio Engineering Society through research, presentations, interviews, conventions, section meetings and more. “Our academic and research staff, and PhD and undergraduate Tonmeister students, use the AES E-Library a lot,” says Dr. Tim Brookes, Senior Lecturer in Audio & Director of Research Institute of Sound Recording (IoSR) University of Surrey. “It’s an invaluable resource for our teaching, for independent student study and, of course, for our research.” 

“Researchers, academics and students benefit from E-Library access daily,” says Joshua Reiss, Chair of the AES Publications Policy Committee, “while many relevant institutions – academic, governmental or corporate – do not have an institutional license of the AES E-library, which means their staff or students are missing out on all the wonderful content there. We encourage all involved in audio research and investigation to inquire if their libraries have an E-Library subscription and, if not, suggest the library subscribe.” 

E-Library subscriptions can be obtained directly from the AES or through journal bundling services. A subscription allows a library’s users to download any document in the E-Library at no additional cost. 

“As an international audio company with over 25,000 employees world-wide, the AES E-library has been an incredibly valuable resource used by Harman audio researchers, engineers, patent lawyers and others,” says Dr. Sean Olive, Acoustic Research Fellow, Harman International. “It has paid for itself many times over.” 

The fee for an institutional online E-Library subscription is $1800 per year, which is significantly less than equivalent publisher licenses. 

To search the E-library, go to http://www.aes.org/e-lib/

To arrange for an institutional license, contact Lori Jackson directly at lori.jackson@aes.org, or go to http://www.aes.org/e-lib/subscribe/.

 

About the Audio Engineering Society
The Audio Engineering Society, celebrating its 70th anniversary in 2018, now counts over 12,000 members throughout the U.S., Latin America, Europe, Japan and the Far East. The organization serves as the pivotal force in the exchange and dissemination of technical information for the industry. Currently, its members are affiliated with 90 AES professional sections and more than 120 AES student sections around the world. Section activities include guest speakers, technical tours, demonstrations and social functions. Through local AES section events, members experience valuable opportunities for professional networking and personal growth. For additional information visit http://www.aes.org.

Join the conversation and keep up with the latest AES News and Events:
Twitter: #AESorg (AES Official) 
Facebook: http://facebook.com/AES.org

The cavity tone……

In September 2017, I attended the 20th International Conference on Digital Audio Effects in Edinburgh. At this conference, I presented my work on a real-time physically derived model of a cavity tone. The cavity tone is one of the fundamental aeroacoustic sounds, similar to the previously described Aeolian tone. The cavity tone commonly occurs in aircraft when opening bomb bay doors, or from the cavities left when the landing gear is extended. Another example of the cavity tone can be heard when swinging a sword with a grooved profile.

The physics of operation can be a little complicated. To try and keep it simple, air flows over the cavity and comes into contact with air at a different velocity within the cavity. The movement of air at one speed over air at another causes what’s known as a shear layer between the two. The shear layer is unstable and flaps against the trailing edge of the cavity, causing a pressure pulse. The pressure pulse travels back upstream to the leading edge and reinforces the instability. This creates a feedback loop which will occur at set frequencies. Away from the cavity, the pressure pulse will be heard as an acoustic tone – the cavity tone!

A diagram of this is shown below:

Like the previously described Aeolian tone, there are equations to derive the frequency of the cavity tone. This is based on the length of the cavity and the airspeed. There are a number of modes of operation, usually ranging from 1 to 4. The acoustic intensity has also been defined, based on airspeed, position of the listener and geometry of the cavity.
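To give a feel for how a frequency can be derived from cavity length and airspeed, here is a minimal Python sketch based on Rossiter's semi-empirical formula, a common choice in the aeroacoustics literature. Treating it as the equation behind our model is an assumption made for illustration, and the function names, the default speed of sound and the constants `alpha` and `kappa` (typical textbook values) are not taken from the paper.

```python
def rossiter_frequency(mode, airspeed, cavity_length,
                       speed_of_sound=343.0, alpha=0.25, kappa=0.57):
    """Estimate the cavity tone frequency (Hz) of one mode.

    Uses Rossiter's formula: f = (U / L) * (m - alpha) / (M + 1 / kappa),
    where M is the Mach number. alpha and kappa are empirical constants;
    the defaults are commonly quoted values, assumed here for illustration.
    """
    if mode < 1:
        raise ValueError("mode must be a positive integer")
    if airspeed <= 0 or cavity_length <= 0:
        raise ValueError("airspeed and cavity_length must be positive")
    mach = airspeed / speed_of_sound
    strouhal = (mode - alpha) / (mach + 1.0 / kappa)
    return strouhal * airspeed / cavity_length


def cavity_mode_frequencies(airspeed, cavity_length, modes=(1, 2, 3, 4)):
    """Return a dict mapping each mode number to its estimated frequency."""
    return {m: rossiter_frequency(m, airspeed, cavity_length) for m in modes}


if __name__ == "__main__":
    # Hypothetical example: a 10 cm cavity in a 34.3 m/s airflow.
    for m, f in cavity_mode_frequencies(34.3, 0.1).items():
        print(f"mode {m}: {f:.1f} Hz")
```

For a fixed mode, the frequency rises with airspeed and falls as the cavity gets longer, which matches the dependence described above.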

The implementation of an individual mode cavity tone is shown in the figure below. The Reynolds number is a dimensionless measure of the ratio between the inertial and viscous forces in the flow, and Q relates to the bandwidth of the passband of the bandpass filter.
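As a rough illustration of these two quantities, the sketch below computes a Reynolds number from airspeed and a characteristic length, and converts a Q value into a bandwidth using the standard definition Q = centre frequency / bandwidth. The choice of cavity length as the characteristic length, the default kinematic viscosity of air and the function names are assumptions for illustration. How the model maps Reynolds number onto Q is not covered here.

```python
def reynolds_number(airspeed, characteristic_length, kinematic_viscosity=1.5e-5):
    """Reynolds number Re = U * L / nu (dimensionless).

    kinematic_viscosity defaults to an approximate value for air at room
    temperature, in m^2/s (an assumption for this sketch).
    """
    if kinematic_viscosity <= 0:
        raise ValueError("kinematic_viscosity must be positive")
    return airspeed * characteristic_length / kinematic_viscosity


def bandwidth_from_q(centre_frequency, q):
    """Bandwidth (Hz) of a bandpass filter, from Q = centre_frequency / bandwidth."""
    if q <= 0:
        raise ValueError("q must be positive")
    return centre_frequency / q


if __name__ == "__main__":
    # Hypothetical values: 30 m/s airflow over a 10 cm cavity.
    print(f"Re = {reynolds_number(30.0, 0.1):.0f}")
    # A 1 kHz bandpass filter with Q = 10 has a 100 Hz passband.
    print(f"bandwidth = {bandwidth_from_q(1000.0, 10.0):.1f} Hz")
```

A higher Q gives a narrower passband, so the filtered noise sounds more like a pure tone.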

Comparing our model’s average frequency prediction to published results, we found it was 0.3% lower than theoretical frequencies, 2.0% lower than computed frequencies and 6.4% lower than measured frequencies. A copy of the Pure Data synthesis model can be downloaded here.

 

Audio Research Year in Review- Part 1, It’s all about us

Enjoy the holiday!

So as 2017 is coming to an end, everyone is rushing to get their ‘Year in Review’ articles out. And we’re no different in that regard. Only we’re going to do it in two parts, first what we have been doing this year, and then a second blog entry reviewing all the great breakthroughs and interesting research results in audio engineering, psychoacoustics, sound synthesis and related fields.

But first, let’s talk about us. 🙂

I think we’ve all done some wonderful research this year, and the Audio Engineering team here can be proud of the results and progress.

Social Media:

First off, we’ve increased our social media presence tremendously:

• This blog, intelligentsoundengineering.wordpress.com/ has 7,363 views, with 1,128 followers, mostly through other social media.

• We started a twitter account, twitter.com/IntelSoundEng and now have 615 followers. Not huge, but doing well for the first few months of a research-focused feed.

• Our YouTube channel, www.youtube.com/user/IntelligentSoundEng has 16,778 views and 178 subscribers.

Here’s a sample video from our YouTube channel;

So people are reading and watching, which gives us even more incentive to put stuff out there that’s worth it for you to check out.

Awards:

We won three best paper or presentation awards;

Adan Benito (left) and Thomas Vassallo (right) for best presentation at the Web Audio Conference

Rod Selfridge (right), Dave Moffat and I, for best paper at Sound and Music Computing

I (right) won the best Journal of the Audio Engineering Society paper award, 2016 (announced in 2017 of course)

 

People:

Brecht De Man got his PhD and Yonghao Wang submitted his. Dave Ronan, Alessia Milo, Josh Mycroft and Rod Selfridge have all entered the write-up stage of their PhDs.

Brecht started a post-doc position and became Vice-Chair of the AES Education Committee, and I (Josh Reiss) was promoted to Professor of Audio Engineering. Dave Ronan started a position at AI Music.

We also welcomed a large number of visitors throughout the year, notably Dr. Amandine Pras and Saurjya Sarkar, now with Qualcomm.

Grants and projects:

We started the Cross-adaptive processing for musical intervention project (supporting Brecht, and Saurjya’s visit) and the Autonomous Systems for Sound Integration and GeneratioN (ASSIGN) InnovateUK project (supporting RTSFX researchers). We completed Brecht’s Yamaha postdoc, with another expected, and completed the QMI Proof of Concept: Sound Effect Synthesis project. We’ve been working closely with industry on a variety of projects, especially with RPPtv, who are funding Emmanouil Chourdakis’s PhD and collaborating on InnovateUK projects. We have other exciting grants in progress.

Events:

We’ve been involved in a few workshops. Will Wilkinson and Dave Moffat were on the organising committee for Audio Mostly. Alessia Milo gave an invited talk at the 8th International Symposium on Temporal Design, and organised a soundwalk at Audible Old Kent Road. Brecht and I were on the organising committee of the 3rd Workshop on Intelligent Music Production. Brecht organised Sound Talking at the Science Museum, and panel sessions on listening test design at the 142nd and 143rd AES Conventions. Dave Moffat organised a couple of Procedural Audio Now meet-ups.

Publications:

We had a fantastic year for publications, five journal papers (and one more accepted) and twenty conference papers. I’ve listed them all below.

Journal articles

  1. D. Moffat and J. D. Reiss, ‘Perceptual Evaluation of Synthesized Sound Effects,’ accepted for ACM Transactions on Applied Perception
  2. R. Selfridge, D. Moffat and J. D. Reiss, ‘Sound Synthesis of Objects Swinging through Air Using Physical Models,’ Applied Sciences, v. 7 (11), Nov. 2017, Online version doi:10.3390/app7111177
  3. A. Zacharakis, M. Terrell, A. Simpson, K. Pastiadis and J. Reiss ‘Rearrangement of timbre space due to background noise: behavioural evidence and acoustic correlates,’ Acta Acustica united with Acustica, 103 (2), 288-298, 2017. Definitive publisher-authenticated version at http://www.ingentaconnect.com/content/dav/aaua
  4. P. Pestana and J. Reiss, ‘User Preference on Artificial Reverberation and Delay Time Parameters,’ J. Audio Eng. Soc., Vol. 65, No. 1/2, January/February 2017.
  5. B. De Man, K. McNally and J. Reiss, ‘Perceptual evaluation and analysis of reverberation in multitrack music production,’ J. Audio Eng. Soc., Vol. 65, No. 1/2, January/February 2017.
  6. E. Chourdakis and J. Reiss, ‘A machine learning approach to design and evaluation of intelligent artificial reverberation,’ J. Audio Eng. Soc., Vol. 65, No. 1/2, January/February 2017.

Book chapters

  • Accepted: A. Milo, N. Bryan-Kinns, and J. D. Reiss. Graphical Research Tools for Acoustic Design Training: Capturing Perception in Architectural Settings. In Perception-Driven Approaches to Urban Assessment and Design, F. Aletta and X. Jieling (Eds.). IGI Global.
  • J. D. Reiss, ‘An Intelligent Systems Approach to Mixing Multitrack Music,’ Perspectives On Music Production: Mixing Music, Routledge, 2017

Patents

Conference papers

  1. M. A. Martinez Ramirez and J. D. Reiss, ‘Stem Audio Mixing as a Content-Based Transformation of Audio Features,’ IEEE 19th International Workshop on Multimedia Signal Processing, Luton, UK, Oct. 16-18, 2017.
  2. M. A. Martinez Ramirez and J. D. Reiss, ‘Analysis and Prediction of the Audio Feature Space when Mixing Raw Recordings into Individual Stems,’ 143rd AES Convention, New York, Oct. 18-21, 2017.
  3. A. Milo, N. Bryan-Kinns and J. D. Reiss, ‘Influences of a Key Map on Soundwalk Exploration with a Textile Sonic Map,’ 143rd AES Convention, New York, Oct. 18-21, 2017.
  4. A. Milo and J. D. Reiss, ‘Aural Fabric: an interactive textile sonic map,’ Audio Mostly, London, 2017
  5. R. Selfridge, D. Moffat and J. D. Reiss, ‘Physically Derived Sound Synthesis Model of a Propeller,’ Audio Mostly, London, 2017
  6. N. Jillings, R. Stables and J. D. Reiss, ‘Zero-Delay Large Signal Convolution Using Multiple Processor Architectures,’ WASPAA, New York, 2017
  7. E. T. Chourdakis and J. D. Reiss, ‘Constructing narrative using a generative model and continuous action policies,’ CC-NLG, 2017
  8. M. A. Martinez Ramirez and J. D. Reiss, ‘Deep Learning and Intelligent Audio Mixing,’ 3rd Workshop on Intelligent Music Production, Salford, UK, 15 September 2017.
  9. B. De Man, J. D. Reiss and R. Stables, ‘Ten years of automatic mixing,’ 3rd Workshop on Intelligent Music Production, Salford, UK, 15 September 2017.
  10. W. Wilkinson, J. D. Reiss and D. Stowell, ‘Latent Force Models for Sound: Learning Modal Synthesis Parameters and Excitation Functions from Audio Recordings,’ 20th International Conference on Digital Audio Effects (DAFx-17), Edinburgh, UK, September 5–9, 2017
  11. S. Sarkar, J. Reiss and O. Brandtsegg, ‘Investigation of a Drum Controlled Cross-adaptive Audio Effect for Live Performance,’ 20th International Conference on Digital Audio Effects (DAFx-17), Edinburgh, UK, September 5–9, 2017
  12. B. De Man and J. D. Reiss, ‘The mix evaluation dataset,’ 20th International Conference on Digital Audio Effects (DAFx-17), Edinburgh, UK, September 5–9, 2017
  13. D. Moffat, D. Ronan and J. D. Reiss, ‘Unsupervised taxonomy of sound effects,’ 20th International Conference on Digital Audio Effects (DAFx-17), Edinburgh, UK, September 5–9, 2017
  14. R. Selfridge, D. Moffat and J. D. Reiss, ‘Physically Derived Synthesis Model of a Cavity Tone,’ 20th International Conference on Digital Audio Effects (DAFx-17), Edinburgh, UK, September 5–9, 2017
  15. N. Jillings, Y. Wang, R. Stables and J. D. Reiss, ‘Intelligent audio plugin framework for the Web Audio API,’ Web Audio Conference, London, 2017
  16. R. Selfridge, D. J. Moffat and J. D. Reiss, ‘Real-time physical model for synthesis of sword swing sounds,’ Best paper award, Sound and Music Computing (SMC), Helsinki, July 5-8, 2017.
  17. R. Selfridge, D. J. Moffat, E. Avital, and J. D. Reiss, ‘Real-time physical model of an Aeolian harp,’ 24th International Congress on Sound and Vibration (ICSV), London, July 23-27, 2017.
  18. A. Benito and J. D. Reiss, ‘Intelligent Multitrack Reverberation Based on Hinge-Loss Markov Random Fields,’ AES Semantic Audio, Erlangen Germany, June 2017
  19. D. Ronan, H. Gunes and J. D. Reiss, ‘Analysis of the Subgrouping Practices of Professional Mix Engineers,’ AES 142nd Convention, Berlin, May 20-23, 2017
  20. Y. Song, Y. Wang, P. Bull and J. D. Reiss, ‘Performance Evaluation of a New Flexible Time Division Multiplexing Protocol on Mixed Traffic Types,’ 31st IEEE International Conference on Advanced Information Networking and Applications (AINA), Taipei, Taiwan, March 27-29, 2017.

 

Sound Talking at the Science Museum featured assorted speakers on sonic semantics

On Friday 3 November, Dr Brecht De Man (Centre for Digital Music, Queen Mary University of London) and Dr Melissa Dickson (Diseases of Modern Life, University of Oxford) organised a one-day workshop at the London Science Museum on the topic of language describing sound, and sound emulating language. We discussed it in a previous blog entry, but now we can wrap up and discuss what happened.

Titled ‘Sound Talking’, it brought together a diverse lineup of speakers around the common theme of sonic semantics. And with diverse we truly mean that: the programme featured a neuroscientist, a historian, an acoustician, and a Grammy-winning sound engineer, among others.

The event was born from a friendship between two academics who had for a while assumed their work could not be more different, with music technology and history of Victorian literature as their respective fields. When learning their topics were both about sound-related language, they set out to find more researchers from maximally different disciplines and make it a day of engaging talks.

After having Dr Dickson as a resident researcher earlier this year, the Science Museum generously hosted the event, providing a very appropriate and ‘neutral’ central London venue. The event was further supported by the Diseases of Modern Life project, funded by the European Research Council, and the Centre for Digital Music at Queen Mary University of London.

The programme featured (in order of appearance):

  • Maria Chait, Professor of auditory cognitive neuroscience at UCL, on the auditory system as the brain’s early warning system
  • Jonathan Andrews, Reader in the history of psychiatry at Newcastle University, on the soundscape of the Bethlehem Hospital for Lunatics (‘Bedlam’)
  • Melissa Dickson, postdoctoral researcher in Victorian literature at University of Oxford, on the invention of the stethoscope and the development of an associated vocabulary
  • Mariana Lopez, Lecturer in sound production and post production at University of York, on making film accessible for visually impaired audiences through sound design
  • David M. Howard, Professor of Electronic Engineering at Royal Holloway University of London, on the sound of voice and the voice of sound
  • Brecht De Man, postdoctoral researcher in audio engineering at Queen Mary University of London, on defining the language of music production
  • Mandy Parnell, mastering engineer at Black Saloon Studios, on the various languages of artistic direction
  • Trevor Cox, Professor of acoustic engineering at University of Salford, on categorisation of everyday sounds

In addition to this stellar speaker lineup, Aleks Kolkowski (Recording Angels) exhibited an array of historic sound making objects, including tuning forks, listening tubes, a monochord, and a live recording of a wax cylinder. The workshop took place in a museum, after all, where Dr Kolkowski has held a research associateship, so the display was very fitting.

The full program can be found on the event’s web page. Video proceedings of the event are forthcoming.

My favorite sessions from the 143rd AES Convention

Recently, several researchers from the audio engineering research team here attended the 143rd Audio Engineering Society Convention in New York. Before the Convention, I wrote a blog entry highlighting a lot of the more interesting or adventurous research that was being presented there. As is usually the case at these Conventions, I have so many meetings to attend that I miss out on a lot of highlights, even ones that I flag up beforehand as ‘must see’. Still, I managed to attend some real gems this time, and I’ll discuss a few of them here.

I’m glad that I attended ‘Audio Engineering with Hearing Loss—A Practical Symposium’. Hearing loss amongst musicians, audiophiles and audio engineers is an important topic that needs more attention. Overexposure, both prolonged and too loud, is a major cause of hearing damage. In addition to all the issues it causes for anybody, for those in the industry, it affects their ability to work or even appreciate their passion. The session had lots of interesting advice.

The most interesting presentation in the session was from Richard Einhorn, a composer and music producer. In 2010, he lost much of his hearing due to a virus. He woke up one day to find that he had completely lost hearing in his right ear, a condition known as Idiopathic Sudden Sensorineural Hearing Loss. This then evolved into hyperacusis, with extreme distortion, excessive volume and poor speech intelligibility. In many ways, deafness in the right ear would have been preferable. On top of that, his left ear suffered otosclerosis, where everything was at greatly reduced volume. And given that this was his only functioning ear, the risk of surgery to correct it was too great.

Richard has found some wonderful ways to still function, and even continue working in audio and music, with the limited hearing he still has. There’s a wonderful description of them in Hearing Loss Magazine, and they include the use of the ‘Companion Mic,’ which allowed him to hear from many different locations around a busy, noisy environment, like a crowded restaurant.

Thomas Lund presented ‘The Bandwidth of Human Perception and its Implications for Pro Audio.’ I really wasn’t sure about this before the Convention. I had read the abstract, and thought it might be some meandering, somewhat philosophical talk about hearing perception, with plenty of speculation but lacking in substance. I was very glad to be proven wrong! It had aspects of all of that, but in a very positive sense. It was quite rigorous, essentially a systematic review of research in the field that had been published in medical journals. It looks at the question of auditory perceptual bandwidth, where bandwidth is in a general information theoretic and cognitive sense, not specifically frequency range. The research revolves around the fact that, though we receive many megabits of sensory information every second, it seems that we only use dozens of bits per second of information in our higher level perception. This has lots of implications for listening test design, notably on how to deal with aspects like sample duration or training of participants. This was probably the most fascinating technical talk I saw at the Convention.

There were two papers that I had flagged up as having the most interesting titles, ‘Influence of Audience Noises on the Classical Music Perception on the Example of Anti-cough Candies Unwrapping Noise’, and ‘Acoustic Levitation—Standing Wave Demonstration.’ I had an interesting chat with an author of the first one, Adam Pilch. When walking around much later looking for the poster for the second one, I bumped into Adam again. Turns out, he was a co-author on both of them! It looks like Adam Pilch and Bartlomiej Chojnacki (the shared authors on those papers) and their co-authors have an appreciation of the joy of doing research for fun and curiosity, and an appreciation for a good paper title.

Leslie Ann Jones was the Heyser lecturer. The Heyser lecture, named after Richard C. Heyser, is an evening talk given by an eminent individual in audio engineering or related fields. Leslie has had a fascinating career, and gave a talk that makes one realise just how much the industry is changing and growing, and how important are the individuals and opportunities that one encounters in a career.

The last session I attended was also one of the best. Chris Pike, who recently became leader of the audio research team at BBC R&D (he has big shoes to fill, but fits them well and is already racing ahead), presented ‘What’s This? Doctor Who with Spatial Audio!’. I knew this was going to be good because it involved two of my favorite things, but it was much better than that. The audience were all handed headphones so that they could listen to binaural renderings used throughout the presentation. I love props at technical talks! I also expected the talk to focus almost completely on the binaural, 3D sound rendering for a recent episode, but it was so much more than that. There was quite detailed discussion of audio innovation throughout the more than 50 years of Doctor Who, some of which we have discussed when mentioning Daphne Oram and Delia Derbyshire in our blog entry on female pioneers in audio engineering.

There’s a nice short interview with Chris and colleagues Darran Clement (sound mixer) and Catherine Robinson (audio supervisor) about the binaural sound in Doctor Who on BBC R&D’s blog, and here’s a YouTube video promoting the binaural sound in the recent episode: