Our meta-analysis wins best JAES paper 2016!

Last year, we published an Open Access article in the Journal of the Audio Engineering Society (JAES) on “A meta-analysis of high resolution audio perceptual evaluation.”


I’m very pleased and proud to announce that this paper won the award for best JAES paper for the calendar year 2016.

We discussed the research a little while it was ongoing, and then in more detail soon after publication. The research addressed a contentious issue in the audio industry. For decades, professionals and enthusiasts have engaged in heated debate over whether high resolution audio (beyond CD quality) really makes a difference. So I undertook a meta-analysis to assess the ability to perceive a difference between high resolution and standard CD quality audio. Meta-analysis is a popular technique in medical research, but this may be the first time it has been formally applied to audio engineering and psychoacoustics. Results showed a highly significant ability of trained subjects to discriminate high resolution content, an effect that had not previously been revealed. With over 400 participants in over 12,500 trials, it represents the most thorough investigation of high resolution audio so far.

Since publication, the paper has been covered broadly across social media, the popular press and trade journals. Thousands of comments were made on forums, and it has had hundreds of thousands of reads.

Here’s one popular independent YouTube video discussing it.

and an interview with Scientific American about it,

and some discussion of it in this article for Forbes magazine (which is actually about the lack of a headphone jack in the iPhone 7).

But if you want to see just how angry this research made people, check out the discussion on hydrogenaudio. Wow, I’ve never been called an intellectually dishonest placebophile apologist before 😉 .

In fact, the discussion on social media was full of misinformation, so I’ll try to clear up a few things here:

When I first started looking into this subject, it became clear that potential issues in the studies were a problem. One option would have been to just give up, but then I’d be adding no rigour to a discussion precisely because I felt it wasn’t rigorous enough. It’s the same as not publishing because you didn’t get a significant result, only now on a meta scale. And though I did not have a strong opinion either way as to whether differences could be perceived, I could easily have been fooling myself. I wanted to avoid my own biases and judgement calls. So I set some ground rules.

  • I committed to publishing all results, regardless of outcome.
  • A strong motivation for doing the meta-analysis was to avoid cherry-picking studies. So I included every study for which there was sufficient data to be used in meta-analysis. Even if I thought a study was poor, its conclusions seemed flawed or it disagreed with my own preconceptions, if I could get the minimal data needed for meta-analysis, I included it. I then discussed potential issues.
  • Any choices regarding analysis or transformation of data were made a priori, regardless of the result of those choices, in an attempt to minimize any of my own biases influencing the outcome.
  • I did further analysis to look at alternative methods of study selection and representation.
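
To make the pooling step concrete, here is a minimal sketch of inverse-variance fixed-effect meta-analysis: per-study proportions of correct discrimination are combined, with precise studies weighted more heavily, and the pooled estimate is tested against the 50% guessing rate. This is illustrative only — the actual paper used more careful effect-size measures and transformations, and the study data below are invented:

```python
import math

def fixed_effect_meta(studies):
    """Inverse-variance fixed-effect pooling of per-study proportions correct.

    studies: list of (n_correct, n_trials) tuples from discrimination tests.
    Returns (pooled proportion, z-statistic against chance = 0.5).
    """
    weights, effects = [], []
    for k, n in studies:
        p = k / n
        var = max(p * (1 - p) / n, 1e-9)   # binomial variance (guarded)
        weights.append(1.0 / var)          # precise studies count for more
        effects.append(p)
    w_sum = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / w_sum
    se = math.sqrt(1.0 / w_sum)            # standard error of pooled estimate
    return pooled, (pooled - 0.5) / se

# Invented example: three studies, each only slightly above chance
pooled, z = fixed_effect_meta([(60, 100), (110, 200), (540, 1000)])
```

Individually, none of these invented studies looks overwhelming, but the pooled z-statistic shows how combining them can reveal a small effect that single studies miss — the central appeal of the method.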

I found the whole process of doing a meta-analysis in this field fascinating. In audio engineering and psychoacoustics, there is a wealth of studies investigating big questions, and I hope others will use similar approaches to gain deeper insights and perhaps even resolve some issues.


Exciting research at the upcoming Audio Engineering Society Convention


About five months ago, we previewed the last European Audio Engineering Society Convention, which we followed with a wrap-up discussion. The next AES convention is just around the corner, October 18 to 21 in New York. As before, the Audio Engineering research team here aims to be quite active at the convention.

These conventions are quite big, with thousands of attendees, but not so large that you get lost or overwhelmed. Away from the main exhibition hall is the Technical Program, which includes plenty of tutorials and presentations on cutting edge research.

So here, we’ve gathered together some information about a lot of the events that we will be involved in, attending, or we just thought were worth mentioning. And I’ve gotta say, the Technical Program looks amazing.

Wednesday

One of the first events of the Convention is the Diversity Town Hall, which introduces the AES Diversity and Inclusion Committee. I’m a firm supporter of this, and wrote a recent blog entry about female pioneers in audio engineering. The AES aims to be fully inclusive, open and encouraging to all, but that’s not yet fully reflected in its activities and membership. So expect to see some exciting initiatives in this area coming soon.

In the 10:45 to 12:15 poster session, Steve Fenton will present Alternative Weighting Filters for Multi-Track Program Loudness Measurement. We’ve published a couple of papers (Loudness Measurement of Multitrack Audio Content Using Modifications of ITU-R BS.1770, and Partial loudness in multitrack mixing) showing that well-known loudness measures don’t correlate very well with perception when used on individual tracks within a multitrack mix, so it will be interesting to see what Steve and his co-author Hyunkook Lee have found. Perhaps all this research will lead to better loudness models and measures.
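
For readers unfamiliar with how BS.1770-style measures work under the hood, here is a minimal sketch of the ungated mono case: the signal passes through the two K-weighting biquads defined in the standard (the published 48 kHz coefficients), and the mean square is then converted to loudness in LKFS. The full standard adds channel weighting and gating, which are omitted here:

```python
import math

# ITU-R BS.1770 K-weighting biquads at fs = 48 kHz (coefficients from the standard)
SHELF_B = [1.53512485958697, -2.69169618940638, 1.19839281085285]
SHELF_A = [1.0, -1.69065929318241, 0.73248077421585]   # high-frequency shelf
HIPASS_B = [1.0, -2.0, 1.0]
HIPASS_A = [1.0, -1.99004745483398, 0.99007225036621]  # RLB high-pass

def biquad(x, b, a):
    """Direct-form II transposed biquad over a list of samples."""
    y, s1, s2 = [], 0.0, 0.0
    for xn in x:
        yn = b[0] * xn + s1
        s1 = b[1] * xn - a[1] * yn + s2
        s2 = b[2] * xn - a[2] * yn
        y.append(yn)
    return y

def k_loudness(samples):
    """Ungated mono loudness in LKFS of a 48 kHz signal."""
    z = biquad(biquad(samples, SHELF_B, SHELF_A), HIPASS_B, HIPASS_A)
    mean_sq = sum(v * v for v in z) / len(z)
    return -0.691 + 10.0 * math.log10(mean_sq)

# Sanity check: a full-scale 997 Hz sine should read close to -3.01 LKFS
fs = 48000
sine = [math.sin(2 * math.pi * 997 * n / fs) for n in range(fs)]
```

This per-channel K-weighted energy is exactly the quantity our multitrack papers found to correlate poorly with perceived loudness of individual stems, which is what alternative weighting filters aim to fix.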

At 2 pm, Cleopatra Pike will present a discussion and analysis of Direct and Indirect Listening Test Methods. I’m often sceptical when someone draws strong conclusions from indirect methods like measuring EEGs and reaction times, so I’m curious what this study found and what recommendations they propose.

The 2:15 to 3:45 poster session will feature the work with probably the coolest name, Influence of Audience Noises on the Classical Music Perception on the Example of Anti-cough Candies Unwrapping Noise. And yes, it looks like a rigorous study, using an anechoic chamber to record the sounds of sweets being unwrapped, and the signal analysis is coupled with a survey to identify the most distracting sounds. It reminds me of the DFA faders paper from the last convention.

At 4:30, researchers from Fraunhofer and the Technical University of Ilmenau present Training on the Acoustical Identification of the Listening Position in a Virtual Environment. In a recent paper in the Journal of the AES, we found that training resulted in a huge difference between participant results in a discrimination task, yet listening tests often employ untrained listeners. This suggests that maybe we can hear a lot more than studies indicate; we just don’t know how to listen and what to listen for.

Thursday

If you were to spend only one day this year immersing yourself in frontier audio engineering research, this is the day to do it.

At 9 am, researchers from Harman will present part 1 of A Statistical Model that Predicts Listeners’ Preference Ratings of In-Ear Headphones. This was a massive study involving 30 headphone models and 71 listeners under carefully controlled conditions. Part 2, on Friday, focuses on development and validation of the model based on the listening tests. I’m looking forward to both, but puzzled as to why they weren’t put back-to-back in the schedule.

At 10 am, researchers from the Tokyo University of the Arts will present Frequency Bands Distribution for Virtual Source Widening in Binaural Synthesis, a technique which seems closely related to work we presented previously on Cross-adaptive Dynamic Spectral Panning.

From 10:45 to 12:15, our own Brecht De Man will be chairing and speaking in a Workshop on ‘New Developments in Listening Test Design.’ He’s quite a leader in this field, and has developed some great software that makes the set up, running and analysis of listening tests much simpler and still rigorous.

In the 11-12:30 poster session, Nick Jillings will present Automatic Masking Reduction in Balance Mixes Using Evolutionary Computing, which deals with a challenging problem in music production, and builds on the large amount of research we’ve done on Automatic Mixing.

At 11:45, researchers from McGill will present work on Simultaneous Audio Capture at Multiple Sample Rates and Formats. This helps address one of the challenges in perceptual evaluation of high resolution audio (and see the open access journal paper on this), ensuring that the same audio is used for different versions of the stimuli, with only variation in formats.

At 1:30, renowned audio researcher John Vanderkooy will present research on how a loudspeaker can be used as the sensor for a high-performance infrasound microphone. In the same session at 2:30, researchers from Plextek will show how consumer headphones can be augmented to automatically perform hearing assessments. Should we expect a new audiometry product from them soon?

At 2 pm, our own Marco Martinez Ramirez will present Analysis and Prediction of the Audio Feature Space when Mixing Raw Recordings into Individual Stems, which applies machine learning to challenging music production problems. Immediately following this, Stephen Roessner discusses a Tempo Analysis of Billboard #1 Songs from 1955–2015, which builds partly on other work analysing hit songs to observe trends in music and production tastes.

At 3:45, there is a short talk on Evolving the Audio Equalizer. Audio equalization is a topic on which we’ve done quite a lot of research (see our review article, and a blog entry on the history of EQ). I’m not sure where the novelty is in the author’s approach though, since dynamic EQ has been around for a while, and there are plenty of harmonic processing tools.

At 4:15, there’s a presentation on Designing Sound and Creating Soundscapes for Still Images, an interesting and unusual bit of sound design.

Friday

Judging from the abstract, the short Tutorial on the Audibility of Loudspeaker Distortion at Bass Frequencies at 5:30 looks like it will be an excellent and easy to understand review, covering practice and theory, perception and metrics. In 15 minutes, I suppose it can only give a taster of what’s in the paper.

There’s a great session on perception from 1:30 to 4. At 2, perceptual evaluation expert Nick Zacharov gives a Comparison of Hedonic and Quality Rating Scales for Perceptual Evaluation. I think people often have a favorite evaluation method without knowing if it’s the best one for the test. We briefly looked at pairwise versus multistimulus tests in previous work, but it looks like Nick’s work is far more focused on comparing methodologies.

Immediately after that, researchers from the University of Surrey present Perceptual Evaluation of Source Separation for Remixing Music. Remixing audio via source separation is a hot topic, with lots of applications whenever the original unmixed sources are unavailable. This work will get to the heart of which approaches sound best.

The last talk in the session, at 3:30, is on The Bandwidth of Human Perception and its Implications for Pro Audio. Judging from the abstract, this is a big picture, almost philosophical discussion about what and how we hear, but with some definitive conclusions and proposals that could be useful for psychoacoustics researchers.

Saturday

Grateful Dead fans will want to check out Bridging Fan Communities and Facilitating Access to Music Archives through Semantic Audio Applications in the 9 to 10:30 poster session, which is all about an application providing wonderful new experiences for interacting with the huge archives of live Grateful Dead performances.

At 11 o’clock, Alessia Milo, a researcher in our team with a background in architecture, will discuss Soundwalk Exploration with a Textile Sonic Map. We discussed her work in a recent blog entry on Aural Fabric.

In the 2 to 3:30 poster session, I really hope there will be a live demonstration accompanying the paper on Acoustic Levitation.

At 3 o’clock, Gopal Mathur will present an Active Acoustic Meta Material Loudspeaker System. Metamaterials are receiving a lot of deserved attention, and such advances in materials are expected to lead to innovative and superior headphones and loudspeakers in the near future.

The full program can be explored on the Convention Calendar or the Convention website. Come say hi to us if you’re there! Josh Reiss (author of this blog entry), Brecht De Man, Marco Martinez and Alessia Milo from the Audio Engineering research team within the Centre for Digital Music will all be there.

What the f*** are DFA faders?

I’ve been meaning to write this blog entry for a while, and I’ve finally gotten around to it. At the 142nd AES Convention, there were two papers that really stood out which weren’t discussed in our convention preview or convention wrap-up. One was about Acoustic Energy Harvesting, which we discussed a few weeks ago, and the other was titled ‘The DFA Fader: Exploring the Power of Suggestion in Loudness Judgments.’ When I mentioned this paper to others, their response was always the same: “What’s a DFA Fader?” Well, the answer is hinted at in the title of this blog entry.

The basic idea is that musicians often give instructions to the sound engineer that he or she can’t or doesn’t want to follow. For instance, a vocalist might say “Turn me up” in a soundcheck, but the sound engineer knows that the vocals are at a nice level already and any more amplification might cause feedback. Sometimes, this sort of thing can be communicated back to the musician in a nice way. But there’s also the fallback option: a fader on the mixing console that “Does F*** All”, aka DFA. The engineer can slide the fader or twiddle an unconnected dial, smile back and say ‘OK, does this sound a bit better?’.

A couple of companies have had fun with this idea. Funk Logic’s Palindrometer, shown below, is nothing more than a filler for empty rack space. It’s an interface that looks like it might do something, but at best, it just flashes some LEDs when one toggles the switches and turns the knobs.

(Image: the Funk Logic Palindrometer front panel.)

RANE have the PI 14 Pseudoacoustic Infector. It’s worth checking out the full description, complete with product review and data sheets. I especially like the schematic, copied below.

(Image: the RANE PI 14 Pseudoacoustic Infector schematic.)

And in 2014, our own Brecht De Man released The Wire, a freely available VST and AudioUnit plug-in that emulates a gold-plated, balanced, 100% lossless audio connector.

(Image: The Wire plug-in.)

Anyway, the authors of this paper had the bright idea of doing legitimate subjective evaluation of DFA faders. They didn’t make jokes in the paper, not even to explain the DFA acronym. They took 22 participants and divided them into an 11 person control group and an 11 person test group. In the control group, each subject participated in twenty trials where two identical musical excerpts were presented and the subject had to rate the difference in loudness of vocals between the two excerpts. Only ten excerpts were used, so each pair was used in two trials. In the test group, a sound engineer was present and he made scripted suggestions that he was adjusting the levels in each trial. He could be seen, but participants couldn’t see his hands moving on the console.
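
Comparing an 11-person test group against an 11-person control group on rating data is a textbook case for a distribution-free test. As an illustration (the paper’s own analysis may well differ, and the ratings below are invented, not the authors’ data), here’s a two-sample permutation test on the difference of group means:

```python
import random

def permutation_test(control, test, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Returns the fraction of random label shufflings whose mean difference
    is at least as extreme as the observed one (an empirical p-value).
    """
    rng = random.Random(seed)
    observed = abs(sum(test) / len(test) - sum(control) / len(control))
    pooled = list(control) + list(test)
    n_test = len(test)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        t, c = pooled[:n_test], pooled[n_test:]
        if abs(sum(t) / n_test - sum(c) / len(c)) >= observed:
            hits += 1
    return hits / n_perm

# Invented loudness-difference ratings (0 = no difference, 10 = large):
control_ratings = [0, 1, 0, 2, 1, 0, 1, 0, 2, 1, 0]  # no suggestion made
test_ratings = [4, 6, 3, 5, 7, 2, 5, 4, 6, 3, 5]     # engineer "moved" the fader
p_value = permutation_test(control_ratings, test_ratings)
```

The appeal of a permutation test here is that it makes no normality assumptions about subjective rating scales, which are rarely well-behaved.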

Not surprisingly, most trials showed a statistically significant difference between test and control groups, confirming the effectiveness of verbal suggestions associated with the DFA fader. And the authors picked up on an interesting point; results were far more significant for stimuli where vocals were masked by other instruments. This links the work to psychoacoustic studies. Not only is our perception of loudness and timbre influenced by the presence of a masker, but we have a more difficult time judging loudness and hence are more likely to accept the suggestion from an expert.

The authors did an excellent job of critiquing their results. But unfortunately, the full data was not made available with the paper. So we are left with a lot of questions. What were these scripted suggestions? It could make a big difference if the engineer said “I’m going to turn the vocals way up” versus “Let me try something. Does it sound any different now?” And were some participants immune to the suggestions? And because participants couldn’t see a fader being adjusted (interviews with sound engineers had stressed the importance of verbal suggestions), we don’t know how that could influence results.

There is something else that’s very interesting about this. It’s a ‘false experiment’: the whole listening test is a trick, since for all participants and in all trials, there were never any loudness differences between the two presented stimuli. So indirectly, it looks at an ‘auditory placebo effect’ that is more fundamental than DFA faders. What were the ratings for loudness differences that participants gave? For the control group especially, did they judge these differences to be small because they trusted their ears, or large because they knew that judging loudness was the nature of the test? Perhaps there is a natural uncertainty in loudness perception regardless of bias. How much weaker does a listener’s judgment become when repeatedly asked to make very subtle choices in a listening test? There’s been some prior work tackling some of these questions, but I think this DFA Faders paper opened up a lot of avenues of interesting research.

Female pioneers in audio engineering

The Heyser lecture is a distinguished talk given at each AES Convention by eminent individuals in audio engineering and related fields. At the 140th AES Convention, Rozenn Nicol was the Heyser lecturer. This was well-deserved, and she has made major contributions to the field of immersive audio. But what was shocking about this is that she is the first woman Heyser lecturer. It’s an indicator that women are under-represented and under-recognised in the field. With that in mind, I’d like to highlight some women who have made major contributions to the field, especially in research and innovation.

  • Birgitta Berglund led major research into the impact of noise on communities. Her influential research resulted in guidelines from the World Health Organisation, and greatly advanced our understanding of noise and its effects on society. She was the 2009 IOA Rayleigh medal recipient.
  • Marina Bosi is a past president of the AES. She has been instrumental in the development of standards for audio coding and digital content management, including development of the AC-2, AC-3, and MPEG-2 Advanced Audio Coding technologies.
  • Anne-Marie Bruneau has been one of the most important researchers on electrodynamic loudspeaker design, exploring motion impedance and radiation patterns, as well as establishing some of the main analysis and measurement approaches used today. She co-founded the Laboratoire d’Acoustique de l’Université du Maine, now a leading acoustics research center.
  • Ilene J. Busch-Vishniac is responsible for major advances in the theory and understanding of electret microphones, as well as patenting several new designs. She received the ASA R. Bruce Lindsay Award in 1987, and the Silver Medal in Engineering Acoustics in 2001. President of the ASA 2003-4.
  • Elizabeth (Betsy) Cohen was the first female president of the Audio Engineering Society. She was presented with the AES Fellowship Award in 1995 for contributions to understanding the acoustics and psychoacoustics of sound in rooms. In 2001, she was presented with the AES Citation Award for pioneering the technology enabling collaborative multichannel performance over the broadband internet.
  • Poppy Crum is head scientist at Dolby Laboratories and an adjunct professor at Stanford’s Center for Computer Research in Music and Acoustics. At Dolby, she is responsible for integrating neuroscience and knowledge of sensory perception into algorithm design, technological development, and technology strategy.
  • Delia Derbyshire (1937-2001) was an innovator in electronic music who pushed the boundaries of technology and composition. She is most well-known for her electronic arrangement of the theme for Doctor Who, an important example of Musique Concrète. Each note was individually crafted by cutting, splicing, and stretching or compressing segments of analogue tape which contained recordings of a plucked string, oscillators and white noise. Here’s a video detailing a lot of the effects she used, which have now become popular tools in digital music production.
  • Ann Dowling is the first female president of the Royal Academy of Engineering. Her research focuses on noise analysis and reduction, especially from engines, and she is a leading educator in acoustics. A quick glance at Google Scholar shows how influential her research has been.
  • Marion Downs was an audiometrist at Colorado Medical Center in Denver, who invented the tests used to measure hearing both in newborn babies and in fetuses.
  • Judy Dubno is Director of Hearing Research at the Medical University of South Carolina. Her research focuses on human auditory function, with emphasis on the processing of auditory information and the recognition of speech, and how these abilities change in adverse listening conditions, with age, and with hearing loss. Recipient of the James Jerger Career Award for Research in Audiology from the American Academy of Audiology and Carhart Memorial Lecturer for the American Auditory Society. President of the ASA in 2014-15.
  • Rebecca Fiebrink researches Human Computer Interaction (HCI) and the application of machine learning to real-time, interactive, and creative domains. She is the creator of the popular Wekinator, which allows anyone to use machine learning to build new musical instruments, real-time music information retrieval and audio analysis systems, computer listening systems and more.
  • Katherine Safford Harris pioneered EMG studies of speech production and auditory perception. Her research was fundamental to speech recognition, speech synthesis, reading machines for the blind, and the motor theory of speech perception. She was elected Fellow of the ASA, the AAAS, the American Speech-Language-Hearing Association, and the New York Academy of Sciences. She was President of the ASA (2000-2001), awarded the Silver Medal in 2005 and Gold Medal in 2007.
  • Rhona Hellman was a Fellow of the ASA. She was a distinguished hearing scientist and preeminent expert in auditory perceptual phenomena. Her research spanned almost 50 years, beginning in 1960. She tackled almost every aspect of loudness, and the work resulted in major advances and developments of loudness standards.
  • Mara Helmuth developed software for composition and improvisation involving granular synthesis. Throughout the 1990s, she paved the way forward by exploring and implementing systems for collaborative performance over the Internet. From 2008-10 she was President of the International Computer Music Association.
  • Carleen Hutchins (1911-2009) was a leading researcher in the study of violin acoustics, with over a hundred publications in the field. She was founder and president of the Catgut Society, an organization devoted to the study and appreciation of stringed instruments.
  • Sophie Germain (1776-1831) was a French mathematician, scientist and philosopher. She won a major prize from the French Academy of Sciences for developing a theory to explain the vibration of plates due to sound. The history behind her contribution, and the reactions of leading French mathematicians to having a female of similar calibre in their midst, is fascinating. Joseph Fourier, whose work underpins much of audio signal processing, was a champion of her work.
  • Bronwyn Jones was a psychoacoustician at the CBS Technology Center during the 70s and 80s. In seminal work with co-author Emil Torick, she developed one of the first loudness meters, incorporating both psychoacoustic principles and detailed listening tests. It paved the way for what became major initiatives in loudness measurement, and in some ways outperforms the modern ITU-R BS.1770 standard.
  • Bozena Kostek is editor of the Journal of the Audio Engineering Society. Her most significant contributions include the applications of neural networks, fuzzy logic and rough sets to musical acoustics, and the application of data processing and information retrieval to the psychophysiology of hearing. Her research has garnered dozens of prizes and awards.
  • Daphne Oram (1925-2003) was a pioneer of musique concrète and a central figure in the evolution of electronic music. She devised the Oramics technique for creating electronic sounds, co-founded the BBC Radiophonic Workshop, and was possibly the first woman to direct an electronic music studio, to set up a personal electronic music studio, and to design and construct an electronic musical instrument.
  • Carla Scaletti is an innovator in computer generated music. She designed the Kyma sound generation computer language in 1986 and co-founded Symbolic Sound Corporation in 1989. Kyma is one of the first graphical programming languages for real time digital audio signal processing, a precursor to Max/MSP and PureData, and is still popular today.
  • Bridget Shield was professor of acoustics at London Southbank University. Her research is most significant in our understanding of the effects of noise on children, and has influenced many government initiatives. From 2012-14, she was the first female President of the Institute of Acoustics.
  • Laurie Spiegel created one of the first computer-based music composition programs, Music Mouse: an Intelligent Instrument, which also has some early examples of algorithmic composition and intelligent automation, both of which are hot research topics today.
  • Mary Desiree Waller (1886-1959) wrote a definitive treatise on Chladni figures, which are the shapes and patterns made by surface vibrations due to sound (see Sophie Germain, above). It gave far deeper insight into the figures than any previous work.
  • Megan (or Margaret) Watts-Hughes is the inventor of the Eidophone, an early instrument for visualising the sounds made by your voice. She rediscovered this simple method of generating Chladni figures without knowledge of Sophie Germain or Ernst Chladni’s work. There is a great description of her experiments and analysis in her own words.

The Eidophone, demonstrated by Grace Digney.

Do you know some others who should be mentioned? We’d love to hear your thoughts.

Thanks to Theresa Leonard for information on past AES presidents. She was the third female president.  will be the fourth.

And check out Women in Audio: contributions and challenges in music technology and production for a detailed analysis of the current state of the field.

Acoustic Energy Harvesting

At the recent Audio Engineering Society Convention, one of the most interesting talks was in the E-Briefs sessions. These are usually short presentations dealing with late-breaking research results, work in progress, or engineering reports. The work, presented by Charalampos Papadokos, was an e-brief titled ‘Power Out of Thin Air: Harvesting of Acoustic Energy’.

Ambient energy sources are those all around us, like solar and kinetic energy. Energy harvesting is the capture and storage of ambient energy. It’s not a new concept at all, dating back to the windmill and the waterwheel. Ambient power has been collected from electromagnetic radiation since the invention of crystal radios by Sir Jagadish Chandra Bose, a true renaissance man who made important contributions to many fields. But nowadays, people are looking to harvest energy from many more possible sources, often for powering small devices like wearable electronics and wireless sensor networks. The big advantages, of course, are that energy harvesters do not consume resources like oil or coal, and that energy harvesting might enable some devices to operate almost indefinitely.

But two of the main challenges are that many ambient energy sources are very low power, and that harvesting them can be difficult.

Typical power densities from energy harvesting can vary over orders of magnitude. Here are the power densities for various ambient sources, taken from the Open Access book chapter ‘Electrostatic Conversion for Vibration Energy Harvesting’ by S. Boisseau, G. Despesse and B. Ahmed Seddik.

(Figure: power densities of various ambient energy sources, from Boisseau et al.)

You can see that vibration, which includes acoustic vibrations, has about 1/100th the power density of solar power, or even less. The numbers are arguable, but at first glance it looks like it will be exceedingly difficult to get any significant energy from acoustic sources unless one can harvest over a very large area.

That’s where this e-brief comes in. Papadokos and his co-author, John Mourjopoulos, have a patented approach to harvesting the acoustic energy inside a loudspeaker enclosure. Others had considered harvesting the sound energy from loudspeakers before (see the work of Matsuda, for instance), but mainly just as a way of testing their harvesting approach, and not really exploiting the properties of loudspeakers. Papadokos and Mourjopoulos had the insight to realise that many loudspeakers are enclosed, and the enclosure contains abundant acoustic energy that might be harvested without interfering with the external design or with the sound presented to the listener. In earlier work, they found that sound pressure levels within the enclosure often exceed 130 dB. Here, they simulated the effect of a piezoelectric plate in the enclosure, converting the acoustic energy to electrical energy. Results showed that it might be possible to generate 2.6 volts under regular operating conditions, thus proving the concept of harvesting acoustic energy from loudspeaker enclosures, at least in simulation.
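
A back-of-envelope calculation shows why the enclosure interior is an attractive harvesting site. Treating the field as a plane wave (a simplification for illustration, not the authors’ model), the acoustic intensity at the reported 130 dB SPL works out to roughly 10 W/m², orders of magnitude above typical ambient vibration levels:

```python
import math

P_REF = 20e-6    # reference sound pressure, Pa
RHO_C = 413.0    # characteristic acoustic impedance of air, Pa*s/m (~20 C)

def spl_to_intensity(spl_db):
    """Plane-wave acoustic intensity in W/m^2 for a given SPL in dB."""
    p_rms = P_REF * 10.0 ** (spl_db / 20.0)   # dB SPL back to RMS pressure
    return p_rms ** 2 / RHO_C                 # I = p^2 / (rho * c)

intensity = spl_to_intensity(130.0)   # roughly 10 W/m^2
```

Of course, a real harvester captures only a small fraction of this, over a small plate area, which is why the simulated output is volts rather than watts.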

AES Berlin 2017: Keynotes from the technical program


The 142nd AES Convention was held last month in the creative heart of Berlin. The four-day program drew more than 2000 attendees and covered numerous workshops, tutorials, technical tours and special events, all related to the latest trends and developments in audio research. But as much as scale, it’s attention to detail that makes AES special: there is as much emphasis on the research side of audio as on panels of experts discussing a range of provocative and practical topics.

It can be said that 3D Audio: Recording and Reproduction, Binaural Listening and Audio for VR were the most popular topics among workshops, tutorials, papers and engineering briefs. However, a significant portion of the program was also devoted to staple audio topics such as digital filter design, live audio, loudspeaker design, recording, audio encoding, microphones, and music production techniques, just to name a few.

For this reason, here at the Audio Engineering research team within C4DM, we bring you what we believe were the highlights: the key talks and most relevant topics that took place during the convention.

The future of mastering

What better way to start AES than with a workshop of mastering experts discussing the future of the field? Jonathan Wyner (iZotope) introduced us to the current challenges that the discipline faces: demographic, economic and target-formatting issues that are constantly evolving and changing due to advances in the music technology industry and among its consumers.

When discussing the future of mastering, the panel was reluctant to embrace a fully automated future, but pointed out that the main challenge for assistive tools is to understand artistic intentions and genre-based decisions without the expert knowledge of the mastering engineer. They concluded that research efforts should go towards the development of an intelligent assistant, able to function as a smart preset that provides mastering engineers with a starting point.

Virtual analog modeling of dynamic range compression systems

This paper described a method to digitally model an analogue dynamic range compressor. Based on the analysis of processed and unprocessed audio waveforms, a generic model of dynamic range compression is proposed, and its parameters are derived via iterative optimization.

Audio samples demonstrated the quality of the digital model. However, the parameters of the digital compressor cannot be changed once fitted; making them adjustable would be an interesting direction for future work, as would extending the approach to other effects such as equalizers or delay lines.
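To give a flavour of the parameter-estimation idea (a simplified sketch under our own assumptions, not the paper's actual model or code): given measured input/output level pairs, a static compression curve with an unknown threshold and ratio can be recovered by searching for the parameter pair that best explains the measurements. Here a brute-force grid search stands in for the paper's more sophisticated iterative optimization:

```python
def static_curve_db(level_db, threshold_db, ratio):
    """Static compression curve: output level (dB) for an input level (dB)."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

def fit_compressor(levels_in, levels_out):
    """Grid search for the threshold and ratio that best explain
    measured input/output level pairs (squared-error objective)."""
    best = None
    for threshold in range(-60, 1):             # dB, 1 dB steps
        for ratio_tenths in range(10, 201, 5):  # ratios 1.0 to 20.0, step 0.5
            ratio = ratio_tenths / 10.0
            err = sum((static_curve_db(x, threshold, ratio) - y) ** 2
                      for x, y in zip(levels_in, levels_out))
            if best is None or err < best[0]:
                best = (err, threshold, ratio)
    return best[1], best[2]

# Synthetic measurements from a compressor with threshold -30 dB, ratio 4:1
inputs = [-60, -50, -40, -30, -20, -10, 0]
outputs = [static_curve_db(x, -30, 4.0) for x in inputs]
threshold, ratio = fit_compressor(inputs, outputs)
# recovers threshold == -30 and ratio == 4.0
```

In practice the fit would be over full audio waveforms and would also need to capture attack and release ballistics, which is where the iterative optimization in the paper earns its keep.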

Evaluation of alternative audio mixing interfaces

In the paper ‘Formal Usability Evaluation of Audio Track Widget Graphical Representation for Two-Dimensional Stage Audio Mixing Interface’, an evaluation of different graphical track visualization styles is presented. The multitrack visualizations included text only, various colour conventions for circles containing text or instrument icons, circles whose opacity was mapped to audio features, and a traditional channel strip mixing interface.

Efficiency was tested, and subjects preferred the instrument icons as well as the traditional mixing interface. Taking into account the many proposals for alternative 2D and 3D mixing interfaces, there is still plenty of scope to explore how to build an intuitive, efficient and simple interface capable of replacing the well-known channel strip.

Perceptually motivated filter design with application to loudspeaker-room equalization

This tutorial was based on the engineering brief ‘Quantization Noise of Warped and Parallel Filters Using Floating Point Arithmetic’, which proposes warped parallel filters that aim to match the frequency resolution of the human ear.

Via Matlab, we explored various approaches to this goal, including warped FIR and IIR, Kautz, and fixed-pole parallel filters. This provides a very useful tool for applications such as room EQ, physical modelling synthesis and, perhaps, improving existing intelligent music production systems.
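As a rough illustration of the warping idea (our own sketch, not the brief's code): replacing unit delays with first-order allpass sections warps the frequency axis through the allpass phase response, so uniformly spaced warped frequencies become densely spaced at low physical frequencies, much like the ear. The coefficient λ ≈ 0.7564 is a commonly cited value approximating the Bark scale at 44.1 kHz:

```python
import math

def warp(omega, lam):
    """Allpass warping: maps normalized frequency omega (radians)
    to its warped counterpart via the first-order allpass phase."""
    return omega + 2.0 * math.atan2(lam * math.sin(omega),
                                    1.0 - lam * math.cos(omega))

fs = 44100.0
lam = 0.7564  # approximates the Bark scale at fs = 44.1 kHz

def warped_span(f_lo, f_hi):
    """Width (radians) that a physical frequency band occupies on the warped axis."""
    return warp(2 * math.pi * f_hi / fs, lam) - warp(2 * math.pi * f_lo / fs, lam)

low = warped_span(100, 200)      # a 100 Hz band near 100 Hz
high = warped_span(10000, 10100) # a 100 Hz band near 10 kHz
# low is many times larger than high: the warped axis devotes far more
# resolution to low frequencies, mimicking auditory frequency resolution
```

A warped filter designed on this axis therefore spends its coefficients where the ear is most sensitive, which is what makes the approach attractive for room EQ.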

Source Separation in Action: Demixing the Beatles at the Hollywood Bowl

Abbey Road’s James Clarke presented a great poster on the algorithm used for the remixed, remastered and expanded version of The Beatles’ album Live at the Hollywood Bowl. The method managed to isolate the crowd noise, separating into clean tracks everything that Paul McCartney, John Lennon, Ringo Starr and George Harrison played live in 1964.

The results speak for themselves (audio comparison). Based on Non-negative Matrix Factorization (NMF), this work provides a great research tool for source separation and the reverse-engineering of mixes.

Other papers worth mentioning:

Close Miking Empirical Practice Verification: A Source Separation Approach

Analysis of the Subgrouping Practices of Professional Mix Engineers

New Developments in Listening Test Design

Data-Driven Granular Synthesis

A Study on Audio Signal Processed by “Instant Mastering” Services

The rest of the paper proceedings are available in the AES E-library.

The AES Semantic Audio Conference

Last week saw the Audio Engineering Society's 2017 International Conference on Semantic Audio. Held at the Fraunhofer Institute for Integrated Circuits in Erlangen, Germany, delegates enjoyed a well-organised and high-quality programme, interleaved with social and networking events such as a jazz concert and a visit to Erlangen’s famous beer cellars. The conference was a combined effort of Fraunhofer IIS, Friedrich-Alexander Universität, and their joint venture Audio Labs.

As the topic is of great relevance to our team, Brecht De Man and Adán Benito attended and presented their work there. With 5 papers and a late-breaking demo, the Centre for Digital Music in general was the most strongly represented institution, surpassing even the hosting organisations.


Benito’s intelligent multitrack reverberation architecture

Adán Benito presented “Intelligent Multitrack Reverberation Based on Hinge-Loss Markov Random Fields”, a machine learning approach to automatically applying a reverb effect to musical audio.

Brecht De Man demoed the “Mix Evaluation Browser“, an online interface to access a dataset comprising several mixes of a number of songs, complete with corresponding DAW files, raw tracks, preference ratings, and annotated comments from subjective listening tests.


The Mix Evaluation Browser: an interface to visualise De Man’s dataset of raw tracks, mixes, and subjective evaluation results.

Also from the Centre for Digital Music, Delia Fano Yela delivered a beautifully hand-drawn and compelling presentation on source separation in general, and on how temporal context can be employed to considerably improve vocal extraction.

Rodrigo Schramm and Emmanouil Benetos won the Best Paper award for their paper “Automatic Transcription of a Cappella Recordings from Multiple Singers”.

Emmanouil also presented the paper “Polyphonic Note and Instrument Tracking Using Linear Dynamical Systems”, and coauthored “Assessing the Relevance of Onset Information for Note Tracking in Piano Music Transcription”.


Several other delegates were frequent collaborators or previously affiliated with Queen Mary. The opening keynote was delivered by Mark Plumbley, former director of the Centre for Digital Music, who gave an overview of the field of machine listening, specifically audio event detection and scene recognition. Nick Jillings, formerly a research assistant and master's project student in the Audio Engineering group, and currently a PhD student at Birmingham City University co-supervised by Josh Reiss, head of our Audio Engineering group, presented his paper “Investigating Music Production Using a Semantically Powered Digital Audio Workstation in the Browser” and demoed “Automatic channel routing using musical instrument linked data”.

Other keynotes were delivered by Udo Zölzer, best known for editing the collection “DAFX: Digital Audio Effects”, and Masataka Goto, a household name in the MIR community, who discussed his own web-based implementations of music discovery and visualisation.

Paper proceedings are already available in the AES E-library, free for AES members.