Sound Talking – 3 November at the London Science Museum

On Friday 3 November 2017, Dr Brecht De Man (one of the audio engineering group researchers) and Dr Melissa Dickson are chairing an unusual and wildly interdisciplinary day of talks, tied together by the theme ‘language describing sound, and sound emulating language’.

Despite being part of the Electronic Engineering and Computer Science department, we think about and work around language quite a lot. After all, audio engineering is mostly related to transferring and manipulating (musical, informative, excessive, annoying) sound and therefore we need to understand how it is experienced and described. This is especially evident from projects such as the SAFE plugins, where we collect terms which describe a particular musical signal manipulation, to then determine their connection with the chosen process parameters and measured signal properties. So the relationship between sound and language is actually central to Brecht’s research, as well as of others here.

The aim of this event is to bring together a wide range of high-profile researchers who work on this intersection, from maximally different perspectives. They study the terminology used to discuss sound, the invention of words that capture sonic experience, and the use and manipulation of sound to emulate linguistic descriptions. Talks will address singing voice research, using sound in accessible film for hearing impaired viewers, new music production tools, auditory neuroscience, sounds in literature, the language of artistic direction, and the sounds of the insane asylum. ‘Sounds’ like a fascinating day at the Science Museum!

Register now (the modest fee just covers lunch, breaks, and wine reception) and get to see

  • Maria Chait (head of UCL Auditory Cognitive Neuroscience lab)
  • Jonathan Andrews (on soundscape of the insane asylum)
  • Melissa Dickson (historian of 19th century literature)
  • Mariana Lopez (making film more accessible through sound)
  • David Howard (the singing voice)
  • Brecht De Man (from our group, on understanding the vocabulary of mixing)
  • Mandy Parnell (award winning mastering engineer)
  • Trevor Cox (categorising quotidian sounds)

In addition, there will be a display of cool sound making objects, with a chance to make your own wax cylinder recording, and more!

The full programme including abstracts and biographies can be found on www.semanticaudio.co.uk/events/soundtalking/.

Advertisements

The Mix Evaluation Dataset

Still at the upcoming International Conference on Digital Audio Effects in Edinburgh, 5-8 September, our group’s Brecht De Man will be presenting a paper on his Mix Evaluation Dataset (a pre-release of which can be read here).
It is a collection of mixes and evaluations of these mixes, amassed over the course of his PhD research, that has already been the subject of several studies on best practices and perception of mix engineering processes.
With over 180 mixes of 18 different songs, and evaluations from 150 subjects totalling close to 13k statements (like ‘snare drum too dry’ and ‘good vocal presence’), the dataset is certainly the largest and most diverse of its kind.

Unlike the bulk of previous research in this topic, the data collection methodology presented here has maximally preserved ecological validity by allowing participating mix engineers to use representative, professional tools in their preferred environment. Mild constraints on software, such as the agreement to use the DAW’s native plug-ins, means that mixes can be recreated completely and analysed in depth from the DAW session files, which are also shared.

The listening test experiments offered a unique opportunity for the participating mix engineers to receive anonymous feedback from peers, and helped create a large body of ratings and free-field text comments. Annotation and analysis of these comments further helped understand the relative importance of various music production aspects, as well as correlate perceptual constructs (such as reverberation amount) with objective features.

Proportional representation of processors in subjective comments

An interface to browse the songs, audition the mixes, and dissect the comments is provided at http://c4dm.eecs.qmul.ac.uk/multitrack/MixEvaluation/, from where the audio (insofar the source is licensed under Creative Commons, or copyrighted but available online) and perceptual evaluation data can be downloaded as well.

The Mix Evaluation Dataset browsing interface

The AES Semantic Audio Conference

Last week saw the 2017 International Conference on Semantic Audio by the Audio Engineering Society. Held at Fraunhofer Institute for Integrated Circuits in Erlangen, Germany, delegates enjoyed a well-organised and high-quality programme, interleaved with social and networking events such as a jazz concert and a visit to Erlangen’s famous beer cellars. The conference was a combined effort of Fraunhofer IIS, Friedrich-Alexander Universität, and their joint venture Audio Labs.

As the topic is of great relevance to our team, Brecht De Man and Adán Benito attended and presented their work there. With 5 papers and a late-breaking demo, the Centre for Digital Music in general was the most strongly represented institution, surpassing even the hosting organisations.

benito-reverb

Benito’s intelligent multitrack reverberation architecture

Adán Benito presented “Intelligent Multitrack Reverberation Based on Hinge-Loss Markov Random Fields“,  machine learning approach to automatic application of a reverb effect to musical audio.

Brecht De Man demoed the “Mix Evaluation Browser“, an online interface to access a dataset comprising several mixes of a number of songs, complete with corresponding DAW files, raw tracks, preference ratings, and annotated comments from subjective listening tests.

MixEvaluationBrowser

The Mix Evaluation Browser: an interface to visualise De Man’s dataset of raw tracks, mixes, and subjective evaluation results.

Still from the Centre for Digital Music, Delia Fano Yela delivered a beautifully hand-drawn and compelling presentation about source separation in general and how temporal context can be employed to considerably improve vocal extraction.

Rodrigo Schramm and Emmanouil Benetos won the Best Paper award for their paper “Automatic Transcription of a Cappella recordings from Multiple Singers”.

Emmanouil further presented another paper, “Polyphonic Note and Instrument Tracking Using Linear Dynamical Systems”, and coauthored “Assessing the Relevance of Onset Information for Note Tracking in Piano Music Transcription”.

 

Several other delegates were frequent collaborators or previously affiliated with Queen Mary. The opening keynote was delivered by Mark Plumbley, former director of the Centre for Digital Music, who gave an overview of the field of machine listening, specifically audio event detection and scene recognition. Nick Jillings, formerly research assistant and master project student at the Audio Engineering group, and currently a PhD student at Birmingham City University cosupervised by Josh Reiss, head of our Audio Engineering group, presented his paper “Investigating Music Production Using a Semantically Powered Digital Audio Workstation in the Browser” and demoed “Automatic channel routing using musical instrument linked data”.

Other keynotes were delivered by Udo Zölzer, best known from editing the collection “DAFX: Digital Audio Effects”, and Masataka Goto, a household name in the MIR community who discussed his own web-based implementations of music discovery and visualisation.

Paper proceedings are already available in the AES E-library, free for AES members.

The Benefits of Browser-Based Listening Tests

Listening tests, or subjective evaluation of audio, are an essential tool in almost any form of audio and music related research, from data compression codecs over loudspeaker design to realism of sound effects. Sadly, because of the time and effort required to carefully design a test and convince a sufficient number of participants, it is also quite an expensive process.

The advent of web technologies like the Web Audio API, enabling elaborate audio applications within a web page, offers the opportunity to develop browser-based listening tests which mitigate some of the difficulties associated with perceptual evaluation of audio. Researchers at the Centre for Digital Music and Birmingham City University’s Digital Media Technology Lab have developed the Web Audio Evaluation Tool [1] to facilitate listening test design for any experimenter regardless of their programming experience, operating system, test paradigm, interface layout, and location of their test subjects.

APE interface - Web Audio Evaluation Tool

Web Audio Evaluation Tool: An example single-axis, multiple stimuli listening test with comment fields.

Here we cover some of the reasons why you would want to carry out a listening test in the browser, using the Web Audio Evaluation Tool as a case study.

Remote tests

The first and most obvious reason to use a browser-based listening test platform. If you want to conduct a perceptual evaluation study online, i.e. host a website where participants can take the test, then there are no two ways about it: you need a listening test that works within the browser, i.e. one that is based on HTML and JavaScript.

A downloadable application is rarely an elegant solution, and only the most determined participants will end up taking the test if they can get it to work. A website, however, amounts to a very low-threshold participation.

Pros

  • Low effort A remote test means no booking and setting up of a listening room, showing the participant into the building, …
  • Scales easily If you can conduct the test once, you can conduct it a virtually unlimited number of times, as long as you find the participants. Amazon Turk or similar services could be helpful with this.
  • Different locations/cultures/languages within reach For some types of research, it is necessary to include (a high number of) participants with certain geographical locations, cultural backgrounds and/or native languages. When these are scarce nearby, and you cannot find the time or funds to fly around the world, a remote listening test can be helpful.

Cons

  • Limited programming languages For the implementation of the test, you are basically constrained to using web technologies such as JavaScript. For someone used to using e.g. MATLAB or C++, this can be off-putting. This is one of the reasons we aim to offer a tool that doesn’t involve any coding for most of the use cases.
  • Loss of control A truly remote test means that you are not present to talk to the participant and answer questions, or notice they misunderstand the instructions. You also have little information on their playback system (make and model, how it is set up, …) and you often know less about their background.

Depending on the type of test and research, you may or may not want to go ‘remote’ or ‘local’.
However, it has been shown for certain tasks that there is no significant difference between results from local and remote tests [2,3].

Furthermore, a tool like the Web Audio Evaluation Tool has many safeguards to compensate this loss of control. Examples of these features include

  • Extensive metrics Timestamps corresponding with playback and movement events can be automatically visualised to show when participants auditioned which samples and for how long; when they moved which slider from where to where; and so on.
  • Post-test checks Upon submissions, optional dialogs can remind the participant of certain instructions, e.g. to listen to all fragments; to move all sliders at least once; to rate at least one stimulus below 20% or at exactly 100%; …
  • Audiometric test and calibration of the playback system An optional series of sliders shown at the start of a test, to be set by the participant so that sine waves an octave apart are all equally loud.
  • Survey questions Most relevant background information on the participant’s background and playback system can be captured by well-phrased survey questions, which can be incorporated at the start or end of the test.
The Web Audio Evaluation Tool - an example test interface inspired by the popular MUSHRA standard, typical for the evaluation of audio codecs.

Web Audio Evaluation Tool – an example test interface inspired by the popular MUSHRA standard, typical for the evaluation of audio codecs.


Cross-platform, no third-party software needed


Source: Sewell Support

Listening test interfaces can be hard to design, with many factors to take into account. On top of that it may not always be possible to use your personal machine for (all) your listening tests, even when all your tests are ‘local’.

When your interface requires a third party, proprietary tool like MATLAB or Max to be set up, this can pose a problem as this may not be available where the test is to take place. Furthermore, upgrades to newer versions of this third party software has been known to ‘break’ listening test software, meaning many more hours of updating and patching.

This is a much bigger problem when the test is to take place at different locations, with different computers and potentially different versions of operating systems or other software.

This has been the single most important driving factor behind the development of the Web Audio Evaluation Tool, even for projects where all tests were controlled, i.e. not hosted on a web server, with ‘internet strangers’ as participants, but in a dedicated listening room with known, skilled participants. Because these listening rooms can have very different computers, operating systems, and geographical locations, using a standalone test or a third party application such as MATLAB is often very tedious or even impossible.

In contrast, a browser-based tool typically works on any machine and operating system that supports the browsers it was designed for. In the case of the Web Audio Evaluation Tool, this means Firefox, Chrome, Edge, Safari, … Essentially every browser which supports the Web Audio API.


Multiple machines with centralised results collection

Central server
Source: jaxonraye.com

Another benefit of a browser-based listening test, again regardless of whether your test takes place ‘locally’ or ‘remotely’, is the possibility of easy, centralised collection of results of these tests. Not only is this more elegant than fetching every test result with a USB drive (from any number of computers you are using), but it is also much safer to save the result to your own server straight away. If you are more paranoid (which is encouraged in the case of listening tests), you can then back up this server continually for redundancy.

In the case of the Web Audio Evaluation Tool, you just put the test on a (local or remote) web server, and the results will be stored to this server by default.
Others have put the test on a regular file server (not web server) and run the included Python server emulator script python/pythonServer.py from the test computer. The results are then stored to the file server, which can be your personal machine on the same network.
Intermediate versions of the results are stored as well, so that an outage of the test computer means the results are not lost in the event of a computer crash, a human error or a forgotten dentist appointment. The test can be resumed at any point.

Multiple participants using the Web Audio Evaluation Tool at the same time, at Queen Mary University of London


Leveraging other web technologies


Source: LinkedIn

Finally, any listening test which is essentially a website, can be integrated within other sites or enhanced with any kind of web technologies. We have already seen clever use of YouTube videos as instructions or HTML index pages tracking progression through a series of tests.

The Web Audio Evaluation Tool seeks to facilitate this by providing the optional returnURL attribute, which specifies the page the participant is redirected to upon completion of the test. This page can be anything from a Doodle to schedule the next test session, an Amazon voucher, a reward cat video, to a secret Eventbrite page for a test participant party.


Are there any other benefits to using a browser-based tool for your listening tests? Please let us know!

[1] N. Jillings, B. De Man, D. Moffat and J. D. Reiss, “Web Audio Evaluation Tool: A browser-based listening test environment,” 12th Sound and Music Computing Conference, 2015.

[2] M. Cartwright, B. Pardo, G. Mysore and M. Hoffman, “Fast and Easy Crowdsourced Perceptual Audio Evaluation,” IEEE International Conference on Acoustics, Speech and Signal Processing, 2016.

[3] M. Schoeffler, F.-R. Stöter, H. Bayerlein, B. Edler and J. Herre, “An Experiment about Estimating the Number of Instruments in Polyphonic Music: A Comparison Between Internet and Laboratory Results,” 14th International Society for Music Information Retrieval Conference, 2013.

This post originally appeared in modified form on the Web Audio Evaluation Tool Github wiki