Exciting research at the upcoming Audio Engineering Society Convention


About five months ago, we previewed the last European Audio Engineering Society Convention, which we followed with a wrap-up discussion. The next AES convention is just around the corner, October 18th to 21st in New York. As before, the Audio Engineering research team here aim to be quite active at the convention.

These conventions are quite big, with thousands of attendees, but not so large that you get lost or overwhelmed. Away from the main exhibition hall is the Technical Program, which includes plenty of tutorials and presentations on cutting edge research.

So here, we’ve gathered together some information about a lot of the events that we will be involved in, attending, or we just thought were worth mentioning. And I’ve gotta say, the Technical Program looks amazing.


One of the first events of the Convention is the Diversity Town Hall, which introduces the AES Diversity and Inclusion Committee. I’m a firm supporter of this, and wrote a recent blog entry about female pioneers in audio engineering. The AES aims to be fully inclusive, open and encouraging to all, but that’s not yet fully reflected in its activities and membership. So expect to see some exciting initiatives in this area coming soon.

In the 10:45 to 12:15 poster session, Steve Fenton will present Alternative Weighting Filters for Multi-Track Program Loudness Measurement. We’ve published a couple of papers (Loudness Measurement of Multitrack Audio Content Using Modifications of ITU-R BS.1770, and Partial loudness in multitrack mixing) showing that well-known loudness measures don’t correlate very well with perception when used on individual tracks within a multitrack mix, so it would be interesting to see what Steve and his co-author Hyunkook Lee found out. Perhaps all this research will lead to better loudness models and measures.

At 2 pm, Cleopatra Pike will present a discussion and analysis of Direct and Indirect Listening Test Methods. I’m often sceptical when someone draws strong conclusions from indirect methods like measuring EEGs and reaction times, so I’m curious what this study found and what recommendations they propose.

The 2:15 to 3:45 poster session will feature the work with probably the coolest name, Influence of Audience Noises on the Classical Music Perception on the Example of Anti-cough Candies Unwrapping Noise. And yes, it looks like a rigorous study, using an anechoic chamber to record the sounds of sweets being unwrapped, and the signal analysis is coupled with a survey to identify the most distracting sounds. It reminds me of the DFA faders paper from the last convention.

At 4:30, researchers from Fraunhofer and the Technical University of Ilmenau present Training on the Acoustical Identification of the Listening Position in a Virtual Environment. In a recent paper in the Journal of the AES, we found that training resulted in a huge difference between participant results in a discrimination task, yet listening tests often employ untrained listeners. This suggests that maybe we can hear a lot more than studies suggest; we just don’t know how to listen and what to listen for.


If you were to spend only one day this year immersing yourself in frontier audio engineering research, this is the day to do it.

At 9 am, researchers from Harman will present part 1 of A Statistical Model that Predicts Listeners’ Preference Ratings of In-Ear Headphones. This was a massive study involving 30 headphone models and 71 listeners under carefully controlled conditions. Part 2, on Friday, focuses on development and validation of the model based on the listening tests. I’m looking forward to both, but puzzled as to why they weren’t put back-to-back in the schedule.

At 10 am, researchers from the Tokyo University of the Arts will present Frequency Bands Distribution for Virtual Source Widening in Binaural Synthesis, a technique which seems closely related to work we presented previously on Cross-adaptive Dynamic Spectral Panning.

From 10:45 to 12:15, our own Brecht De Man will be chairing and speaking in a Workshop on ‘New Developments in Listening Test Design.’ He’s quite a leader in this field, and has developed some great software that makes the setup, running and analysis of listening tests much simpler and still rigorous.

In the 11-12:30 poster session, Nick Jillings will present Automatic Masking Reduction in Balance Mixes Using Evolutionary Computing, which deals with a challenging problem in music production, and builds on the large amount of research we’ve done on Automatic Mixing.

At 11:45, researchers from McGill will present work on Simultaneous Audio Capture at Multiple Sample Rates and Formats. This helps address one of the challenges in perceptual evaluation of high resolution audio (and see the open access journal paper on this), ensuring that the same audio is used for different versions of the stimuli, with only variation in formats.

At 1:30, renowned audio researcher John Vanderkooy will present research on how a loudspeaker can be used as the sensor for a high-performance infrasound microphone. In the same session at 2:30, researchers from Plextek will show how consumer headphones can be augmented to automatically perform hearing assessments. Should we expect a new audiometry product from them soon?

At 2 pm, our own Marco Martinez Ramirez will present Analysis and Prediction of the Audio Feature Space when Mixing Raw Recordings into Individual Stems, which applies machine learning to challenging music production problems. Immediately following this, Stephen Roessner discusses a Tempo Analysis of Billboard #1 Songs from 1955–2015, which builds partly on other work analysing hit songs to observe trends in music and production tastes.

At 3:45, there is a short talk on Evolving the Audio Equalizer. Audio equalization is a topic on which we’ve done quite a lot of research (see our review article, and a blog entry on the history of EQ). I’m not sure where the novelty is in the author’s approach though, since dynamic EQ has been around for a while, and there are plenty of harmonic processing tools.

At 4:15, there’s a presentation on Designing Sound and Creating Soundscapes for Still Images, an interesting and unusual bit of sound design.


Judging from the abstract, the short Tutorial on the Audibility of Loudspeaker Distortion at Bass Frequencies at 5:30 looks like it will be an excellent and easy to understand review, covering practice and theory, perception and metrics. In 15 minutes, I suppose it can only give a taster of what’s in the paper.

There’s a great session on perception from 1:30 to 4. At 2, perceptual evaluation expert Nick Zacharov gives a Comparison of Hedonic and Quality Rating Scales for Perceptual Evaluation. I think people often have a favorite evaluation method without knowing if it’s the best one for the test. We briefly looked at pairwise versus multistimuli tests in previous work, but it looks like Nick’s work is far more focused on comparing methodologies.

Immediately after that, researchers from the University of Surrey present Perceptual Evaluation of Source Separation for Remixing Music. Remixing audio via source separation is a hot topic, with lots of applications whenever the original unmixed sources are unavailable. This work will get to the heart of which approaches sound best.

The last talk in the session, at 3:30, is on The Bandwidth of Human Perception and its Implications for Pro Audio. Judging from the abstract, this is a big picture, almost philosophical discussion about what and how we hear, but with some definitive conclusions and proposals that could be useful for psychoacoustics researchers.


Grateful Dead fans will want to check out Bridging Fan Communities and Facilitating Access to Music Archives through Semantic Audio Applications in the 9 to 10:30 poster session, which is all about an application providing wonderful new experiences for interacting with the huge archives of live Grateful Dead performances.

At 11 o’clock, Alessia Milo, a researcher in our team with a background in architecture, will discuss Soundwalk Exploration with a Textile Sonic Map. We discussed her work in a recent blog entry on Aural Fabric.

In the 2 to 3:30 poster session, I really hope there will be a live demonstration accompanying the paper on Acoustic Levitation.

At 3 o’clock, Gopal Mathur will present an Active Acoustic Meta Material Loudspeaker System. Metamaterials are receiving a lot of deserved attention, and such advances in materials are expected to lead to innovative and superior headphones and loudspeakers in the near future.


The full program can be explored on the Convention Calendar or the Convention website. Come say hi to us if you’re there! Josh Reiss (author of this blog entry), Brecht De Man, Marco Martinez and Alessia Milo from the Audio Engineering research team within the Centre for Digital Music will all be there.



Physically Derived Sound Synthesis Model of a Propeller

I recently presented my work on the real-time sound synthesis of a propeller at the 12th International Audio Mostly Conference in London. This sound effect is a continuation of my research into aeroacoustic sounds generated by physical models; an extension of my previous work on the Aeolian harp, sword sounds and Aeolian tones.

A demo video of the propeller model attached to an aircraft object in Unity is given here. I use the Unity Doppler effect, which I have since discovered is not the best and adds a high-pitched artefact, but you’ll get the idea! The propeller physical model was implemented in Pure Data and transferred to Unity using the Heavy compiler.

So, when I was looking for an indication of the different sound sources in a propeller sound I found an excellent paper by JE Marte and DW Kurtz. (A review of aerodynamic noise from propellers, rotors, and lift fans. Jet Propulsion Laboratory, California Institute of Technology, 1970) This paper provides a breakdown of the different sound sources, replicated for you here.

The sounds are split into periodic and broadband groups. In the periodic sounds, there are rotational sounds associated with the forces on the blade, and interaction and distortion effects. The first rotational sounds are the loading sounds. These are associated with the thrust and torque of each propeller blade.

To picture these forces, imagine you are sitting on an aircraft wing, looking down the span, travelling at a fixed speed with uniform air flowing over the aerofoil. From your point of view the wing will have a lift force and a drag force associated with it. Now change the aircraft wing to a propeller blade with a similar profile to an aerofoil, spinning at a set RPM. If you are sitting at a point on the blade, the thrust and torque will be constant at that point.

Now, stepping off the propeller blade and examining the disk of rotation, the thrust and torque forces will appear as pulses at the blade passing frequency. For example, a propeller with 2 blades rotating at 2400 RPM will have a blade passing frequency of 80 Hz. A similar propeller with 4 blades rotating at the same RPM will have a blade passing frequency of 160 Hz.
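The arithmetic behind those two figures can be sketched in a few lines of Python. This is only an illustration, not the Pure Data implementation: the function names are made up for this example, and the list of harmonics relies on the general property that a periodic pulse train has components at integer multiples of its fundamental.

```python
def blade_passing_frequency(rpm: float, num_blades: int) -> float:
    """Blade passing frequency in Hz: revolutions per second times blade count."""
    if rpm < 0 or num_blades < 1:
        raise ValueError("rpm must be non-negative and num_blades at least 1")
    return rpm / 60.0 * num_blades


def pulse_harmonics(rpm: float, num_blades: int, count: int) -> list:
    """First `count` integer multiples of the blade passing frequency, in Hz."""
    fundamental = blade_passing_frequency(rpm, num_blades)
    return [fundamental * k for k in range(1, count + 1)]


if __name__ == "__main__":
    print(blade_passing_frequency(2400, 2))  # 2 blades at 2400 RPM
    print(blade_passing_frequency(2400, 4))  # 4 blades at 2400 RPM
    print(pulse_harmonics(2400, 2, 3))
```

Doubling the number of blades at a fixed RPM doubles the blade passing frequency, which is why the 4-blade example sits an octave above the 2-blade one.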

Thickness noise is the sound generated as the blade moves the air aside when passing. This sound is found to be small when blades are moving below the speed of sound, 343 m/s (known as a speed of Mach 1), and is not considered in our model.

Interaction and distortion effects are associated with helicopter rotors and lift fans. Because these have horizontally rotating blades, an effect called blade slap occurs, where the rotating blade passes through the vortices shed by the previous blade, causing a large slapping sound. Horizontal blades also have AM and FM modulated signals associated with them, as well as other effects. Since we are looking at propellers that spin mostly vertically, we have omitted these effects.

The broadband sounds of the propeller are closely related to the Aeolian tone models I have spoken about previously. The vortex sounds are from the vortex shedding, identical to our sword model. The difference in this case is that a propeller has a set shape which is more like an aerofoil than a cylinder.

In the Aeolian tone paper, published at AES, LA, 2016, it was found that for a cylinder the frequency can be determined by an equation defined by Strouhal. The diameter, frequency and airspeed are related by the Strouhal number, found for a cylinder to be approximately 0.2. In the paper by D Brown and JB Ollerhead (Propeller noise at low tip speeds. Technical report, DTIC Document, 1971), a Strouhal number of 0.85 was found for propellers. This was used in our model, along with the chord length of the propeller instead of the diameter.
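As a rough sketch of that relationship, the standard definition of the Strouhal number is St = f × L / u, where f is the shedding frequency, L is a characteristic length and u is the airspeed, so f = St × u / L. The Python below just rearranges that definition. The function name and the example airspeeds and lengths are assumptions chosen for illustration; they are not values from the model or the papers.

```python
# Strouhal numbers quoted in the text above.
STROUHAL_CYLINDER = 0.2    # approximate value for a cylinder
STROUHAL_PROPELLER = 0.85  # value reported by Brown and Ollerhead


def shedding_frequency(strouhal: float, airspeed: float, length: float) -> float:
    """Vortex shedding frequency in Hz, from f = St * u / L.

    airspeed is in m/s; length is the characteristic length in metres
    (cylinder diameter, or blade chord length for the propeller case).
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return strouhal * airspeed / length


if __name__ == "__main__":
    # Hypothetical example values, not taken from any measured aircraft.
    print(shedding_frequency(STROUHAL_CYLINDER, airspeed=10.0, length=0.01))
    print(shedding_frequency(STROUHAL_PROPELLER, airspeed=100.0, length=0.17))
```

Because the chord length changes along the span of a blade, a calculation like this would be repeated for each blade section of interest rather than once per propeller.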

We also include the wake sound in the Aeolian tone model which is similar to the turbulence sounds. These are only noticeable at high speeds.

The paper by Marte and Kurtz outlines a procedure by Hamilton Standard, a propeller manufacturer, for predicting the far field loading sounds. Along with the RPM, number of blades, distance and azimuth angle, we need the blade diameter and engine power. We first decided which aircraft we were going to model. This was determined by the fact that we wanted to carry out a perceptual test and had a limited number of clips of known aircraft.

We settled on a Hercules C130, Boeing B17 Flying Fortress, Tiger Moth, Yak-52, Cessna 340 and a P51 Mustang. The internet was searched for details like blade size, blade profile (to calculate chord lengths along the span of the blade), engine power, top speed and maximum RPM. This gave enough information for the models to be created in Pure Data and the sound effect to be as realistic as possible.

This enables us to calculate the loading sounds and broadband vortex sounds, adding in a Doppler effect for realism. What was missing is an engine sound – the aeroacoustic sounds will not happen in isolation in our model. To rectify this a model from Andy Farnell’s Designing Sound was modified to act as our engine sound.

A copy of the Pure Data software can be downloaded from this site, https://code.soundsoftware.ac.uk/hg/propeller-model. We performed listening tests on all the models, comparing them with an alternative synthesis model (SMS) and the real recordings we had. The tests highlighted that the real sounds are still the most plausible but our model performed as well as the alternative synthesis method. This is a great result considering the alternative method starts with a real recording of a propeller, analyses it and re-synthesizes it. Our model starts with real world physical parameters like the blade profile, engine power, distance and azimuth angles to produce the sound effect.

An example of the propeller sound effect is mixed into this famous scene from North by Northwest. As you can hear, the effect still has some way to go to be as good as the original, but this physical model is the first step in incorporating fluid dynamics of a propeller into the synthesis process.

From the editor: Check out all Rod’s videos at https://www.youtube.com/channel/UCIB4yxyZcndt06quMulIpsQ

A copy of the paper published at Audio Mostly 2017 can be found here >> Propeller_AuthorsVersion

Sound Effects Taxonomy

At the upcoming International Conference on Digital Audio Effects, Dave Moffat will be presenting recent work on creating a sound effects taxonomy using unsupervised learning. The paper can be found here.

A taxonomy of sound effects is useful for a range of reasons. Sound designers often spend considerable time searching for sound effects. Classically, sound effects are arranged based on some key word tagging, and based on what caused the sound to be created. For example, bacon cooking would have the name “BaconCook”, the tags “Bacon Cook, Sizzle, Open Pan, Food” and be placed in the category “cooking”. However, most sound designers know that the sound of frying bacon can sound very similar to the sound of rain (see this TED talk for more info), but rain is in an entirely different folder, in a different section of the SFx Library.

The approach is to analyse the raw content of the audio files in the sound effects library, and allow a computer to determine which sounds are similar, based on the actual sonic content of the sound sample. As such, the sounds of rain and frying bacon will be placed much closer together, allowing a sound designer to quickly and easily find related sounds.
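To give a feel for what grouping by sonic content means, here is a toy Python sketch. It is not the method from the paper: the three-number feature vectors are invented by hand purely for illustration (they are not measurements of real recordings), and a plain Euclidean nearest-neighbour search stands in for whatever unsupervised learning the paper actually uses.

```python
import math

# Hypothetical, hand-picked feature vectors. These are NOT measured from audio;
# they only mimic the idea that bacon and rain sound alike.
FEATURES = {
    "BaconCook": (0.80, 0.90, 0.20),
    "RainOnRoof": (0.78, 0.92, 0.25),
    "DoorSlam": (0.30, 0.20, 0.95),
    "CarHorn": (0.55, 0.10, 0.40),
}


def most_similar(name: str, features: dict, k: int = 1) -> list:
    """Return the k sounds whose feature vectors lie closest to `name`."""
    target = features[name]
    ranked = sorted(
        (math.dist(target, vector), other)  # math.dist needs Python 3.8+
        for other, vector in features.items()
        if other != name
    )
    return [other for _, other in ranked[:k]]


if __name__ == "__main__":
    print(most_similar("BaconCook", FEATURES))
```

With these made-up numbers the closest match to the bacon sound is the rain sound, even though a tag-based library would file them under unrelated categories.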

Here’s a figure from the paper, comparing the generated taxonomy to the original sound effect library classification scheme.


12th International Audio Mostly Conference, London 2017

by Rod Selfridge & David Moffat. Photos by Beici Liang.

Audio Mostly – Augmented and Participatory Sound and Music Experiences, was held at Queen Mary University of London from 23 to 26 August. The conference brought together a wide variety of audio and music designers, technologists, practitioners and enthusiasts from all over the world.

The opening day of the conference ran in parallel with the Web Audio Conference, also being held at Queen Mary, with sessions open to all delegates. The day opened with a joint keynote from Andy Farnell, the computer scientist and author of the highly influential sound effects book Designing Sound. Andy covered a number of topics and invited audience participation, which grew into a discussion about intellectual property and the pros and cons of doing away with it.

Andy Farnell

The paper session then opened with an interesting talk by Luca Turchet from Queen Mary’s Centre for Digital Music. Luca presented his paper on The Hyper Mandolin, an augmented music instrument allowing real-time control of digital effects and sound generators. The session concluded with the second talk I’ve seen in as many months by Charles Martin. This time Charles presented Deep Models for Ensemble Touch-Screen Improvisation, where an artificial neural network model has been used to implement a live performance and generate the touch gestures of three virtual players.

In the afternoon, I got to present my paper, co-authored by David Moffat and Josh Reiss, on a Physically Derived Sound Synthesis Model of a Propeller. Here I continue the theme of my PhD by applying equations obtained through fluid dynamics research to generate authentic sound synthesis models.

Rod Selfridge

The final session of the day saw Geraint Wiggins, our former Head of School at EECS, Queen Mary, present Callum Goddard’s work on designing Computationally Creative Musical Performance Systems, looking at questions like what makes performance virtuosic and how this can be implemented using the Creative Systems Framework.

The oral sessions continued throughout Thursday. One presentation that I found interesting was by Anna Xambo, titled Turn-Taking and Chatting in Collaborative Music Live Coding. In this research the authors explored collaborative music live coding using the live coding environment and pedagogical tool EarSketch, focusing on the benefits to both performance and education.

Thursday’s Keynote was by Goldsmiths’ Rebecca Fiebrink, who was mentioned in a previous blog, discussing how machine learning can be used to support human creative experiences, aiding human designers for rapid prototyping and refinement of new interactions within sound and media.

Rebecca Fiebrink

The Gala Dinner and Boat Cruise was held on Thursday evening, where all the delegates were taken on a boat up and down the Thames, seeing the sights and enjoying food and drink. Prizes were awarded and appreciation expressed to the excellent volunteers, technical teams, committee members and chairpersons who brought together the event.

Tower Bridge

A session on Sports Augmentation and Health / Safety Monitoring was held on Friday morning, which included a number of excellent talks. The presentation of the conference went to Tim Ryan, who presented his paper on 2K-Reality: An Acoustic Sports Entertainment Augmentation for Pickup Basketball Play Spaces. Tim re-contextualises sounds appropriated from a National Basketball Association (NBA) video game to create interactive sonic experiences for players and spectators. I was lucky enough to have a play around with this system during a coffee break and can easily see how it could give an amazing experience for basketball enthusiasts, young and old, as well as drawing in a crowd to share.

Workshops ran on Friday afternoon. I went to Andy Farnell’s Zero to Hero Pure Data Workshop, where participants managed to go from scratch to having working bass drum, snare and hi-hat synthesis models. Andy managed to illustrate how quickly these could be developed and included in a simple sequencer to give a basic drum machine.

Throughout the conference a number of fixed media pieces and demos were available for delegates to view, as well as poster sessions where authors presented their work.

Alessia Milo

Live music events were held on both Wednesday and Friday. The Web Audio Mostly Concert on Wednesday was a joint event for delegates of Audio Mostly and the Web Audio Conference. This included an augmented reality musical performance, a human-playable robotic zither, the Hyper Mandolin and DJs.

The Audio Mostly Concert on the Friday included a Transmusicking performance from a laptop orchestra from around the world, where 14 different performers collaborated online. The performance was curated by Anna Xambo. Alan Chamberlain and David De Roure performed The Gift of the Algorithm, which was a computer music performance inspired by Ada Lovelace. The wood and the water was an immersive performance of interactivity and gestural control of both a harp and lighting, by Balandino Di Donato and Eleanor Turner. GrainField, by Benjamin Matuszewski and Norbert Schnell, was an interactive audio performance that demanded the involvement of the entire audience for the performance to exist. This collective improvisational piece demonstrated how digital technology can really be used to augment the traditional musical experience. GrainField was awarded the prize for the best musical performance.

Adib Mehrabi

The final day of the conference was a full day’s workshop. I attended the one titled Designing Sounds in the Cloud. The morning was spent presenting two ongoing European Horizon 2020 projects, Audio Commons (www.audiocommons.org/) and Rapid-Mix. The Audio Commons initiative aims to promote the use of open audio content by providing a digital ecosystem that connects content providers and creative end users. The Rapid-Mix project focuses on multimodal and procedural interactions leveraging on rich sensing capabilities, machine learning and embodied ways to interact with sound.

Before lunch we took part in a sound walk around the Queen Mary Mile End Campus, with one of each group blindfolded, informing the other what they could hear. The afternoon session had teams of participants designing and prototyping new ways to use the APIs from each of the two Horizon 2020 projects – very much in the feel of a hackathon. We devised a system which captured expressive Italian hand gestures using the Leap Motion and classified them using machine learning techniques. Then in Pure Data each new classification triggered a sound effect taken from the Freesound website (part of the Audio Commons project). Had time allowed, the project would have been extended to have Pure Data link to the Audio Commons API and play sound effects straight from the web.

Overall, I found the conference informative, yet informal, enjoyable and inclusive. The social events were spectacular and ones that will be remembered by delegates for a long time.

International Congress on Sound and Vibration (ICSV) London 2017

The International Congress on Sound and Vibration (ICSV) may not be the first conference you would think of for publishing the results of research into a sound effect but that’s exactly what we have just returned from. I presented our paper on the Real-Time Physical Model of an Aeolian harp to a worldwide audience of the top researchers in sound and vibration.


The Congress opened with a keynote from Professor Eric Heller discussing acoustic resonance and formants, followed by a whole day of musical acoustics chaired by Professor Murray Campbell from Edinburgh University. One interesting talk was given by Stephen Dance of London South Bank University, describing a hearing study of music students. The results showed that the hearing of the music students improved over the 3 years of their course even though none of the students would wear ear protection while playing. The only degradation of hearing was experienced by oboe players. Possible reasons are the fast attack time of the instrument and the fact that the oboe players were stood directly in front of the brass players when playing as an orchestra.


The opening day also had a talk titled Artificial neural network based model for the crispness impression of the potato chip sounds, by Ercan Altinsoy from Dresden University of Technology. This research looked into the acoustical properties of food and the impression of freshness that was inferred from this.


I presented my research on the Real-time physical model of an aeolian harp, describing the sound synthesis of this unusual musical instrument. The synthesis model captures the interaction between the mechanical vibration properties of each string and the vortices being shed from the wind blowing around them.


The session ended with Application of sinusoidal curves to shape design of chord sound plate and experimental verification by Bor-Tsuen Wang, Department of Mechanical Engineering, National Pingtung University of Science and Technology, Pingtung, Taiwan. This work reviews the design concept of the chord sound plate (CSP), a uniform thickness plate with a special curved shape designed by the Bezier curve (B-curve) method. The CSP can generate a percussion sound with three tone frequencies that consist of the musical note frequencies of a triad chord.


A presentation from Gaku Minorikawa, Hosei University, Department of Mechanical Engineering, Faculty of Science and Engineering, Tokyo, Japan, titled Prediction for noise reduction and characteristics of flow induced noise on axial cooling fan, discussed his research into the reduction of noise from fans – highly relevant to audio engineers who want the quietest possible computers for a studio.


There was an interesting session on the noise experienced in open plan offices and how other noise sources are introduced to apply acoustic masking to certain areas. The presentation by Charles Edgington, Practical considerations and experiences with sound masking’s latest technology, illustrated practical implementations of such masking and the considerations that have to be made.


The testing of a number of water features within an open plan office was presented in Audio-visual preferences of water features used in open-plan offices by Zanyar Abdalrahman from Heriot-Watt University, School of Energy, Geoscience, Infrastructure and Society, Edinburgh. Here a number of water feature constructions were examined.


The difficulty of understanding the speech of the participants in both rooms of a video conference was researched by Charlotte Hauervig-Jørgensen from Technical University of Denmark, in Subjective rating and objective evaluation of the acoustic and indoor climate conditions in video conferencing rooms. Moving away from office acoustics to house construction, I saw a fascinating talk by Francesco D’Alessandro, University of Perugia, titled Straw as an acoustic material. This paper aims at investigating the acoustic properties of straw bale constructions.


One session was dedicated to Sound Field Control and 3D Audio, with a total of 18 papers presented on this topic. Filippo Fazi from University of Southampton presented a paper on A loudspeaker array for 2 people transaural reproduction. This introduced a signal processing approach for performing 2-people transaural reproduction using a combination of 2 single-listener cross-talk cancellation (CTC) beamformers, so that the CTC is maximised at one listener position and the beamformer side-lobes radiate little energy, so as not to affect the other listening position.


Another session running was Thermoacoustics research in a gender-balanced setting. For this session alternate female and male speakers presented their work on thermoacoustics. Francesca Sogaro from Imperial College London presented her work on Sensitivity analysis of thermoacoustic instabilities. Presenting Sensescapes facilitating life quality, Frans Mossberg of The Sound Environment Center at Lund University, Sweden, examined research into what can be done to raise awareness of the significance of sense- and soundscape for health, wellbeing and communication.


The hearing aid is a complex yet common device used to assist those suffering from hearing loss. In the paper Speech quality enhancement in digital hearing aids: an active noise control approach, Somanath Pradhan (Indian Institute of Technology Gandhinagar) attempted to overcome limitations of noise reduction techniques by introducing a reduced complexity integrated active noise cancellation approach, along with noise reduction schemes.


Through a combination of acoustic computer modelling, network protocol, game design and signal processing, the paper Head-tracked auralisations for a dynamic audio experience in virtual reality sceneries proposes a method for bridging acoustic simulations and interactive technologies, i.e. fostering a dynamic acoustic experience for virtual scenes via VR-oriented auralisations. This was presented by Eric Ballesteros, London South Bank University.


The final day also included a number of additional presentations from our co-author, Dr Avital, including Differences in the Non Linear Propagation of Crackle and Screech, and Aerodynamic and Aeroacoustic Re-Design of Low Speed Blade Profile. The conference’s final night concluded with a banquet at the Sheraton Park Lane Hotel in its Grade 2 listed ballroom. The night included a string quartet, awards and Japanese opera singing. Overall this was a conference with a vast number of presentations from a number of different fields.

So you want to write a research paper

The Audio Engineering research team here submit a lot of conference papers. In our internal reviewing and when we review submissions by others, certain things come up again and again. I’ve compiled all this together as some general advice for putting together a research paper for an academic conference, especially in engineering or computer science. Of course, there are always exceptions, and the advice below doesn’t always apply. But it’s worth thinking of this as a checklist to catch errors and issues in an early draft.

Make sure the abstract is self-contained. Don’t assume the person reading the abstract will read the paper, or vice-versa. Avoid acronyms. Be sure to actually say what the results were and what you found out, rather than just saying you applied the techniques and analysed the data that came out.
The abstract is part summary of the paper, and part an advertisement for why someone should read the paper. And keep in mind that far more people read the abstract than read the paper itself.
Make clear what the problem is and why it is important. Why is this paper needed, and what is going to distinguish this paper from the others?
In the last paragraph, outline the structure of the rest of the paper. But make sure that it is specific to the structure of the paper.

Background/state of the art/prior work – this could be a subsection of introduction, text within the introduction, or its own section right after the introduction. What have others done, what is the most closely related work? Don’t just list a lot of references. Have something to say about each reference, and relate them to the paper. If a lot of people have approached the same or similar problems, consider putting the methods into a table, where for each method, you have columns for short description, the reference(s), their various properties and their assumptions. If you think no one has dealt with your topic before, you probably just haven’t looked deep enough 😉 . Regardless, you should still explain what is the closest work, perhaps highlighting how they’ve overlooked your specific problem.

Problem formulation – after describing state of the art, this could be a subsection of introduction, text within the introduction, or its own section. Give a clear and unambiguous statement of the problem, as you define it and as it is addressed herein. The aim here is to be rigorous, and remove any doubt about what you are doing. It also allows other work to be framed in the same way. When appropriate, this is described mathematically, e.g., we define these terms, assume this and that, and we attempt to find an optimal solution to the following equation.

The structure of this, the core of the paper, is highly dependent on the specific work. One good approach is to have quite a lot of figures and tables. Then most of the writing is just explaining and discussing these figures and tables, and the ordering of these should be mostly clear.
A typical ordering is:
Describe the method, giving block diagrams where appropriate
Give any plots that analyse and illustrate the method, but aren’t using the method to produce results that address the problem
Present the results of using your method to address the problem. Keep the interpretation of the results here short, unless a detailed explanation of a result is needed to justify the next result that is presented. If there is lengthy discussion or interpretation, then leave that to a discussion section.

Equations and notation
For most papers in signal processing and related fields, at least a few equations are expected. The aim with equations is always to make the paper more understandable and less ambiguous. So avoid including equations just for the sake of it. Avoid equations if they are just an obvious intermediate step, or if they aren’t really used in any way (e.g. ‘we use the Fourier transform, which by the way, can be given in this equation. Moving on…’). Do use equations if they clear up confusion that would arise when a technical concept is explained with text alone.
Make sure every equation can be fully understood. All terms and notation should be defined, right before or right after they are used in the text. The logic or process of going from one equation to the next should be made clear.
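As a rough illustration of defining every term right next to the equation, here is a sketch in LaTeX. The quantity (the RMS level of a frame of audio, in decibels) and the symbols are assumptions chosen just for this example, not taken from any particular paper.

```latex
The root-mean-square level $L$ of a frame, in decibels relative to an
RMS value of 1, is
\begin{equation}
  L = 20 \log_{10} \sqrt{\frac{1}{N} \sum_{n=0}^{N-1} x[n]^2},
  \label{eq:rms}
\end{equation}
where $x[n]$ is the $n$-th sample of the frame and $N$ is the frame
length in samples.
```

Every symbol that appears in the equation ($L$, $x[n]$, $n$, $N$) is introduced in the sentence immediately before or after it, so the equation can be read on its own.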
Tables and figures
Where possible, these should be somewhat self-contained. So one should be able to look at a figure and understand it without reading the paper. If that isn’t possible, then it should be understood just by looking at the figure and figure caption. If not, then by just looking at the figure, caption and a small amount of text where the figure is described.
Figure captions typically go immediately below figures, but table captions typically go above tables.
Label axes in figures wherever possible, and give units. If units are not appropriate, make clear that an axis is unitless. For any text within a figure, make sure that the font size used is close to the font size of the main text in the paper. Often, if you import a figure from software intended for viewing on a screen (like MATLAB), then the font can appear minuscule when the figure is imported into a paper.
Make sure all figures and tables are numbered and are all referenced, by their number, in the main text. Position them close to where they are first mentioned in the text. Don’t use phrasing that refers to their location, like ‘the figure below’ or ‘the table on the next page’, partly because their location may change in the final version.
Make sure all figures are high quality. Print out the paper before submitting and check that it all looks good, is high resolution, and nicely formatted.
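Some of these checks can be partly automated. The following is a minimal sketch, assuming the paper is written in LaTeX, that figures and tables carry labels starting with fig: or tab:, and that they are cited with \ref, \autoref, \cref or \Cref. The function name check_floats and the sample text are made up for illustration, and the location phrases it looks for are only a small, assumed list.

```python
import re

# Assumption: floats are labelled \label{fig:...} or \label{tab:...}.
LABEL_RE = re.compile(r"\\label\{((?:fig|tab):[^}]+)\}")
# Assumption: floats are cited with one of these reference commands.
REF_RE = re.compile(r"\\(?:ref|autoref|cref|Cref)\{([^}]+)\}")
# A few phrases that point at a location rather than a number.
LOCATION_RE = re.compile(
    r"\b(?:figure|table)\s+(?:below|above|on the next page)\b", re.IGNORECASE
)


def check_floats(tex):
    """Return (unreferenced labels, location-based phrases) for a LaTeX string."""
    labels = LABEL_RE.findall(tex)
    referenced = set()
    for group in REF_RE.findall(tex):
        # A single \cref{fig:a,fig:b} may hold several comma-separated labels.
        referenced.update(name.strip() for name in group.split(","))
    unreferenced = [label for label in labels if label not in referenced]
    location_phrases = [m.group(0) for m in LOCATION_RE.finditer(tex)]
    return unreferenced, location_phrases


# Hypothetical manuscript fragment, for illustration only.
sample = r"""
As shown in Figure~\ref{fig:spectrum}, the response is flat.
The table below lists the parameters.
\begin{figure}\caption{Spectrum.}\label{fig:spectrum}\end{figure}
\begin{table}\caption{Parameters.}\label{tab:params}\end{table}
"""

unreferenced, location_phrases = check_floats(sample)
print(unreferenced)       # labels that are never cited by number
print(location_phrases)   # phrasing that depends on float position
```

A script like this only catches the mechanical problems. It says nothing about whether a figure is readable or high resolution, so printing the paper and looking at it is still needed.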


Discussion/Future work/conclusion
Discussion and future work may be separate sections or part of the conclusion. Discussion is useful if the results need to be interpreted, but is often kept very brief in short papers where the results may speak for themselves.
Future work is not about what the author plans to do next. It’s about research questions that arose or were not addressed, and research directions that are worth pursuing. The answers to these research questions may be pursued by the author or others. Here, you are encouraging others to build on the work in this paper, and suggesting to them the most promising directions and approaches. Future work is usually just a couple of sentences or a couple of paragraphs at the end of the conclusion, unless there is something particularly special about it.
The conclusion should not simply repeat the abstract or summarise the paper, though there may be an element of that. It’s about getting across the main things that the reader should take away and remember. What was found out? What was surprising? What are the main insights that arose? If the research question is straightforward and directly addressed, what was the answer?


The most important criterion for references is to cite wherever it justifies a claim, clarifies a point, identifies where an idea comes from someone else, or helps the reader find pertinent additional material. If you’re dealing with a very niche or underexplored topic, you may wish to give a full review of all existing literature on the subject.
Aim for references to come from high impact, recent peer reviewed journal articles, or as close to that as possible. So for instance, choose a journal over a conference article if you can, but maybe a highly cited conference paper over an obscure journal paper.
Avoid using web site references. If the reference is essentially just a URL, then put that directly in the text or as a footnote, but not as a citation. And no one cares when you accessed the website, so no need to say ‘accessed on [date]’. If it’s a temporary record that may have only been there for a short period of time before the paper submission date, it’s probably not a reliable reference, won’t help the reader and you should probably find an alternative citation.
Check your reference formatting, especially if you use someone else’s reference library or some automatically generated citations. For instance, some citations will have a publisher and a conference name, so it reads as ‘the X Society Conference, published by the X Society’.
Be consistent. So for instance, have all references use author initials, or none of them. Always use journal abbreviations, or never use them. Always include the city of a conference, or never do it. And so on.
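The ‘always or never’ rule lends itself to a simple automated check. The sketch below assumes the references have already been parsed into dictionaries. The field names (key, type, city, volume) and the sample entries are hypothetical, not the schema of any particular reference manager. It reports any field that some references of a given type include and others leave out.

```python
from collections import defaultdict

# Made-up reference records, for illustration only.
references = [
    {"key": "ref_a", "type": "conference", "city": "New York"},
    {"key": "ref_b", "type": "conference", "city": None},
    {"key": "ref_c", "type": "journal", "volume": "65"},
    {"key": "ref_d", "type": "journal", "volume": "62"},
]


def inconsistent_fields(refs):
    """Map (type, field) to keys of entries lacking a field that peers of the same type have."""
    by_type = defaultdict(list)
    for ref in refs:
        by_type[ref["type"]].append(ref)

    problems = {}
    for ref_type, group in by_type.items():
        # Collect every field used by any reference of this type.
        fields = set()
        for ref in group:
            fields.update(ref)
        fields -= {"key", "type"}

        for field in sorted(fields):
            missing = [r["key"] for r in group if r.get(field) in (None, "")]
            # Inconsistent only if some entries have the field and some do not.
            if missing and len(missing) < len(group):
                problems[(ref_type, field)] = missing
    return problems


report = inconsistent_fields(references)
print(report)  # entries to fix, grouped by reference type and field
```

Whether to fix an inconsistency by adding the field everywhere or removing it everywhere is still a judgement call that depends on the conference’s reference style.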

SMC Conference, Espoo, Finland

I have recently returned from the 14th Sound and Music Computing Conference hosted by Aalto University, Espoo, Finland. All 4 days were full of variety and quality, ensuring there was something of interest for all. There were also live performances during an afternoon session and 2 evenings, as well as the banquet on Hanasaari, a small island in Espoo. This provided a friendly framework for all the delegates to interact, making or renewing connections.
The paper presentations were the main content of the programme, with presenters from all over the globe. Papers that stood out for me included Johnty Wang et al – Explorations with Digital Control of MIDI-enabled Pipe Organs, where I heard the movement of an unborn child control the audio output of a pipe organ. I also became aware of the Championship of Standstill, where participants are challenged to stand still while a number of musical pieces are played, in The Musical Influence on People’s Micromotion when Standing Still in Groups.
Does Singing a Low-Pitch Tone Make You Look Angrier? Well, it looked like it in this interesting presentation! A social media music app was presented in Exploring Social Mobile Music with Tiny Touch-Screen Performances, where we can interact with others by layering 5 second clips of sound to create a collaborative mix.
Analysis and synthesis were well represented, with a presentation on Virtual Analog Simulation and Extensions of Plate Reverberation by Silvin Willemsen et al and The Effectiveness of Two Audiovisual Mappings to Control a Concatenative Synthesiser by Augoustinos Tsiros et al. The paper on Virtual Analog Model of the Lockhart Wavefolder explained a method of modelling a West Coast style analogue synthesiser.
Automatic mixing was also represented. Flavio Everardo’s paper, Towards an Automated Multitrack Mixing Tool using Answer Set Programming, cited at least 8 papers from the Intelligent Audio Engineering group at C4DM.
In total 65 papers were presented orally or in the poster sessions with sessions on Music performance analysis and rendering, Music information retrieval, Spatial sound and sonification, Computer music languages and software, Analysis, synthesis and modification of sound, Social interaction, Computer-based music analysis and lastly Automatic systems and interactive performance. All papers are available at http://smc2017.aalto.fi/proceedings.html.
Having been treated to a wide variety of live music, technical papers and meeting colleagues from around the world, it was an added honour to be presented with one of the Best Paper Awards for our paper on Real-Time Physical Model for Synthesis of Sword Sounds. The conference closed with a short presentation from the next host… SMC2018 – Cyprus!