Radical and rigorous research at the upcoming Audio Engineering Society Convention


We previewed the 142nd, 143rd, 144th  and 145th Audio Engineering Society (AES) Conventions, which we also followed with wrap-up discussions. Then we took a break, but now we’re back to preview the 147th AES  convention, October 16 to 19 in New York. As before, the Audio Engineering research team here aim to be quite active at the convention.

We’ve gathered together some information about a lot of the research-oriented events that caught our eye as being unusual, exceptionally high quality, involved in, attending, or just worth mentioning. And this Convention will certainly live up to the hype.

Wednesday October 16th

When I first read the title of the paper ‘Evaluation of Multichannel Audio in Automobiles versus Mobile Phones‘, presented at 10:30, I thought it was a comparison of multichannel automotive audio versus the tinny, quiet mono or barely stereo from a phone. But its actually comparing results of a listening test for stereo vs multichannel in a car, with results of a listening test for stereo vs multichannel for the same audio, but from a phone and rendered over headphones. And the results look quite interesting.

Deep neural networks are all the rage. We’ve been using DNNs to profile a wide variety of audio effects. Scott Hawley will be presenting some impressive related work at 9:30, ‘Profiling Audio Compressors with Deep Neural Networks.’

We previously presented work on digital filters that closely match their analog equivalents. We pointed out that such filters can have cut-off frequencies beyond Nyquist, but did not explore that aspect. ‘Digital Parametric Filters Beyond Nyquist Frequency‘, at 10 am, investigates this idea in depth.

I like a bit of high quality mathematical theory, and that’s what you get in Tamara Smyth’s 11:30 paper ‘On the Similarity between Feedback/Loopback Amplitude and Frequency Modulation‘, which shows a rather surprising (at least at first glance) equivalence between two types of feedback modulation.

There’s an interesting paper at 2pm, ‘What’s Old Is New Again: Using a Physical Scale Model Echo Chamber as a Real-Time Reverberator‘, where reverb is simulated not with impulse response recordings, or classic algorithms, but using scaled models of echo chambers.

At 4 o’clock, ‘A Comparison of Test Methodologies to Personalize Headphone Sound Quality‘ promises to offer great insights not just for headphones, but into subjective evaluation of audio in general.

There’s so many deep learning papers, but the 3-4:30 poster ‘Modal Representations for Audio Deep Learning‘ stands out from the pack. Deep learning for audio most often works with raw spectrogram data. But this work proposes learning modal filterbank coefficients directly, and they find it gives strong results for classification and generative tasks. Also in that session, ‘Analysis of the Sound Emitted by Honey Bees in a Beehive‘ promises to be an interesting and unusual piece of work. We talked about their preliminary results in a previous entry, but now they’ve used some rigorous audio analysis to make deep and meaningful conclusions about bee behaviour.

Immerse yourself in the world of virtual and augmented reality audio technology today, with some amazing workshops, like Music Production in VR and AR, Interactive AR Audio Using Spark, Music Production in Immersive Formats, ISSP: Immersive Sound System Panning, and Real-time Mixing and Monitoring Best Practices for Virtual, Mixed, and Augmented Reality. See the Calendar for full details.

Thursday, October 17th

An Automated Approach to the Application of Reverberation‘, at 9:30, is the first of several papers from our team, and essentially does something to algorithmic reverb similar to what “Parameter Automation in a Dynamic Range Compressor” did for a dynamic range compressor.

Why do public address (PA) systems sound for large venues sound so terrible? They actually have regulations for speech intelligibility. But this is only measured in empty stadiums. At 11 am, ‘The Effects of Spectators on the Speech Intelligibility Performance of Sound Systems in Stadia and Other Large Venues‘ looks at the real world challenges when the venue is occupied.

Two highlights of the 9-10:30 poster session, ‘Analyzing Loudness Aspects of 4.2 Million Musical Albums in Search of an Optimal Loudness Target for Music Streaming‘ is interesting, not just for the results, applications and research questions, but also for the fact that involved 4.2 million albums. Wow! And there’s a lot more to audio engineering research than what one might think. How about using acoustic sensors to enhance autonomous driving systems, which is a core application of ‘Audio Data Augmentation for Road Objects Classification‘.

Audio forensics is a fascinating world, where audio engineering is often applied to unusually but crucially. One such situation is explored at 2:15 in ‘Forensic Comparison of Simultaneous Recordings of Gunshots at a Crime Scene‘, which involves looking at several high profile, real world examples.

Friday, October 18th

There are two papers looking at new interfaces for virtual reality and immersive audio mixing, ‘Physical Controllers vs. Hand-and-Gesture Tracking: Control Scheme Evaluation for VR Audio Mixing‘ at 10:30, and ‘Exploratory Research into the Suitability of Various 3D Input Devices for an Immersive Mixing Task‘ at 3:15.

At 9:15, J. T. Colonel from our group looks into the features that relate, or don’t relate, to preference for multitrack mixes in ‘Exploring Preference for Multitrack Mixes Using Statistical Analysis of MIR and Textual Features‘, with some interesting results that invalidate some previous research. But don’t let negative results discourage ambitious approaches to intelligent mixing systems, like Dave Moffat’s (also from here) ‘Machine Learning Multitrack Gain Mixing of Drums‘, which follows at 9:30.

Continuing this theme of mixing analysis and automation is the poster ‘A Case Study of Cultural Influences on Mixing Preference—Targeting Japanese Acoustic Major Students‘, shown from 3:30-5, which does a bit of meta-analysis by merging their data with that of other studies.

Just below, I mention the need for multitrack audio data sets. Closely related, and also much needed, is this work on ‘A Dataset of High-Quality Object-Based Productions‘, also in the 3:30-5 poster session.

Saturday, October 19th

We’re approaching a world where almost every surface can be a visual display. Imagine if every surface could be a loudspeaker too. Such is the potential of metamaterials, discussed in ‘Acoustic Metamaterial in Loudspeaker Systems Design‘ at 10:45.

Another session, 9 to 11:30 has lots of interesting presentations about music production best practices. At 9, Amandine Pras presents ‘Production Processes of Pop Music Arrangers in Bamako, Mali‘. I doubt there will be many people at the convention who’ve thought about how production is done there, but I’m sure there will be lots of fascinating insights. This is followed at 9:30 by ‘Towards a Pedagogy of Multitrack Audio Resources for Sound Recording Education‘. We’ve published a few papers on multitrack audio collections, sorely needed for researchers and educators, so its good to see more advances.

I always appreciate filling the gaps in my knowledge. And though I know a lot about sound enhancement, I’ve never dived into how its done and how effective it is in soundbars, now widely used in home entertainment. So I’m looking forward to the poster ‘A Qualitative Investigation of Soundbar Theory‘, shown 10:30-12. From the title and abstract though, this feels like it might work better as an oral presentation. Also in that session, the poster ‘Sound Design and Reproduction Techniques for Co-Located Narrative VR Experiences‘ deserves special mention, since it won the Convention’s Best Peer-Reviewed Paper Award, and promises to be an important contribution to the growing field of immersive audio.

Its wonderful to see research make it into ‘product’, and ‘Casualty Accessible and Enhanced (A&E) Audio: Trialling Object-Based Accessible TV Audio‘, presented at 3:45, is a great example. Here, new technology to enhance broadcast audio for those with hearing loss iwas trialed for a popular BBC drama, Casualty. This is of extra interest to me since one of the researchers here, Angeliki Mourgela, does related research, also in collaboration with BBC. And one of my neighbours is an actress who appears on that TV show.

I encourage the project students working with me to aim for publishable research. Jorge Zuniga’s ‘Realistic Procedural Sound Synthesis of Bird Song Using Particle Swarm Optimization‘, presented at 2:30, is a stellar example. He created a machine learning system that uses bird sound recordings to find settings for a procedural audio model. Its a great improvement over other methods, and opens up a whole field of machine learning applied to sound synthesis.

At 3 o’clock in the same session is another paper from our team, Angeliki Mourgela presenting ‘Perceptually Motivated Hearing Loss Simulation for Audio Mixing Reference‘. Roughly 1 in 6 people suffer from some form of hearing loss, yet amazingly, sound engineers don’t know what the content will sound like to them. Wouldn’t it be great if the engineer could quickly audition any content as it would sound to hearing impaired listeners? That’s the aim of this research.

About three years ago, I published a meta-analysis on perception of high resolution audio, which received considerable attention. But almost all prior studies dealt with music content, and there are good reasons to consider more controlled stimuli too (noise, tones, etc). The poster ‘Discrimination of High-Resolution Audio without Music‘ does just that. Similarly, perceptual aspects of dynamic range compression is an oft debated topic, for which we have performed listening tests, and this is rigorously investigated in ‘Just Noticeable Difference for Dynamic Range Compression via “Limiting” of a Stereophonic Mix‘. Both posters are in the 3-4:30 session.

The full program can be explored on the Convention Calendar or the Convention website. Come say hi to us if you’re there! Josh Reiss (author of this blog entry), J. T. Colonel, Angeliki Mourgela and Dave Moffat from the Audio Engineering research team within the Centre for Digital Music, will all be there.


Back end developer needed for sound synthesis start-up

logo black on white

FXive (fxive.com) is a real-time sound effect synthesis framework in the browser, spun-out from research developed at Queen Mary University of London by the team behind this blog. It is currently front-end only. We’d like to subcontract a backend developer to implement:

  • Sign-up, log-in and subscription system
  • Payment system for subscription, which offers unlimited sound downloads, and purchasing sounds individually

Additional functionalities can be discussed.

If you’re interested or know a web developer that might be interested, please get in touch with us from fxiveteam@gmail.com .

You can check out some of the sound effect synthesis models used in FXive in previous blog entries.


@c4dm @QMUL #backend #webdeveloper #nodejs


Cool sound design and audio effects projects

Every year, I teach two classes (modules), Sound Design and Digital Audio Effects. In both classes, the final assignment involves creating an original work that involves audio programming and using concepts taught in class. But the students also have a lot of free reign to experiment and explore their own ideas. Last year, I had a well received blog entry about the projects.

The results are always great. Lots of really cool ideas, many of which could lead to a publication, or would be great to listen to regardless of the fact that it was an assignment. Here’s a few of the projects this year.

From the Sound Design class;

  • A truly novel abstract sound synthesis (amplitude and frequency modulation) where parameters are controlled by pitch recognition and face recognition machine learning models, using the microphone and the webcam. Users could use their voice and move their face around to affect the sound.
  • An impressive one had six sound models: rain, bouncing ball, sea waves, fire, wind and explosions. It also had a website where each synthesised sound could be compared against real recordings. We couldn’t always tell which was real and which was synthesised!


  • An auditory model of a London Underground train, from the perspective of a passenger on a train, or waiting at a platform. It had a great animation.


  • Two projects involved creating interactive soundscapes auralising an image. One involved a famous photo taken by the photographer, Gregory Crewdson. encapsulating  a dark side of suburban America through surreal, cinematic imagery. The other was an estate area, where there are no bodies visible , giving the impression of an eerie atmosphere where background noises and small sounds are given prominence.

And from the Digital Audio Effects class;

  • A create-your-own distortion effect, where the user can interactively modify the wave shaping curve.
  • Input dependant modulation signal based on the physical mass/ spring system
  • A Swedish death metal guitar effect combining lots of effects for a very distinctive sound
  • A very creative all-in-one audio toy, ‘Ring delay’. This  augmented ping-pong delay effect gives controls over the panning of the delays, the equalization of the audio input and delays, and the output gain. Delays can be played backwards, and the output can be set out-of-phase. Finally, a ring modulator can modulate the audio input to create new sounds to be delayed.
  • Chordify, which transforms an incoming signal, ideally individual notes, into a chord of three different pitches.


  • An audio effects chain inspired by interning at a local radio station. The student helped the owner produce tracks using effects chain presets. But this producers understanding of compressors, EQ, distortion effects… was fairly limited. So the student recreated one of the effects chains into a plugin that only has two adjustable parameters which control multiple parameters inside. 
  • Old Styler, a plug-in that applies sort of a ‘vintage’ effect so that it sounds like from an old radio or an old, black and white movie. Here’s how it sounds.
  • There were some advanced reverbs, including a VST implementation of a state-of-the-art reverberation algorithm known as a Scattering Delay Network (SDN), and a Church reverb incorporating some additional effects to get that ‘church sound’ just right.
  • A pretty amazing cave simulator, with both reverb and random water droplet sounds as part of the VST plug-in.


  • A bit crusher, which also had noise, downsampling and filtering to allow lots of ways to degrade the signal.
  • A VST implementation of the Euclidian Algorithm for world rhythms as described by Goddfried Toussaint in his paper The Euclidean Algorithm Generates Traditional Musical Rhythms.
  • A mid/side processor, with excellent analysis to verify that the student got the implementation just right.
  • Multi-functional distortion pedal. Guitarists often compose music in their bedroom and would benefit from having an effect to facilitate filling the song with a range of sounds, traditionally belonging to other instruments. That’s what this plug-in did, using a lot of clever tricks to widen the soundstage of the guitar.
  • Related to the multi-functional distortion, two students created multiband distortion effects.
  • A Python project that separates a track into harmonic, percussive, and residual components which can be adjusted individually.
  • An effect that attempts to resynthesise any audio input with sine wave oscillators that take their frequencies from the well-tempered scale. This goes far beyond auto-tune, yet can be quite subtle.
  • A source separator plug-in based on Dan Barry’s ADRESS algorithm, described here and here. Along with Mikel Gainza, Dan Barry cofounded the company Sonic Ladder, which released the successful software Riffstation, based on their research.

There were many other interesting assignments, including several variations on tape emulation. But this selection really shows both the talent of the students and the possibilities to create new and interesting sounds.

Aeroacoustic Sound Effects – Journal Article

I am delighted to be able to announce that my article on Creating Real-Time Aeroacoustic Sound Effects Using Physically Informed Models is in this months Journal of the Audio Engineering Society. This is an invited article following winning the best paper award at the Audio Engineering Society 141st Convention in LA. It is an open access article so free for all to download!

The article extends the original paper by examining how the Aeolian tone synthesis models can be used to create a number of sound effects. The benefits of these models are that the produce plausible sound effects which operate in real-time. Users are presented with a number of highly relevant parameters to control the effects which can be mapped directly to 3D models within game engines.

The basics of the Aeolian tone were given in a previous blog post. To summarise, a tone is generated when air passes around an object and vortices are shed behind it. Fluid dynamic equations are available which allow a prediction of the tone frequency based on the physics of the interaction between the air and object. The Aeolian tone is modelled as a compact sound source.

To model a sword or similar object a number of these compact sound sources are placed in a row. A previous blog post describes this in more detail. The majority of compact sound sources are placed at the tip as this is where the airspeed is greatest and the greatest sound is generated.

The behaviour of a sword when being swung has to be modelled which then used to control some of the parameters in the equations. This behaviour can be controlled by a game engine making fully integrated procedural audio models.

The sword model was extended to include objects like a baseball bat and golf club, as well as a broom handle. The compact sound source of a cavity tone was also added in to replicate swords which have grooved profiles. Subjective evaluation gave excellent results, especially for thicker objects which were perceived as plausible as pre-recorded samples.

The synthesis model could be extended to look at a range of sword cross sections as well as any influence of the material of the sword. It is envisaged that other sporting equipment which swing or fly through the air could be modelled using compact sound sources.

A propeller sound is one which is common in games and film and partially based on the sounds generated from the Aeolian tone and vortex shedding. As a blade passes through the air vortices are shed at a specific frequency along the length. To model individual propeller blades the profiles of a number were obtained with specific span length (centre to tip) and chord lengths (leading edge to trailing edge).

Another major sound source is the loading sounds generated by the torque and thrust. A procedure for modelling these sounds is outlined in the article. Missing from the propeller model is distortion sounds. These are more associated with rotors which turn in the horizontal plane.

An important sound when hearing a propeller powered aircraft is the engine sound. The one taken for this model was based on one of Andy Farnell’s from his book Designing Sound. Once complete a user is able to select an aircraft from a pre-programmed bank and set the flight path. If linked to a game engine the physical dimensions and flight paths can all be controlled procedurally.

Listening tests indicate that the synthesis model was as plausible as an alternative method but still not as plausible as pre-recorded samples. It is believed that results may have been more favourable if modelling electric-powered drones and aircraft which do not have the sound of a combustion engine.

The final model exploring the use of the Aeolian tone was that of an Aeolian Harp. This is a musical instrument that is activated by wind blowing around the strings. The vortices that are shed behind the string can activate a mechanical vibration if they are around the frequency of one of the strings natural harmonics. This produces a distinctive sound.

The digital model allows a user to synthesis a harp of up to 13 strings. Tension, mass density, length and diameter can all be adjusted to replicate a wide variety of string material and harp size. Users can also control a wind model modified from one presented in Andy Farnell’s book Designing Sound, with control over the amount of gusts. Listening tests indicate that the sound is not as plausible as pre-recorded ones but is as plausible as alternative synthesis methods.

The article describes the design processes in more detail as well as the fluid dynamic principles each was developed from. All models developed are open source and implemented in pure data. Links to these are in the paper as well as my previous publications. Demo videos can be found on YouTube.

Weird and wonderful research to be unveiled at the 144th Audio Engineering Society Convention


Last year, we previewed the142nd and 143rd AES Conventions, which we followed with a wrap-up discussions here and here. The next AES  convention is just around the corner, May 23 to 26 in Milan. As before, the Audio Engineering research team here aim to be quite active at the convention.

These conventions have thousands of attendees, but aren’t so large that you get lost or overwhelmed. Away from the main exhibition hall is the Technical Program, which includes plenty of tutorials and presentations on cutting edge research.

So we’ve gathered together some information about a lot of the events that caught our eye as being unusual, exceptionally high quality involved in, attending, or just worth mentioning. And this Convention will certainly live up to the hype.

Wednesday May 23rd

From 11:15 to 12:45 that day, there’s an interesting poster by a team of researchers from the University of Limerick titled Can Visual Priming Affect the Perceived Sound Quality of a Voice Signal in Voice over Internet Protocol (VoIP) Applications? This builds on work we discussed in a previous blog entry, where they did a perceptual study of DFA Faders, looking at how people’s perception of mixing changes when the sound engineer only pretends to make an adjustment.

As expected given the location, there’s lots of great work being presented by Italian researchers. The first one that caught my eye is the 2:30-4 poster on Active noise control for snoring reduction. Whether you’re a loud snorer, sleep next to someone who is a loud snorer or just interested in unusual applications of audio signal processing, this one is worth checking out.

Do you get annoyed sometimes when driving and the road surface changes to something really noisy? Surely someone should do a study and find out which roads are noisiest so that then we can put a bit of effort into better road design and better in-vehicle equalisation and noise reduction? Well, now its finally happened with this paper in the same session on Deep Neural Networks for Road Surface Roughness Classification from Acoustic Signals.

Thursday, May 24

If you were to spend only one day this year immersing yourself in frontier audio engineering research, this is the day to do it.

How do people mix music differently in different countries? And do people perceive the mixes differently based on their different cultural backgrounds? These are the sorts of questions our research team here have been asking. Find out more in this 9:30 presentation by Amandine Pras. She led this Case Study of Cultural Influences on Mixing Practices, in collaboration with Brecht De Man (now with Birmingham City University) and myself.

Rod Selfridge has been blazing new trails in sound synthesis and procedural audio. He won the Best Student Paper Award at AES 141st Convention and the Best Paper Award at Sound and Music Computing. He’ll give another great presentation at noon on Physically Derived Synthesis Model of an Edge Tone which was also discussed in a recent blog entry.

I love the title of this next paper, Miniaturized Noise Generation System—A Simulation of a Simulation, which will be presented at 2:30pm by researchers from Intel Technology in Gdansk, Poland. This idea of a meta-simulation is not as uncommon as you might think; we do digital emulation of old analogue synthesizers, and I’ve seen papers on numerical models of Foley rain sound generators.

A highlight for our team here is our 2:45 pm presentation, FXive: A Web Platform for Procedural Sound Synthesis. We’ll be unveiling a disruptive innovation for sound design, FXive.com, aimed at replacing reliance on sound effect libraries. Please come check it out, and get in touch with the presenters or any members of the team to find out more.

Immediately following this is a presentation which asks Can Algorithms Replace a Sound Engineer? This is a question the research team here have also investigated a lot, you could even say it was the main focus of our research for several years. The team behind this presentation are asking it in relation to Auto-EQ. I’m sure it will be interesting, and I hope they reference a few of our papers on the subject.

From 9-10:30, I will chair a Workshop on The State of the Art in Sound Synthesis and Procedural Audio, featuring the world’s experts on the subject. Outside of speech and possibly music, sound synthesis is still in its infancy, but its destined to change the world of sound design in the near future. Find out why.

12:15 — 13:45 is a workshop related to machine learning in audio (a subject that is sometimes called Machine Listening), Deep Learning for Audio Applications. Deep learning can be quite a technical subject, and there’s a lot of hype around it. So a Workshop on the subject is a good way to get a feel for it. See below for another machine listening related workshop on Friday.

The Heyser Lecture, named after Richard Heyser (we discussed some of his work in a previous entry), is a prestigious evening talk given by one of the eminent individuals in the field. This one will be presented by Malcolm Hawksford. , a man who has had major impact on research in audio engineering for decades.


The 9:30 — 11 poster session features some unusual but very interesting research. A talented team of researchers from Ancona will present A Preliminary Study of Sounds Emitted by Honey Bees in a Beehive.

Intense solar activity in March 2012 caused some amazing solar storms here on Earth. Researchers in Finland recorded them, and some very unusual results will be presented in the same session with the poster titled Analysis of Reports and Crackling Sounds with Associated Magnetic Field Disturbances Recorded during a Geomagnetic Storm on March 7, 2012 in Southern Finland.

You’ve been living in a cave if you haven’t noticed the recent proliferation of smart devices, especially in the audio field. But what makes them tick, is there a common framework and how are they tested? Find out more at 10:45 when researchers from Audio Precision will present The Anatomy, Physiology, and Diagnostics of Smart Audio Devices.

From 3 to 4:30, there’s a Workshop on Artificial Intelligence in Your Audio. It follows on from a highly successful workshop we did on the subject at the last Convention.


A couple of weeks ago, John Flynn wrote an excellent blog entry describing his paper on Improving the Frequency Response Magnitude and Phase of Analogue-Matched Digital Filters. His work is a true advance on the state of the art, providing digital filters with closer matches to their analogue counterparts than any previous approaches. The full details will be unveiled in his presentation at 10:30.

If you haven’t seen Mariana Lopez presenting research, you’re missing out. Her enthusiasm for the subject is infectious, and she has a wonderful ability to convey the technical details, their deeper meanings and their importance to any audience. See her one hour tutorial on Hearing the Past: Using Acoustic Measurement Techniques and Computer Models to Study Heritage Sites, starting at 9:15.

The full program can be explored on the Convention Calendar or the Convention website. Come say hi to us if you’re there! Josh Reiss (author of this blog entry), John Flynn, Parham Bahadoran and Adan Benito from the Audio Engineering research team within the Centre for Digital Music, along with two recent graduates Brecht De Man and Rod Selfridge, will all be there.

The edgiest tone yet…

As my PhD is coming to an end and the writing phase is getting more intense, it seemed about time I described the last of the aeroacoustic sounds I have implemented as a sound effect model. May 24th at the 144th Audio Engineering Society Convention in Milan, I will present ‘Physically Derived Synthesis Model of an Edge Tone.’
The edge tone is the sound created when a planar jet of air strikes an edge or wedge. The edge tone is probably most often seen as means of excitation for flue instruments. These instruments are ones like a recorder, piccolo, flute and pipe organ. For example, in a recorder air is blown by the mouth through a mouthpiece into a planar jet and then onto a wedge. The forces generated couple with the tube body of the recorder and a tone based on the dimension of the tube is generated.


Mouthpiece of a recorder


The edge tone model I have developed is viewed in isolation rather than coupled to a resonator as in the musical instruments example. While researching the edge tone it seemed clear to me that this tone has not had the same attention as the Aeolian tone I have previously modelled (here) but a volume of research and data was available to help understand and develop this model.

How does the edge tone work?

The most important process in generating the edge tone is the set up of a feedback loop from the nozzle exit to the wedge. This is similar to the process that generates the cavity tone which I discussed here. The diagram below will help with the explanation.


Illustration of jet of air striking a wedge


The air comes out of the nozzle and travels towards the wedge. A jet of air naturally has some instabilities which are magnified as the jet travels and reaches the wedge. At the wedge, vortices are shed on opposite sides of the wedge and an oscillating pressure pulse is generated. The pressure pulse travels back towards the nozzle and re-enforces the instabilities. At the correct frequency (wavelength) a feedback loop is created and a strong discrete tone can be heard.



To make the edge tone more complicated, if the air speed is varied or the distance between the nozzle exit to the wedge is varies, different modes exist. The values at which the modes change also exhibit hysteresis – the mode changes up and down do not occur at the same airspeed or distance.

Creating a synthesis model

There are a number of equations defined by researchers from the fluid dynamics field, each unique but depend on an integer mode number. Nowhere in my search did I find a method of predicting the mode number. Unlike previous modelling approaches, I decided to collate all the results I had where the mode number was given, both wind tunnel measurements and computational simulations. These were then input to the Weka machine learning workbench and a decision tree was devised. This was then implemented to predict the mode number.


All the prediction equations had a significant error compared to the measured and simulated results so again the results were used to create a new equation to predict the frequency for each mode.


With the mode predicted and the subsequent frequency predicted, the actual sound synthesis was generated by noise shaping with a white noise source and a bandpass filter. The Q value for the filter was unknown but, as with the cavity tone, it is known that the more turbulent the flow the smaller and more diffuse the vortices and the wider the band of frequencies around the predicted edge tone is. The Q value for the bandpass was set to be proportional to this.

And what next…?

Unlike the Aeolian tone where I was able to create a number of sound effects, the edge tone has not yet been implemented into a wider model. This is due to time rather than anything else. One area of further development which would be of great interest would be to couple the edge tone model to a resonator to emulate a musical instrument. Some previous synthesis models use a white noise source and an excitation or a signal based on the residual between an actual sample and the model of the resonator.


Once a standing wave has been established in the resonator, the edge tone locks in at that frequency rather than the one predicted in the equation. So the predicted edge tone may only be present while a musical note is in the transient state but it is known that this has a strong influence over the timbre and may have interesting results.


For an analysis of whistles and how their design affects their sound check out his article. The feedback mechanism described for the edge tone also very similar to the one that generates the hole tone. This is the discrete tone that is generated by a boiling kettle. This is usually a circular jet striking a plate with a circular hole and a feedback loop established.


Hole tone form a kettle


A very similar tone can be generated by a vertical take-off and landing vehicle when the jets from the lift fans are pointing down to the ground or deck. These are both areas for future development and where interesting sound effects could be made.


Vertical take-off of a Harrier jet


Sound Synthesis – Are we there yet?

TL;DR. Yes

At the beginning of my PhD, I began to read the sound effect synthesis literature, and I quickly discovered that there was little to no standardisation or consistency in evaluation of sound effect synthesis models – particularly in relations to the sounds they produce. Surely one of the most important aspects of a synthetic system, is whether it can artifically produce a convincing replacement for what it is intended to synthesize. We could have the most intractable and relatable sound model in the world, but if it does not sound anything like it is intended to, then will any sound designers or end users ever use it?

There are many different methods for measuring how effective a sound synthesis model is. Jaffe proposed evaluating synthesis techniques for music based on ten criteria. However, only two of the ten criteria actually consider any sounds made by the synthesiser.

This is crazy! How can anyone know what synthesis method can produce a convincingly realistic sound?

So, we performed a formal evaluation study, where a range of different synthesis techniques where compared in a range of different situations. Some synthesis techniques are indistinguishable from a recorded sample, in a fixed medium environment. In short – Yes, we are there yet. There are sound synthesis methods that sound more realistic than high quality recorded samples. But there is clearly so much more work to be done…

For more information read this paper