Sound Effects Taxonomy

At the upcoming International Conference on Digital Audio Effects, Dave Moffat will be presenting recent work on creating a sound effects taxonomy using unsupervised learning. The paper can be found here.

A taxonomy of sound effects is useful for a range of reasons. Sound designers often spend considerable time searching for sound effects. Classically, sound effects are arranged by keyword tagging and by what caused the sound to be created. For example, a recording of bacon cooking would have the name “BaconCook”, the tags “Bacon Cook, Sizzle, Open Pan, Food” and be placed in the category “cooking”. However, most sound designers know that frying bacon can sound very similar to rain (see this TED talk for more info), yet rain is in an entirely different folder, in a different section of the SFx library.

The approach is to analyse the raw content of the audio files in the sound effects library and allow a computer to determine which sounds are similar, based on the actual sonic content of each sound sample. As such, the sounds of rain and frying bacon will be placed much closer together, allowing a sound designer to quickly and easily find related sounds.
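To give a feel for what grouping sounds by content looks like, here is a minimal sketch of hierarchical clustering in Python. It is not the method from the paper: the two-number feature vectors are made up for illustration, and the single-linkage merging rule is just one common choice. In practice the features would be measured from the audio itself.

```python
import math
from itertools import combinations

# Hypothetical 2-D audio features per sound (invented values, not real data).
features = {
    "rain": (0.80, 0.10),
    "frying bacon": (0.78, 0.12),
    "sword swoosh": (0.40, 0.60),
    "door slam": (0.10, 0.90),
}

def agglomerate(features):
    """Repeatedly merge the two closest clusters and return the merge order."""
    clusters = {name: [vec] for name, vec in features.items()}
    merges = []
    while len(clusters) > 1:
        def distance(pair):
            a, b = pair
            # Single linkage: distance between the two closest members.
            return min(math.dist(p, q) for p in clusters[a] for q in clusters[b])
        a, b = min(combinations(sorted(clusters), 2), key=distance)
        merges.append((a, b))
        clusters[f"({a} + {b})"] = clusters.pop(a) + clusters.pop(b)
    return merges

merges = agglomerate(features)
for a, b in merges:
    print("merge:", a, "and", b)
```

With these invented features, rain and frying bacon are merged first, regardless of which folder they were originally filed in.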

Here’s a figure from the paper, comparing the generated taxonomy to the original sound effect library classification scheme.



The Swoosh of the Sword

When we watch Game of Thrones or play the latest Assassin’s Creed, the sound effect added to a sword being swung adds realism, drama and overall excitement to the experience.

There are a number of methods for producing sword sound effects, from filtering white noise with a bandpass filter to solving the fundamental equations of fluid dynamics using finite volume methods. One method investigated by the Audio Engineering research team at QMUL was to use semi-empirical equations from the aeroacoustic community as an alternative to solving the full Navier-Stokes equations. Running in real time, these provide computationally efficient methods of achieving accurate results. We can model any sword, swung at any speed, and even adjust the model to replicate the sound of a baseball bat or golf club!

The starting point for these sound effect models is the Aeolian tone (see previous blog entry). The Aeolian tone is the sound generated as air flows around an object, in the case of our model, a cylinder. In the previous blog entry we describe the creation of a sound synthesis model for the Aeolian tone, including a link to a demo version of the model.

For a sword we take a number of the Aeolian tone models and place them at different positions along a virtual sword. This is shown below:


Each Aeolian tone model is called a compact source. It can be seen that more are placed at the tip of the sword than at the hilt. This is because the acoustic intensity is far higher for faster moving sources. There are 6 sources placed at the tip, spaced at a distance of 7 x the sword diameter. This distance is based on when the aerodynamic effects become de-correlated, although this is a simplification. One source is placed at the hilt and the final source is placed midway between the last tip source and the hilt.
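The layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: positions are measured from the hilt, the first tip source sits exactly at the tip, and a single diameter is used for the spacing. The function name and the example dimensions are hypothetical.

```python
def source_positions(blade_length, diameter, n_tip=6, spacing_in_diameters=7.0):
    """Return source positions in metres, measured from the hilt (assumed at 0)."""
    spacing = spacing_in_diameters * diameter
    # Tip sources, starting at the tip and stepping back towards the hilt.
    tip_sources = [blade_length - i * spacing for i in range(n_tip)]
    if tip_sources[-1] <= 0.0:
        raise ValueError("blade too short for this many tip sources")
    hilt_source = 0.0
    # Final source midway between the last tip source and the hilt.
    middle_source = (tip_sources[-1] + hilt_source) / 2.0
    return sorted([hilt_source, middle_source] + tip_sources)

positions = source_positions(blade_length=1.0, diameter=0.01)
print([round(p, 3) for p in positions])
```

For a 1 m blade of 1 cm diameter this gives eight sources in total, six of them packed into the last 35 cm of the blade.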

The complete model is presented in a GUI as shown below:


Referring to both previous figures, it can be seen that the user is able to move the observer position within a 3D space. The thickness of the blade can be set at the tip and the hilt, as well as the length of the blade. The thickness is then linearly interpolated over the blade length so that each source diameter can be calculated.
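As a small worked example of the interpolation, here is a sketch in Python. The function name, the convention of measuring position from the hilt and the example thicknesses are assumptions made for illustration.

```python
def diameter_at(position, blade_length, hilt_diameter, tip_diameter):
    """Linearly interpolate the blade thickness at a point along the blade.

    position is measured from the hilt (0) to the tip (blade_length).
    """
    fraction = position / blade_length
    return hilt_diameter + fraction * (tip_diameter - hilt_diameter)

# A hypothetical blade that tapers from 20 mm at the hilt to 10 mm at the tip.
for position in (0.0, 0.5, 1.0):
    d = diameter_at(position, blade_length=1.0, hilt_diameter=0.02, tip_diameter=0.01)
    print(f"{position:.1f} m from hilt: {d * 1000:.1f} mm")
```

Each compact source then uses the diameter found at its own position on the blade.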

The azimuth and elevation of the sword before and after the swing can be set. The strike position is fixed to an azimuth of 180 degrees and this is the point where the sword reaches its maximum speed. The user sets the top speed of the tip from the GUI. The Prime button makes sure all the variables are pushed through into the correct places in the equations and the Go button triggers the swing.

It can be seen that there are 4 presets. Model 1 is a thin fencing type sword and Model 2 is a thicker sword. To test the versatility of the model we decided to try and model a golf club, which is what the preset PGA does. The golf club model involves making the diameter of the source at the tip much larger, to represent the striking face of a golf club. It was found that those unfamiliar with golf did not identify the sound immediately, so a simple golf ball strike sound is synthesised as the club reaches top speed.

To test versatility further, we created a model to replicate the sound of a baseball bat; preset MLB. This is exactly the same model as the sword with the dimensions just adjusted to the length of a bat plus the tip and hilt thickness. A video with all the preset sounds is given below. This includes two sounds created by a model with reduced physics, LoQ1 & LoQ2. These were created to investigate if there is any difference in perception.

The demo model was connected to the animation of a knight character in the Unity game engine. The speed of the sword is directly mapped from the animation to the sound effect model and the model observer position set to the camera position. A video of the result is given below:

Real-Time Synthesis of an Aeolian tone

Aeroacoustic sounds are generated by the interaction of objects and the air, and form a unique group of sounds. Examples are a sword swooshing through the air, jet engines, propellers and the wind blowing through cracks. The Aeolian tone is one of the fundamental sounds; the cavity tone and edge tone are others. When designing these sound effects we want to model these fundamental sounds. It should then be possible to make a wide range of sound effects based on them. We want the sounds to be true to the physics generating them and to operate in real time. Completed effects will be suitable for use in video games, TV, film and virtual or augmented reality.

The Aeolian tone is the sound generated when air moves past a string, cylinder or similar object. It’s the whistling noise we may hear coming from a fence in the wind or the swoosh of a sword. An Aeolian Harp is a wind instrument that has been harnessing the Aeolian tone for hundreds of years. In fact, the word Aeolian comes from the Greek god of wind, Aeolus.

The physics behind this sound….

When air moves past a cylinder, spirals called vortices form behind it, moving away with the air flow. The vortices build up on both sides of the cylinder and detach in an alternating sequence. We call this vortex shedding, and the downstream trail of vortices is called a Von Karman Vortex Street. An illustration of this is given below:


As a vortex sheds from each side there is a change in the lift force from one side to the other. It’s the frequency of this oscillating force that is the fundamental tone frequency. The sound radiates in a direction perpendicular to the flow. There is also a smaller drag force associated with each vortex shed. It is much smaller than the lift force, twice the frequency and radiates parallel to the flow. Both the lift and drag tones have harmonics present.

Can we replicate this…?

In 1878 Vincent Strouhal realised there was a relationship between the diameter of a string, the speed it was travelling through the air and the frequency of the tone produced. This relationship is captured by the Strouhal number, which we find varies with the turbulence around the cylinder. Luckily, we have a parameter that represents the turbulence, called the Reynolds number. It’s calculated from the viscosity, density and velocity of air, and the diameter of the string. From this we can calculate the Strouhal number and get the fundamental tone frequency.
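To make the chain from airspeed to frequency concrete, here is a minimal Python sketch. The air properties are assumed typical values, and the Strouhal fit is a commonly quoted textbook approximation for a cylinder (roughly 250 < Re < 2e5), not necessarily the relationship used in our model. The drag tone at twice the lift frequency follows the description of vortex shedding above.

```python
AIR_DENSITY = 1.225      # kg/m^3, assumed sea-level value
AIR_VISCOSITY = 1.81e-5  # Pa.s, assumed dynamic viscosity of air

def reynolds_number(airspeed, diameter):
    """Reynolds number from air density, viscosity, speed and cylinder diameter."""
    return AIR_DENSITY * airspeed * diameter / AIR_VISCOSITY

def strouhal_number(reynolds):
    # Textbook empirical fit (an assumption here, not taken from the model).
    return 0.198 * (1.0 - 19.7 / reynolds)

def aeolian_frequencies(airspeed, diameter):
    """Return (lift_hz, drag_hz): the fundamental tone and the drag tone."""
    strouhal = strouhal_number(reynolds_number(airspeed, diameter))
    lift_hz = strouhal * airspeed / diameter
    drag_hz = 2.0 * lift_hz  # the drag force oscillates at twice the lift frequency
    return lift_hz, drag_hz

lift_hz, drag_hz = aeolian_frequencies(airspeed=20.0, diameter=0.01)
print(f"fundamental: {lift_hz:.0f} Hz, drag tone: {drag_hz:.0f} Hz")
```

For a 1 cm cylinder in a 20 m/s airflow the fundamental comes out a little under 400 Hz. Halving the diameter or doubling the airspeed roughly doubles the frequency.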

This is the heart of our model and was its launching point. Acoustic sound sources can often be represented by compact sound sources: monopoles, dipoles and quadrupoles. For the Aeolian tone the compact sound source is a dipole.

We have an equation for the acoustic intensity. This is proportional to airspeed to the power of 6. It also includes the relationship between the sound source and listener. The bandwidth around the fundamental tone peak is proportional to the Reynolds number. We calculate this from published experimental results.

The vortex wake acoustic intensity is also calculated. This is much lower than the tone dipole at low airspeed but is proportional to airspeed to the power of 8. There is little wake sound below the fundamental tone frequency and above it the wake sound decreases in proportion to the frequency squared.
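The two power laws are easier to compare in decibels. The short Python sketch below only uses the proportionalities stated above (airspeed to the power of 6 for the tone, power of 8 for the wake); the function name is hypothetical and no absolute levels are implied.

```python
import math

def level_change_db(speed_ratio, exponent):
    """Change in intensity level when airspeed is scaled by speed_ratio."""
    return 10.0 * math.log10(speed_ratio ** exponent)

tone_db = level_change_db(2.0, 6)  # tone dipole: intensity ~ airspeed^6
wake_db = level_change_db(2.0, 8)  # vortex wake: intensity ~ airspeed^8
print(f"doubling airspeed: tone +{tone_db:.1f} dB, wake +{wake_db:.1f} dB")
```

Each doubling of airspeed raises the tone by about 18 dB and the wake by about 24 dB, so the wake gains roughly 6 dB on the tone every time the speed doubles.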

We use the graphical programming language Pure Data to realise the equations and relationships. A white noise source and bandpass filters can generate the tone sounds and harmonics. The wake noise is a brown noise source shaped by high pass filtering. You can get the Pure Data patch of the model by clicking here.
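For readers who don't use Pure Data, the tone part of the signal chain can be sketched in plain Python. This is not a translation of our patch: it uses a standard second-order (biquad) bandpass filter, the Q value is an arbitrary assumption, and the wake noise and harmonics are left out.

```python
import math
import random

def bandpass_coefficients(centre_hz, q, sample_rate):
    """Biquad bandpass (unity gain at the centre frequency), normalised by a0."""
    w0 = 2.0 * math.pi * centre_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a

def biquad(samples, b, a):
    """Run samples through the filter using the direct form I difference equation."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

def aeolian_tone(centre_hz, q=30.0, seconds=0.5, sample_rate=44100, seed=0):
    """White noise shaped by a bandpass filter centred on the tone frequency."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(int(seconds * sample_rate))]
    b, a = bandpass_coefficients(centre_hz, q, sample_rate)
    return biquad(noise, b, a)

tone = aeolian_tone(centre_hz=400.0)
print(len(tone), "samples generated")
```

A higher Q gives a narrower, more whistle-like tone; a lower Q gives a breathier sound. In the full model the bandwidth would follow from the Reynolds number rather than a fixed Q.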

Our sound effect operates in real-time and is interactive. A user or game engine can adjust:

  • Airspeed
  • Diameter and length of the cylinder
  • Distance between observer and source
  • Azimuth and elevation between observer and source
  • Panning and gain

We can now use the sound source to build up further models. For example, an airspeed model that replicates the wind can reproduce the sound of wind through a fence. The swoosh of a sword is a number of sources lined up in a row, with the speed of each adjusted to the radius of its arc.
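The speed adjustment for the sword can be sketched as follows. It assumes the blade rotates rigidly about a single pivot, so that each source moves at a speed proportional to its distance from that pivot; the function name and the example radii are hypothetical.

```python
def source_speeds(radii, tip_speed):
    """Speed of each source for a rigid swing, given its radius from the pivot.

    The source with the largest radius is taken to be the tip.
    """
    tip_radius = max(radii)
    return [tip_speed * radius / tip_radius for radius in radii]

# Hypothetical radii in metres from the pivot, with a 30 m/s tip speed.
speeds = source_speeds([0.5, 0.75, 1.0], tip_speed=30.0)
print(speeds)
```

Because the acoustic intensity rises so steeply with airspeed, the sources near the tip dominate the overall sound.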

Model complete…?

Not quite. We can calculate the bandwidth of the fundamental tone but have no data for the bandwidth of harmonics. In the current model we set them at the same value. The equation of the acoustic intensity of the wake is an approximation. The equation represents the physics but is not an exact value. We have to use best judgement when scaling it to the acoustic intensity of the fundamental tone.

A string or wire has a natural vibration frequency. There is an interaction between this and the vortex shedding frequency. This modifies the sound heard by a significant factor.