Weird and wonderful research to be unveiled at the 144th Audio Engineering Society Convention

Last year, we previewed the 142nd and 143rd AES Conventions, which we followed with wrap-up discussions here and here. The next AES Convention is just around the corner, May 23 to 26 in Milan. As before, the Audio Engineering research team here aims to be quite active at the convention.

These conventions have thousands of attendees, but aren’t so large that you get lost or overwhelmed. Away from the main exhibition hall is the Technical Program, which includes plenty of tutorials and presentations on cutting edge research.

So we've gathered together some information about the events that caught our eye: ones that are unusual or of exceptionally high quality, ones we're involved in or attending, or ones that are simply worth mentioning. And this Convention will certainly live up to the hype.

Wednesday May 23rd

From 11:15 to 12:45 that day, there’s an interesting poster by a team of researchers from the University of Limerick titled Can Visual Priming Affect the Perceived Sound Quality of a Voice Signal in Voice over Internet Protocol (VoIP) Applications? This builds on work we discussed in a previous blog entry, where they did a perceptual study of DFA Faders, looking at how people’s perception of mixing changes when the sound engineer only pretends to make an adjustment.

As expected given the location, there's lots of great work being presented by Italian researchers. The first one that caught my eye is the 2:30-4 poster on Active noise control for snoring reduction. Whether you're a loud snorer, sleep next to someone who is, or are just interested in unusual applications of audio signal processing, this one is worth checking out.

Do you get annoyed sometimes when driving and the road surface changes to something really noisy? Surely someone should do a study to find out which roads are noisiest, so that we can put a bit of effort into better road design and better in-vehicle equalisation and noise reduction? Well, it's finally happened, with a paper in the same session on Deep Neural Networks for Road Surface Roughness Classification from Acoustic Signals.

Thursday, May 24

If you were to spend only one day this year immersing yourself in frontier audio engineering research, this is the day to do it.

How do people mix music differently in different countries? And do people perceive the mixes differently based on their different cultural backgrounds? These are the sorts of questions our research team here have been asking. Find out more in this 9:30 presentation by Amandine Pras. She led this Case Study of Cultural Influences on Mixing Practices, in collaboration with Brecht De Man (now with Birmingham City University) and myself.

Rod Selfridge has been blazing new trails in sound synthesis and procedural audio. He won the Best Student Paper Award at the 141st AES Convention and the Best Paper Award at Sound and Music Computing. He'll give another great presentation at noon on Physically Derived Synthesis Model of an Edge Tone, which was also discussed in a recent blog entry.

I love the title of this next paper, Miniaturized Noise Generation System—A Simulation of a Simulation, which will be presented at 2:30pm by researchers from Intel Technology in Gdansk, Poland. This idea of a meta-simulation is not as uncommon as you might think; we do digital emulation of old analogue synthesizers, and I’ve seen papers on numerical models of Foley rain sound generators.

A highlight for our team here is our 2:45 pm presentation, FXive: A Web Platform for Procedural Sound Synthesis. We’ll be unveiling a disruptive innovation for sound design, FXive.com, aimed at replacing reliance on sound effect libraries. Please come check it out, and get in touch with the presenters or any members of the team to find out more.

Immediately following this is a presentation which asks Can Algorithms Replace a Sound Engineer? This is a question the research team here have also investigated a lot; you could even say it was the main focus of our research for several years. The team behind this presentation are asking it in relation to Auto-EQ. I'm sure it will be interesting, and I hope they reference a few of our papers on the subject.

From 9-10:30, I will chair a Workshop on The State of the Art in Sound Synthesis and Procedural Audio, featuring the world's experts on the subject. Outside of speech and possibly music, sound synthesis is still in its infancy, but it's destined to change the world of sound design in the near future. Find out why.

12:15 — 13:45 is a workshop related to machine learning in audio (a subject sometimes called machine listening): Deep Learning for Audio Applications. Deep learning can be quite a technical subject, and there's a lot of hype around it, so a workshop is a good way to get a feel for it. See below for another machine listening workshop on Friday.

The Heyser Lecture, named after Richard Heyser (we discussed some of his work in a previous entry), is a prestigious evening talk given by one of the eminent individuals in the field. This one will be presented by Malcolm Hawksford, a man who has had a major impact on research in audio engineering for decades.

Friday

The 9:30 — 11 poster session features some unusual but very interesting research. A talented team of researchers from Ancona will present A Preliminary Study of Sounds Emitted by Honey Bees in a Beehive.

Intense solar activity in March 2012 caused some amazing geomagnetic storms here on Earth. Researchers in Finland recorded them, and some very unusual results will be presented in the same session, in the poster titled Analysis of Reports and Crackling Sounds with Associated Magnetic Field Disturbances Recorded during a Geomagnetic Storm on March 7, 2012 in Southern Finland.

You've been living in a cave if you haven't noticed the recent proliferation of smart devices, especially in the audio field. But what makes them tick? Is there a common framework, and how are they tested? Find out more at 10:45, when researchers from Audio Precision will present The Anatomy, Physiology, and Diagnostics of Smart Audio Devices.

From 3 to 4:30, there’s a Workshop on Artificial Intelligence in Your Audio. It follows on from a highly successful workshop we did on the subject at the last Convention.

Saturday

A couple of weeks ago, John Flynn wrote an excellent blog entry describing his paper on Improving the Frequency Response Magnitude and Phase of Analogue-Matched Digital Filters. His work is a true advance on the state of the art, providing digital filters with closer matches to their analogue counterparts than any previous approaches. The full details will be unveiled in his presentation at 10:30.

If you haven’t seen Mariana Lopez presenting research, you’re missing out. Her enthusiasm for the subject is infectious, and she has a wonderful ability to convey the technical details, their deeper meanings and their importance to any audience. See her one hour tutorial on Hearing the Past: Using Acoustic Measurement Techniques and Computer Models to Study Heritage Sites, starting at 9:15.

The full program can be explored on the Convention Calendar or the Convention website. Come say hi to us if you’re there! Josh Reiss (author of this blog entry), John Flynn, Parham Bahadoran and Adan Benito from the Audio Engineering research team within the Centre for Digital Music, along with two recent graduates Brecht De Man and Rod Selfridge, will all be there.

The edgiest tone yet…

As my PhD is coming to an end and the writing phase is getting more intense, it seemed about time I described the last of the aeroacoustic sounds I have implemented as a sound effect model. On May 24th at the 144th Audio Engineering Society Convention in Milan, I will present 'Physically Derived Synthesis Model of an Edge Tone.'
The edge tone is the sound created when a planar jet of air strikes an edge or wedge. It is probably most often encountered as the means of excitation for flue instruments, such as the recorder, piccolo, flute and pipe organ. For example, in a recorder, air blown through the mouthpiece forms a planar jet which then strikes a wedge. The forces generated couple with the tube body of the recorder, and a tone based on the dimensions of the tube is generated.

 

Mouthpiece of a recorder

 

The edge tone model I have developed is considered in isolation, rather than coupled to a resonator as in the musical instrument examples. While researching the edge tone, it became clear to me that this tone has not had the same attention as the Aeolian tone I previously modelled (here), but a good volume of research and data was available to help understand and develop the model.

How does the edge tone work?

The most important process in generating the edge tone is the set up of a feedback loop from the nozzle exit to the wedge. This is similar to the process that generates the cavity tone which I discussed here. The diagram below will help with the explanation.

 

Illustration of jet of air striking a wedge

 

The air comes out of the nozzle and travels towards the wedge. A jet of air naturally has some instabilities, which are magnified as the jet travels and reaches the wedge. At the wedge, vortices are shed on opposite sides and an oscillating pressure pulse is generated. The pressure pulse travels back towards the nozzle and reinforces the instabilities. At the correct frequency (wavelength), a feedback loop is created and a strong discrete tone can be heard.

To make the edge tone more complicated, if the air speed is varied, or the distance from the nozzle exit to the wedge is varied, different modes exist. The values at which the modes change also exhibit hysteresis – the mode changes up and down do not occur at the same airspeed or distance.
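
In code, hysteresis just means that the upward and downward switching points differ, so the current mode depends on the history of the input. Here is a minimal sketch of that behaviour for a single mode boundary; the threshold values are illustrative placeholders, not measured data.

```python
# Minimal sketch of mode switching with hysteresis. The thresholds are
# illustrative placeholders, not measured values: the point is only that
# the up and down transitions happen at different airspeeds.
class ModeSelector:
    def __init__(self, up=25.0, down=20.0):
        self.mode = 1
        self.up = up        # airspeed (m/s) at which mode 1 jumps to 2
        self.down = down    # airspeed (m/s) at which mode 2 falls back to 1

    def update(self, airspeed):
        if self.mode == 1 and airspeed > self.up:
            self.mode = 2
        elif self.mode == 2 and airspeed < self.down:
            self.mode = 1
        return self.mode
```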

Creating a synthesis model

There are a number of frequency equations defined by researchers in the fluid dynamics field, each unique but all dependent on an integer mode number. Nowhere in my search did I find a method of predicting the mode number. So, unlike previous modelling approaches, I collated all the results I had where the mode number was given, from both wind tunnel measurements and computational simulations. These were input to the Weka machine learning workbench, and a decision tree was devised and then implemented to predict the mode number.
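
The tree itself comes from the collated data, but the workflow is easy to sketch. Here is a minimal version using scikit-learn in place of Weka, with made-up numbers standing in for the real wind tunnel and simulation results:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in for the collated results: each row is
# (airspeed in m/s, nozzle-to-wedge distance in mm), and each
# label is the observed integer mode number.
X = np.array([[10, 5], [20, 5], [35, 5],
              [10, 10], [25, 10], [40, 10]])
y = np.array([1, 1, 2, 1, 2, 3])

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

# Predict the mode for a new operating point, before computing frequency.
print(tree.predict([[22, 7]])[0])
```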

 

All the published prediction equations had a significant error compared to the measured and simulated results, so the collated results were again used to create a new equation predicting the frequency for each mode.
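
The refined equation is in the paper; for a feel of what these semi-empirical predictions look like, here is a sketch of the best known of the classic ones, Brown's edge tone formula, with its commonly quoted mode constants. Treat the exact numbers as textbook values rather than my model's:

```python
def edge_tone_frequency(u, h, mode=1):
    """Brown's classic semi-empirical edge tone frequency (Hz).

    u: jet speed in cm/s; h: nozzle-to-wedge distance in cm.
    Note that the formula takes the mode constant j as an input --
    exactly the gap the decision tree above is meant to fill.
    """
    j = {1: 1.0, 2: 2.3, 3: 3.8, 4: 5.4}[mode]
    return 0.466 * j * (u - 40.0) * (1.0 / h - 0.07)
```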

 

With the mode predicted, and the frequency predicted from it, the actual sound synthesis was carried out by noise shaping: passing a white noise source through a bandpass filter. The exact Q value for the filter was unknown but, as with the cavity tone, it is known that the more turbulent the flow, the smaller and more diffuse the vortices, and the wider the band of frequencies around the predicted edge tone. The Q value of the bandpass was set in proportion to this.
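
In outline, that synthesis stage might look like the sketch below, with a Butterworth bandpass standing in for whatever filter the real-time model actually uses; the Q-to-bandwidth mapping is the standard one.

```python
import numpy as np
from scipy.signal import butter, lfilter

def edge_tone_band(f_predicted, q, fs=44100, dur=1.0):
    """Shape white noise with a bandpass centred on the predicted edge
    tone frequency. A lower Q gives a wider band, i.e. more turbulence."""
    noise = np.random.randn(int(fs * dur))
    bw = f_predicted / q                      # bandwidth from centre and Q
    low = (f_predicted - bw / 2) / (fs / 2)   # normalised band edges
    high = (f_predicted + bw / 2) / (fs / 2)
    b, a = butter(2, [low, high], btype="band")
    return lfilter(b, a, noise)
```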

And what next…?

Unlike the Aeolian tone, where I was able to create a number of sound effects, the edge tone has not yet been implemented into a wider model. This is due to time rather than anything else. One area of further development that would be of great interest is coupling the edge tone model to a resonator to emulate a musical instrument. Some previous synthesis models use a white noise source as the excitation, or a signal based on the residual between an actual sample and a model of the resonator.

 

Once a standing wave has been established in the resonator, the edge tone locks in at that frequency rather than the one predicted by the equation. So the predicted edge tone may only be present while a musical note is in its transient state; but it is known that the transient has a strong influence over the timbre, so this may give interesting results.

 

For an analysis of whistles and how their design affects their sound, check out this article. The feedback mechanism described for the edge tone is also very similar to the one that generates the hole tone. This is the discrete tone generated by a boiling kettle, usually produced by a circular jet striking a plate with a circular hole, with a feedback loop established between them.

 

Hole tone from a kettle

 

A very similar tone can be generated by a vertical take-off and landing vehicle when the jets from the lift fans are pointing down to the ground or deck. These are both areas for future development and where interesting sound effects could be made.

 

Vertical take-off of a Harrier jet

 

Sound Synthesis – Are we there yet?

TL;DR. Yes

At the beginning of my PhD, I began to read the sound effect synthesis literature, and I quickly discovered that there was little to no standardisation or consistency in the evaluation of sound effect synthesis models – particularly in relation to the sounds they produce. Surely one of the most important aspects of a synthesis system is whether it can artificially produce a convincing replacement for what it is intended to synthesize. We could have the most tractable and relatable sound model in the world, but if it does not sound anything like what it is intended to, will any sound designers or end users ever use it?

There are many different methods for measuring how effective a sound synthesis model is. Jaffe proposed evaluating synthesis techniques for music based on ten criteria. However, only two of the ten criteria actually consider any sounds made by the synthesiser.

This is crazy! How can anyone know which synthesis method can produce a convincingly realistic sound?

So we performed a formal evaluation study, where a range of different synthesis techniques were compared in a range of different situations. Some synthesis techniques are indistinguishable from a recorded sample in a fixed-medium environment. In short – yes, we are there yet. There are sound synthesis methods that sound more realistic than high quality recorded samples. But there is clearly so much more work to be done…

For more information, read this paper.

Creative projects in sound design and audio effects

This past semester I taught two classes (modules), Sound Design and Digital Audio Effects. In both classes, the final assignment involves creating an original work that involves audio programming and uses concepts taught in class. But the students also have a lot of free rein to experiment and explore their own ideas.

The results are always great. Lots of really cool ideas, many of which could lead to a publication, or would be great to listen to regardless of the fact that they were assignments. Here are a few examples.

From the Sound Design class:

  • Synthesizing THX’s audio trademark, Deep Note. This is a complex sound, ‘a distinctive synthesized crescendo that glissandos from a low rumble to a high pitch’. It was created by the legendary James Moorer, who is responsible for some of the greatest papers ever published in the Journal of the Audio Engineering Society.
  • Recreating the sound of a Space Shuttle launch, with separate components for ‘Air Burning/Lapping’ and ‘Flame Eruption/Flame Exposing’ by generating the sounds of the Combustion chain and the Exhaust chain.
  • A student created a soundscape inspired by the Romanian play 'Jonah (A four scenes tragedy)', written by Marin Sorescu and published in 1968, when Romania was ruled by the communist regime. By carefully modulating the volume of filtered noise, she was able to achieve some great synthesis of waves crashing on a shore (see the sketch just after this list).
  • One student made a great drum and bass track, manipulating samples and mixing in some of his own recorded sounds. These included a nice 'thud' made by filtering the sound of a tightened towel, percussive sounds from shaking rice in a plastic container, and the sizzling sound of frying bacon for tape hiss.
  • Synthesizing the sound of a motorbike, including engine startup, gears and driving sound, gear lever click and indicator.
  • A short audio piece to accompany a ghost story, using synthesised and recorded sounds. What I really like is that the student storyboarded it.

storyboard

  • A train on a stormy day, which had the neat trick of converting a footstep synthesis model into the chugging of a train.
  • The sounds of the London Underground, doors sliding and beeping, bumps and brakes… all fully synthesized.
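
For anyone curious how far simple filtered noise can go, here is a minimal sketch of the waves-on-a-shore technique mentioned in the list above; every parameter value is my guess, not the student's.

```python
import numpy as np
from scipy.signal import butter, lfilter

def ocean_waves(dur=10.0, fs=44100):
    """Rough sketch of the filtered-noise wave technique: low-pass
    filtered white noise under a slow amplitude envelope, so each
    'wave' swells and crashes."""
    n = int(dur * fs)
    noise = np.random.randn(n)
    b, a = butter(2, 800 / (fs / 2))                     # ~800 Hz low-pass
    rumble = lfilter(b, a, noise)
    t = np.arange(n) / fs
    envelope = 0.5 * (1 + np.sin(2 * np.pi * 0.12 * t))  # ~8 s per wave
    return rumble * envelope ** 2                        # sharpen the swell
```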

And from the Digital Audio Effects class:

  • An autotune specifically for bass guitar. We discussed auto-tune and its unusual history previously.
  • Sound wave propagation causes temperature variation, but the speed of sound is a function of temperature. Notably, the positive half cycle of a wave (compression) causes an increase in temperature and velocity, while the negative half (rarefaction) causes a decrease in temperature and velocity, turning a sine wave into something like a sawtooth. This effect is only significant in high pressure sound waves. It's also frequency dependent; high frequency components travel faster than low frequency components.
    Mark Daunt created a MIDI instrument as a VST Plug-in that generates sounds based on this shock-wave formation formula. Sliders allow the user to adjust parameters in the formula and use a MIDI keyboard to play tones that express characteristics of the calculated waveforms.

  • Synthesizing applause, a subject which we have discussed here before. The student has been working in this area for another project, but made significant improvements for the assignment, including adding presets for various conditions.
  • A student devised a distortion effect based on waveshaping, in the form of a weighted sum of Legendre polynomials. These are interesting functions, and her resulting sounds are surprising and pleasing. It's the type of work that could be taken a lot further.
  • One student had a bug in an implementation of a filter. Noticing that it created some interesting sounds, he managed to turn it into a cool original distortion effect.
  • There’s an Octagon-shaped room with strange acoustics here on campus. Using a database of impulse response measurements from the room, one student created a VST plug-in that allows the user to hear how audio sounds for any source and microphone positions. In earlier blog entries, we discussed related topics, acoustic reverberators and anechoic chambers.

  • Another excellent sounding audio effect was a spectral delay using the phase vocoder, with delays applied differently depending on frequency bin. This created a sound like 'stars falling from the sky'; a sketch of the core idea follows the audio link below. Here's a sine sweep before and after the effect is applied.

https://soundcloud.com/justjosh71/sine-sweep-original
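
For the curious, here is a bare-bones reconstruction of the spectral delay idea – my own sketch, not the student's plug-in. Each STFT bin is delayed by a number of frames that grows with frequency, so a broadband attack smears into a downward cascade.

```python
import numpy as np
from scipy.signal import stft, istft

def spectral_delay(x, fs=44100, max_delay_s=1.0, nperseg=1024):
    """Delay each frequency bin by a different amount: the highest bin
    is delayed by max_delay_s, the lowest not at all."""
    f, t, X = stft(x, fs, nperseg=nperseg)
    hop_s = (nperseg // 2) / fs              # default hop is half a segment
    n_bins, n_frames = X.shape
    out = np.zeros((n_bins, n_frames + int(max_delay_s / hop_s)),
                   dtype=complex)
    for k in range(n_bins):
        d = int(k / (n_bins - 1) * max_delay_s / hop_s)  # delay in frames
        out[k, d:d + n_frames] = X[k]
    _, y = istft(out, fs, nperseg=nperseg)
    return y
```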

There were many other interesting assignments (plucked string effect for piano synthesizer, enhanced chorus effects, inharmonic resonator, an all-in-one plug-in to recreate 80s rock/pop guitar effects…). But this selection really shows both the talent of the students and the possibilities to create new and interesting sounds.

You’re invited to my Inaugural Lecture as Professor of Audio Engineering

When writing a blog like this, there's sometimes a thin line between information and promotion. I suppose this one is on the promotion side, but it's a big deal for me, and for at least a few of you it will involve some interesting information.

Queen Mary University of London has Inaugural Lectures for all its professors. I was promoted to Professor last year, and so it's time for me to do mine. It will be an evening lecture at the University on April 17th, free and open to the public. It's quite an event, with a large crowd and a reception after the talk.

I’m going to try to tie together a lot of strands of my research (when I say ‘my’, I mean the great research done by my students, staff and collaborators). That won’t be too hard since there are some common themes throughout. But I’m also going to try to make it as fun and engaging as possible, lots of demonstrations, no dense formal PowerPoint, and a bit of theatre.

You can register online at https://www.eventbrite.co.uk/e/do-you-hear-what-i-hear-the-science-of-everyday-sounds-tickets-43749224107, which has all the information about time, location and so on.

Here are the full details.

Do you hear what I hear? The science of everyday sounds.

The Inaugural Lecture of Professor Josh Reiss, Professor of Audio Engineering

Tue 17 April 2018, 18:30 – 19:30 BST
ArtsTwo, Queen Mary Mile End Campus
327 Mile End Road, London, E1 4NS

Details: The sounds around us shape our perception of the world. In films, games, music and virtual reality, we recreate those sounds or create unreal sounds to evoke emotions and capture the imagination. But there is a world of fascinating phenomena related to sound and perception that is not yet understood. If we can gain a deep understanding of how we perceive and respond to complex audio, we could not only interpret the produced content, but we could create new content of unprecedented quality and range.
This talk considers the possibilities opened up by such research. What are the limits of human hearing? Can we create a realistic virtual world without relying on recorded samples? If every sound in a major film or game soundtrack were computer-generated, could we reach a level of realism comparable to modern computer graphics? Could a robot replace the sound engineer? Investigating such questions leads to a deeper understanding of auditory perception, and has the potential to revolutionise sound design and music production. Research breakthroughs concerning such questions will be discussed, and cutting-edge technologies will be demonstrated.

Biography: Josh Reiss is a Professor of Audio Engineering with the Centre for Digital Music at Queen Mary University of London. He has published more than 200 scientific papers (including over 50 in premier journals and 4 best paper awards), and co-authored the textbook Audio Effects: Theory, Implementation and Application. His research has been featured in dozens of original articles and interviews since 2007, including Scientific American, New Scientist, Guardian, Forbes magazine, La Presse and on BBC Radio 4, BBC World Service, Channel 4, Radio Deutsche Welle, LBC and ITN, among others. He is a former Governor of the Audio Engineering Society (AES), chair of their Publications Policy Committee, and co-chair of the Technical Committee on High-resolution Audio. His Royal Academy of Engineering Enterprise Fellowship resulted in founding the high-tech spin-out company, LandR, which currently has over a million and a half subscribers and is valued at over £30M. He has investigated psychoacoustics, sound synthesis, multichannel signal processing, intelligent music production, and digital audio effects. His primary focus of research, which ties together many of the above topics, is on the use of state-of-the-art signal processing techniques for professional sound engineering. He maintains a popular blog, YouTube channel and twitter feed for scientific education and dissemination of research activities.

 

The cavity tone…

In September 2017, I attended the 20th International Conference on Digital Audio Effects in Edinburgh. At this conference, I presented my work on a real-time physically derived model of a cavity tone. The cavity tone is one of the fundamental aeroacoustic sounds, similar to the previously described Aeolian tone. The cavity tone commonly occurs in aircraft, when opening the bomb bay doors or from the cavities left when the landing gear is extended. Another example of the cavity tone can be heard when swinging a sword with a grooved profile.

The physics of operation can be a little complicated. To keep it simple: air flows over the cavity and comes into contact with air at a different velocity within the cavity. The movement of air at one speed over air at another causes what's known as a shear layer between the two. The shear layer is unstable and flaps against the trailing edge of the cavity, causing a pressure pulse. The pressure pulse travels back upstream to the leading edge and reinforces the instability. This creates a feedback loop which will occur at set frequencies. Away from the cavity, the pressure pulse will be heard as an acoustic tone – the cavity tone!

A diagram of this is shown below:

Like the previously described Aeolian tone, there are equations to derive the frequency of the cavity tone, based on the length of the cavity and the airspeed. There are a number of modes of operation, usually ranging from 1 to 4. The acoustic intensity has also been defined, based on airspeed, the position of the listener and the geometry of the cavity.
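
The paper spells out the exact relations used. For orientation, the best-known semi-empirical prediction of cavity tone frequencies is Rossiter's equation, sketched below with its commonly quoted constants; these are textbook defaults, not necessarily the values in my model.

```python
def rossiter_frequency(mode, airspeed, cavity_length,
                       alpha=0.25, kappa=0.57, c=343.0):
    """Rossiter's semi-empirical cavity tone frequency (Hz).

    mode: integer mode number (typically 1 to 4)
    airspeed: freestream speed in m/s; cavity_length in m
    alpha, kappa: empirical constants (commonly quoted defaults)
    """
    mach = airspeed / c
    return (airspeed / cavity_length) * (mode - alpha) / (mach + 1.0 / kappa)
```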

The implementation of an individual cavity tone mode is shown in the figure below. The Reynolds number is a dimensionless measure of the ratio between inertial and viscous forces in the flow, and Q relates to the bandwidth of the passband of the bandpass filter.

Comparing our model's average frequency prediction to published results, we found it was 0.3% lower than theoretical frequencies, 2.0% lower than computed frequencies and 6.4% lower than measured frequencies. A copy of the Pure Data synthesis model can be downloaded here.

 

The final whistle blows

Previously, we discussed screams, applause, bouncing and pouring water. Continuing our examination of everyday sounds, we bring you… the whistle.

This one is a little challenging though. To name just a few, there are pea whistles, tin whistles, steam whistles, dog whistles and, of course, human whistling. Covering all of this is a lot more than a single blog entry, so let's stick to the standard pea whistle or pellet whistle (also called an 'escargot' or barrel whistle because of its snail-like shape), which is the basis for a lot of the whistles you've heard.


Typical metal pea whistle, featuring a mouthpiece, a bevelled edge and sound hole where air can escape, a barrel-shaped air chamber, and a pellet inside.

 

Whistles are the oldest known type of flute. They have a stopped lower end and a flue that directs the player’s breath from the mouth hole at the upper end against the edge of a hole cut in the whistle wall, causing the enclosed air to vibrate. Most whistle instruments have no finger holes and sound only one pitch.

A whistle produces sound from a stream of gas, most commonly air, and typically powered by steam or by someone blowing air. The conversion of energy to sound comes from an interaction between the air stream and a solid material.

In a pea whistle, the air stream enters through the mouthpiece. It hits the bevel (the sloped edge of the opening) and splits, outwards into the air and inwards, filling the air chamber. Air continues to swirl around and fill the chamber until the pressure inside is so great that it pops out of the sound hole (a small opening next to the bevel), making room for the process to start over again. The dominant pitch of the whistle is determined by the rate at which air packs and unpacks the air chamber. The movement of air forces the pea or pellet inside the chamber to move around and around, which sometimes interrupts the flow of air and creates a warble in the whistle sound.

The size of the whistle cavity determines the volume of air contained in the whistle, and hence the pitch of the sound produced. The chamber fills and empties a certain number of times per second, and that rate gives the fundamental frequency of the sound.
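
A common first-order way to make that quantitative – my gloss, not a claim from the article – is to treat the chamber as a Helmholtz resonator, with the sound hole as the neck:

```python
import math

def helmholtz_frequency(neck_area, cavity_volume, neck_length, c=343.0):
    """First-order Helmholtz estimate of the chamber resonance:
    f = (c / 2*pi) * sqrt(A / (V * L)). All lengths in metres."""
    return (c / (2 * math.pi)) * math.sqrt(
        neck_area / (cavity_volume * neck_length))

# e.g. a 1 cm^3 chamber with a 3 mm x 8 mm sound hole in a 2 mm wall gives
# roughly 6 kHz; end corrections would lower this towards measured values.
```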

The whistle construction and the design of the mouthpiece also have a dramatic effect on sound. A whistle made from thick metal will produce a brighter sound, compared to the more resonant, mellow sound of thinner metal. Modern whistles are produced using different types of plastic, which broadens the range of tones and sounds now available. The design of the mouthpiece can also dramatically alter the sound. Even a few thousandths of an inch difference in the airway, the angle of the blade, or the size or width of the entry hole can make a drastic difference to volume, tone, and chiff (the breathiness or solidness of the sound). And according to the whistle Wiki page, which might be changed by the time you read this, 'One characteristic of a whistle is that it creates a pure, or nearly pure, tone.'

Well, is all of that correct? When we looked at the sounds of pouring hot and cold water we found that the simple explanations were not correct. In explaining the whistle, can we go a bit further than a bit of handwaving about the pea causing a warble? Do the different whistles differ a lot in sound?

Let's start with some whistle sounds. Here's a great video where you get to hear a dozen referees' whistles.

Looking at the spectrogram below, you can see that all the whistles produce dominant frequencies somewhere between 2200 and 4400 Hz. Some other features are also apparent. There seems to be some second and even third harmonic content. And it doesn’t seem to be just one frequency and its overtones. Rather, there are two or three closely spaced frequencies whenever the whistle is blown.

Referee Whistles

But this sound sample is all fairly short whistle blows, which could be why the pitches are not constant. And one should never rely on just one sample or one audio file (as the authors did here). So let's look at just one long whistle sound.

Spectrogram and waveform of a single long whistle blow

You can see that it remains fairly constant, and the harmonics are clearly present, though I can't say whether they are partly due to dynamic range compression or other processing. However, there are semi-periodic dips or disruptions in the fundamental pitch. You can see this more clearly in the waveform, and it is almost certainly due to the pea temporarily blocking the sound hole and weakening the sound.
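
If you want to repeat this inspection yourself, it amounts to a spectrogram plus a waveform plot. A minimal sketch with scipy and matplotlib, assuming a mono WAV recording of a whistle blow (the filename is hypothetical):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram

fs, x = wavfile.read("whistle.wav")                # hypothetical mono file
f, t, Sxx = spectrogram(x, fs, nperseg=2048)

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
ax1.pcolormesh(t, f, 10 * np.log10(Sxx + 1e-12))   # spectrogram in dB
ax1.set_ylabel("Frequency (Hz)")
ax2.plot(np.arange(len(x)) / fs, x)                # waveform: look for dips
ax2.set_xlabel("Time (s)")
plt.show()
```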

The same general behaviour appears with other whistles, though with some variation in the dips and their rate of occurrence, and in the frequencies and their strengths.

Once I started writing this blog, I was pointed to the fact that Perry Cook had already discussed synthesizing whistle sounds in his wonderful book Real Sound Synthesis for Interactive Applications. In building up part of a model of a police/referee whistle, he wrote:

 ‘Experiments and spectrograms using real police/referee whistles showed that when the pea is in the immediate region of the jet oscillator, there is a decrease in pitch (about 7%), an increase in amplitude (about 6 dB), and a small increase in the noise component (about 2 dB)… The oscillator exhibits three significant harmonics: f, 2f and 3f at 0 dB, -10 dB and -25 dB, respectively…’

With the exception of the increase in amplitude due to the pea (was that a typo?), my results are in rough agreement with his. So depending on whether I'm a glass-half-empty or glass-half-full kind of person, I could either be disappointed that I'm just repeating what he did, or glad that my results are independently confirmed.
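
Those quoted figures are enough to sketch a toy version of Cook's oscillator-plus-noise structure. Everything below is parameterised from the numbers above, with the pea modelled as a crude periodic gate; the warble rate and noise floor are my guesses, and the 6 dB amplitude boost is deliberately left out, since my recordings showed dips instead.

```python
import numpy as np

def toy_whistle(f0=3000.0, fs=44100, dur=1.0, warble_rate=8.0):
    """Three harmonics at 0, -10 and -25 dB plus a noise floor, with a
    periodic 'pea' event that drops the pitch ~7% and raises the noise
    ~2 dB (after Cook's measurements)."""
    t = np.arange(int(fs * dur)) / fs
    pea = np.sin(2 * np.pi * warble_rate * t) > 0.8      # pea near the jet
    f_inst = f0 * np.where(pea, 0.93, 1.0)               # ~7% pitch drop
    phase = 2 * np.pi * np.cumsum(f_inst) / fs
    amps = [10 ** (db / 20) for db in (0, -10, -25)]     # f, 2f, 3f levels
    tone = sum(a * np.sin(k * phase) for k, a in enumerate(amps, start=1))
    noise = 0.02 * np.random.randn(len(t)) * np.where(pea, 1.26, 1.0)
    return tone + noise                                  # 1.26 ~ +2 dB
```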

This information from a few whistle recordings should be good enough to characterise the behaviour and come up with a simple, controllable synthesis. Jiawei Liu took a different approach. In his Master's thesis, he simulated whistles using computational fluid dynamics and acoustic finite element simulation. It was very interesting work, as was a related approach by Shi, but they're both a bit like using a sledgehammer to kill a fly: massive effort and lots of computation, when a model that probably sounds just as good could have been derived using semi-empirical equations that model aeroacoustic sounds directly, as discussed in our previous blog entries on sound synthesis of an Aeolian harp, a propeller, sword sounds, swinging objects and Aeolian tones.

There's been some research into automatic identification of referee whistle sounds: for instance, the initial work of Shirley and Oldfield in 2011, and then a more advanced algorithm a few years later. But these are either standard machine learning techniques or based on the most basic aspects of the whistle sound, like its fundamental frequency. In either case, they don't use much understanding of the nature of the sound. But I suppose that's fine. They work, they enable intelligent production techniques for sports broadcasts, and they don't need to delve into the physical or perceptual aspects.

I said I'd stick to pellet whistles, but I can't resist mentioning a truly fascinating and unusual synthesis of another whistle sound. Steam locomotives were equipped with train whistles for warning and signalling. To generate the sound, the train driver pulls a cord in the driver's cabin, opening a valve so that steam shoots out of a gap and against the sharp edge of a bell. This makes the bell vibrate rapidly, which creates a whistling sound. In 1972, Herbert Chaudiere created an incredibly detailed sound system for model trains. This analogue electronic system generated all the memorable sounds of the steam locomotive – the bark of exhausting steam, the rhythmic toll of the bell, and the wail of the chime whistle – and reproduced them from a loudspeaker carried in the model locomotive.

The preparation of this blog entry also illustrates some of the problems with crowdsourced metadata and user-generated tagging. When trying to find some good sound examples, I searched the world's most popular sound effects archive, freesound, for 'pea whistle'. It came up with only one hit: a recording of steam and liquid escaping from a pot of boiling black-eyed peas!

References:

  • Chaudiere, H. T. (1972). Model Railroad Sound System. Journal of the Audio Engineering Society, 20(8), 650-655.
  • Liu, J. (2012). Simulation of Whistle Noise Using Computational Fluid Dynamics and Acoustic Finite Element Simulation. MSc thesis, University of Kentucky.
  • Shi, Y., Da Silva, A., & Scavone, G. (2014). Numerical Simulation of Whistles Using Lattice Boltzmann Methods. ISMA, Le Mans, France.
  • Cook, P. R. (2002). Real Sound Synthesis for Interactive Applications. CRC Press.
  • Oldfield, R. G., & Shirley, B. G. (2011). Automatic Mixing and Tracking of On-Pitch Football Action for Television Broadcasts. Audio Engineering Society Convention 130.
  • Oldfield, R., Shirley, B., & Satongar, D. (2015). Application of Object-Based Audio for Automated Mixing of Live Football Broadcast. Audio Engineering Society Convention 139.