Follow The Sound: A Procedural Audio Minigame

In-game screenshot from Follow The Sound.

From the beginning of February until the end of April, IGGI PhD researcher Dimitris Menexopoulos did an internship at Nemisindo, a fast-growing start-up offering state-of-the-art services in procedural audio. His main task was to design and build a minigame in Unreal Engine that would showcase as many of the procedural models featured in their plugins as possible.

As of early August 2022, Nemisindo has three plugins for the Unreal Engine on the market (plus plugins for Unity, and more to come):

  • The Action Pack, which includes Explosion, Gunshot, Rifle, Helicopter, Jet, Propeller, Rocket, Alarm, Alert, Siren and Fire models.
  • The Adaptive Footsteps, which comes with 5 different shoe types, 7 surface types and additional controls for the pace, step firmness, steadiness, etc.
  • The Nature Pack, which brings nature-inspired Droplets, Fire, Rain, Water, Waves and Wind models to any interactive environment.

As illustrated above, the sonic options are many. But making a game with smooth, audio-driven gameplay while concisely featuring many of the above models was a unique challenge.

The initial idea was to make a third-person shooter kind of game, implementing the Rifle and Footsteps models on the main character and the rest in the scene’s environment and objects.

But that’s beaten to death, right? Plus, the intense action might shift the player’s attention from the audio to the graphics. So, how does one keep the game audio-focused?

After a bit more brainstorming, the answer came in the form of a common game mechanic that has stood the test of time like few others, but approached with a slightly unorthodox mindset:

Collectibles, but ones that emit directional sound and are hidden in a 3D scene.

That’s roughly how Follow The Sound came to be. The objective is quite simple: there are ten sound-emitting items scattered around the map. Each item is visually representative of the sound it emits. Can you locate and collect them all in under three minutes?

All sounds are procedurally generated at runtime.

Here’s some basic information about the sound models used in the game:

  • Collectibles: Droplet, Gunshot, Rifle, Explosion, Jet, Helicopter, Propeller, Alarm, Fire, Rocket
  • Footsteps: Trainers/Sneakers on Concrete, Gravel and Grass surfaces
  • Nature Ambience: Wind model from the Nature Pack
  • UI Collection Sound: Two-Tone Siren model (yes, with enough imagination, you can also make procedural UI sounds!)
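
With enough imagination indeed: for a flavour of how simple a procedural UI sound can be, here is a toy two-tone siren sketched with the Web Audio API. To be clear, this is an illustrative sketch only, not Nemisindo’s actual Siren model; the waveform, frequencies and timings are arbitrary choices.

```typescript
// Toy two-tone siren: a square-wave oscillator alternating between two
// pitches. Illustrative only; not Nemisindo's actual Siren model.
const ctx = new AudioContext();

function playTwoToneSiren(durationSec = 2): void {
  const osc = ctx.createOscillator();
  const gain = ctx.createGain();
  osc.type = "square";
  gain.gain.value = 0.2; // keep the level polite

  // Alternate between two tones every 0.25 s (arbitrary values).
  const [low, high] = [600, 800];
  for (let t = 0; t < durationSec; t += 0.5) {
    osc.frequency.setValueAtTime(low, ctx.currentTime + t);
    osc.frequency.setValueAtTime(high, ctx.currentTime + t + 0.25);
  }

  osc.connect(gain).connect(ctx.destination);
  osc.start();
  osc.stop(ctx.currentTime + durationSec);
}

playTwoToneSiren();
```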

Follow The Sound was created in Unreal Engine 4.26. All 3D models used in the minigame are free assets downloaded from the Unreal Marketplace (City Park Environment Collection by SilverTm) and from cgtrader.com.

Learn more about Nemisindo’s Action Pack, Adaptive Footsteps and Nature Pack at nemisindo.com.

Dimitris Menexopoulos would like to thank his PhD supervisor and CEO of Nemisindo, Dr. Joshua Reiss, for this creatively and technically didactic opportunity. Additionally, he’d like to thank Nemisindo members Selim Sheta, Jack Walters and Clifford Moses Manasseh for their valuable feedback and help in packaging the final version of the game.

You can freely download a Windows version of Follow The Sound here.

Engine sounds for game engines

Engines and motors feature heavily in video games, but they can be very problematic. As game car audio expert Greg Hill said, “For engine sounds most people think we just record a car and throw the sound into a folder and it magically just works in the game. Game sounds are interactive – unlike passive sound like movies where the sound is baked onto an audio track… sound has to be recorded a certain way so it can be decomposed and reconstructed into a ‘sonic model’ and linked to the physics so the player has full control over every parameter.”

And therein lies the issue: recorded sounds are fixed, but game sounds need to be adaptable and controllable. One can get around this, but only with huge effort.

So Nemisindo (the Zulu word for ‘sound effects’, our start-up company) has brought its advanced engine sound generator to Unity as an audio plugin.

The Nemisindo Engine implements a flexible, controllable and realistic engine for all your game vehicles. It is a native Unity audio plugin and generates sound effects in real time, completely procedurally. The plugin offers an interactive interface with 14 parameters and 16 realistic presets (Formula One, monster truck, motorbike…). It also offers various functions to change the parameters of the sound effect at runtime, so the parameters can be linked to any in-game event, which can lead to very organic and interesting results.
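
The plugin’s own interface is Unity-specific, but the underlying idea of exposing synthesis parameters to game state can be sketched in a few lines. Here is a toy Web Audio example in which a vehicle’s RPM drives the pitch of a crude engine drone; the mapping and every value below are illustrative assumptions, not the plugin’s actual parameters.

```typescript
// Toy engine drone: a sawtooth whose pitch tracks engine RPM.
// The mapping is an illustrative assumption, not the plugin's API.
const ctx = new AudioContext();
const osc = ctx.createOscillator();
const lowpass = ctx.createBiquadFilter();

osc.type = "sawtooth";
lowpass.type = "lowpass";
lowpass.frequency.value = 400; // muffle the raw waveform a little
osc.connect(lowpass).connect(ctx.destination);
osc.start();

// Call this from the game loop with the vehicle's current RPM.
function setEngineRpm(rpm: number): void {
  // A four-stroke, four-cylinder engine fires twice per revolution,
  // so its dominant frequency is roughly 2 * rpm / 60 Hz.
  const freq = (2 * rpm) / 60;
  // Smooth the change to avoid zipper noise.
  osc.frequency.setTargetAtTime(freq, ctx.currentTime, 0.05);
}

setEngineRpm(3000); // roughly a 100 Hz drone
```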

Here’s a demonstration video.

The engine is available as a native Unity audio plugin at https://assetstore.unity.com/packages/tools/audio/nemisindo-engine-procedural-sound-effects-222246

You can also try out the engine model on Nemisindo’s online sound design platform at https://nemisindo.com/models/engine-advanced.html

Online sound design service – improved and relaunched

Last year, we launched an online sound design service at nemisindo.com. Nemisindo (the Zulu word for ‘sound effects’) is a start-up based on our procedural audio research at Queen Mary University of London.

Since then, we’ve launched procedural audio plug-ins for the Unreal and Unity game engines. We’ve also been continually improving the online service. But we’ve been holding back the big changes… until now.

Before, users could generate sounds from lots of different sound models, as well as access stored presets for those models. We’ve merged this all into one page, with easy and intuitive search features. And we’ve added more sounds (76 sound models with over 800 presets). So users don’t need to do much browsing around; they can easily find settings to generate the sorts of sounds they want.

We’re also really excited to announce the launch of a new user community feature on our website. Now, users can create their own presets, save them to their profile, and view and use other people’s presets. This will enable a community to grow, building and sharing settings for generating great sound effects. And we benefit too, since users will show us more interesting and creative ways of using our technology.

On top of all this, we’ve made a number of improvements across the website based on feedback from the community, including lots of helpful videos, simplified interfaces and improvements to the underlying sound generation techniques.

Our online service is free to use, but please register on the site; registration is needed to download sounds and to store and share presets. As always, feel free to get in touch.

Thank you,
The Nemisindo Team

The sounds of nature, procedurally generated

#sounddesign #soundeffects #GameAudio

I was tempted to give this blog entry the name The Call of Nature, and then I remembered that for a lot of people, that has a very different meaning. 😉

Nature sounds are essential to sound design. They provide ambience and set the scene. Formally, they are almost all keynote sounds, that is, heard continuously or frequently enough to form the background against which other sounds are perceived.

But they can be very challenging to work with in creative sound design. You may have a hundred samples of ocean waves, but none of them matches the visuals. And this becomes even harder when the visual aspects might change depending on what happens in a game or VR context.

We want to help game developers implement dynamic and adaptive atmospheric sounds in their projects. So Nemisindo (the Zulu word for Sound Effects, our start-up company) has brought state-of-the-art procedural Nature models to Unreal Engine. From lonely humid caves, to huge hurricanes, we’ve got it covered.

The Nature Pack features 6 sound effect models for popular nature sounds:

  • Droplets
  • Wind
  • Waves
  • Rain
  • Fire
  • Water

These models generate audio in real time, completely procedurally, so no samples are stored.

This approach offers a lot more than standard sample libraries. Sounds are generated in real time, based on intuitive parameters that can be customized and linked to in-game events. The plugin is integrated with Unreal Engine, unlike other solutions that involve third-party software. This means you can design your own sounds directly in the game or simulation editor.
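
To make that concrete, here is a minimal sketch of the general principle (emphatically not the Nature Pack’s actual models): wind can be roughly approximated as noise passed through a slowly modulated bandpass filter, and that filter frequency could just as easily be driven by an in-game weather variable.

```typescript
// Toy procedural wind: looping white noise through a bandpass filter
// whose centre frequency is wobbled by a slow LFO to suggest gusts.
// The noise buffer is generated at startup, so no samples are stored.
const ctx = new AudioContext();

const noiseBuffer = ctx.createBuffer(1, 2 * ctx.sampleRate, ctx.sampleRate);
const samples = noiseBuffer.getChannelData(0);
for (let i = 0; i < samples.length; i++) {
  samples[i] = Math.random() * 2 - 1;
}

const noise = ctx.createBufferSource();
noise.buffer = noiseBuffer;
noise.loop = true;

const bandpass = ctx.createBiquadFilter();
bandpass.type = "bandpass";
bandpass.frequency.value = 400; // centre of the "whoosh"
bandpass.Q.value = 1;

// A slow LFO modulates the filter frequency to create gusts.
const lfo = ctx.createOscillator();
const lfoDepth = ctx.createGain();
lfo.frequency.value = 0.3;  // a gust every few seconds
lfoDepth.gain.value = 200;  // +/- 200 Hz around the centre
lfo.connect(lfoDepth).connect(bandpass.frequency);

noise.connect(bandpass).connect(ctx.destination);
noise.start();
lfo.start();
```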

With the Nature Pack, you’ll be able to create incredibly detailed interactive audio scenes, with roaring flames, crashing waterfalls and gentle waves breaking on the shore. The plugin comes with 50+ presets so you can get started in no time. And since it does not rely on pre-recorded samples, it’s very lightweight compared to sample libraries.

The Nature Pack plugin is available in the Unreal Marketplace at unrealengine.com/marketplace/product/nemisindo-nature-pack

And here’s a short video introducing the plugin and its features:

Working with the Web Audio API out now – book, source code & videos

I was tempted to call this “What I did during lockdown”. Like many people, when Covid hit, I gave myself a project while working from home and isolated.

The Web Audio API provides a powerful and versatile system for controlling audio on the Web. It allows developers to generate sounds, select sources, add effects, create visualizations and render audio scenes in an immersive environment.

In recent years I had developed a Love / Mild Annoyance (hate is too strong) relationship with the Web Audio API. And I also noticed that there really wasn’t a comprehensive and useful guide to it. Sure, there’s the online specification and plenty of other documentation, but there’s a whole lot they leave out: how to do FIR filtering, or the easiest way to record the output of an audio node, for instance. And there’s nothing like a teaching book or reference that one might keep handy while doing Web Audio programming.
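
As an example of the sort of practical recipe I mean, here is one way (among several) to record the output of an arbitrary audio node: route it through a MediaStreamAudioDestinationNode and capture the resulting stream with a MediaRecorder.

```typescript
// Record the output of any audio node by routing it into a
// MediaStreamAudioDestinationNode and capturing the stream.
const ctx = new AudioContext();
const osc = ctx.createOscillator(); // stand-in for any node to record

const streamDest = ctx.createMediaStreamDestination();
osc.connect(streamDest);
osc.connect(ctx.destination); // also monitor it live

const recorder = new MediaRecorder(streamDest.stream);
const chunks: Blob[] = [];
recorder.ondataavailable = (e) => chunks.push(e.data);
recorder.onstop = () => {
  // Collect the captured chunks into a playable blob.
  const blob = new Blob(chunks, { type: recorder.mimeType });
  console.log("Recording available at:", URL.createObjectURL(blob));
};

osc.start();
recorder.start();
setTimeout(() => {
  recorder.stop();
  osc.stop();
}, 2000); // record two seconds
```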

So writing that book, titled Working with the Web Audio API, became my Covid project, and it’s finally hit the shelves! It’s part of the AES Presents book series, and the publisher’s link to the book is:

https://www.routledge.com/Working-with-the-Web-Audio-API/Reiss/p/book/9781032118673

but you can find it through all the usual places to buy books. The accompanying source code is available at:

https://github.com/joshreiss/Working-with-the-Web-Audio-API

I think there are about 70 source code examples, covering every audio node and every important feature of the API.

And I’ve made about 20 instructional videos covering many aspects, available on the YouTube channel:

https://tinyurl.com/y3mtauav

I’ll keep improving the GitHub repository and YouTube channel whenever I get a chance. So please check it out and let me know what you think. 🙂

Adaptive footstep sound effects

Adaptive footsteps plug-in released for Unreal and Unity game engines

From the creeping, ominous footsteps in a horror film to the thud-clunk of an armored soldier in an action game, footstep sounds are among the most widely sought-after sound effects in creative content. But to get realistic variation, one needs hundreds of different samples for each character, each foot, each surface, and at different paces. Even then, repetition becomes a problem.

So at nemisindo.com, we’ve developed a procedural model for generating footstep sounds without the use of recorded samples. We’ve released it as the Nemisindo Adaptive Footsteps plug-in for game engines, available in the Unity Asset Store and in the Unreal Marketplace. You can also try it out at https://nemisindo.com/models/footsteps.html. It offers a lot more than standard sample libraries: footsteps are generated in real time, based on intuitive parameters that you can control.
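
To give a flavour of what procedural footsteps mean at the signal level, here is a toy sketch (not the Adaptive Footsteps algorithm itself): a single step roughed out as a short noise burst with a fast decay, filtered differently per surface.

```typescript
// Toy footstep: a short, filtered noise burst with a fast decay.
// A sketch of the general idea only, not the Adaptive Footsteps model.
const ctx = new AudioContext();

function playStep(surfaceCutoffHz: number, firmness: number): void {
  const length = Math.floor(0.1 * ctx.sampleRate); // 100 ms burst
  const buffer = ctx.createBuffer(1, length, ctx.sampleRate);
  const data = buffer.getChannelData(0);
  for (let i = 0; i < length; i++) {
    data[i] = Math.random() * 2 - 1; // noise generated on the fly
  }

  const source = ctx.createBufferSource();
  source.buffer = buffer;

  // Surface "material": a low cutoff sounds dull (grass), a high
  // cutoff sounds crunchier (gravel). Values are illustrative guesses.
  const filter = ctx.createBiquadFilter();
  filter.type = "lowpass";
  filter.frequency.value = surfaceCutoffHz;

  // Firmness maps to how hard the transient hits.
  const env = ctx.createGain();
  env.gain.setValueAtTime(firmness, ctx.currentTime);
  env.gain.exponentialRampToValueAtTime(0.001, ctx.currentTime + 0.1);

  source.connect(filter).connect(env).connect(ctx.destination);
  source.start();
}

playStep(900, 0.8); // something like gravel
setTimeout(() => playStep(300, 0.4), 500); // something like grass
```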

The plugin provides benefits that no other audio plugin does:

  • Customisable: 4 different shoe types, 7 surface types, and controls for pace, step firmness, steadiness, etc.
  • Convenient: Easy to set up, comes with 12 presets to get started in no time.
  • Versatile: Automatic and Manual modes can be added to any element in a game scene.
  • Lightweight: Uses very little disk space; the entire code takes about the same space as one footstep sample.

In a research paper soon to appear at the 152nd Audio Engineering Society Convention, we tried a different approach. We implemented multilayer neural network architectures for footstep synthesis and compared the results with real recordings and various sound synthesis methods, including Nemisindo’s online implementation. The neural approach is not yet applicable to most sound design problems, since it does not offer parametric control. But the listening test was very useful. It showed that Nemisindo’s procedural approach outperformed all other traditional sound synthesis approaches, and gave us insights that led to further improvements.

Here’s a short video introducing the Unity plugin:

And a video introducing it for Unreal

And a nice tutorial video on how to use it in Unreal

So please check it out. It’s a big footstep forward in procedural and adaptive sound design (sorry, couldn’t resist the wordplay 😁).

Lights, camera… Action sound effects pack!

Procedural audio refers to the real-time generation of sounds that can adapt to changing input parameters. In its pure form, no recorded samples are stored; sounds are generated from scratch. This has huge benefits for game audio and VR, since very little memory is required, and sound generation can be controlled by the game or virtual environment.

Nemisindo (the Zulu word for ‘sound effects’) is a spin-out company from our research group, offering sound design services based around procedural audio technology. Previously, we blogged about Nemisindo’s first procedural audio plugin for the Unreal game engine, the Action Pack. Since then, we have released an adaptive footsteps plugin for Unreal too. But today I’ll give you more detail on the Action Pack, now available for both Unreal and Unity.

Here’s a video all about the Action Pack for Unity:

The Action Pack contains 11 different procedural audio models, all based, at least in part, on research in this group and within Nemisindo. Here’s the list, with links to online demonstrators of each sound model:

  • Explosion – a wide-ranging sound model, capable of generating sounds from fireworks to thuds to grenades to bombs and more
  • Fire – the sound of a fire is a well-known sound texture, necessary for the ambience of many scenes. To get it right, we offer control of all the key characteristics
  • Gunshot – capturing characteristics of all the elements of a gunshot: the shell, the gassing, the bullet casing
  • Rifle – aimed at precise models of particular rifle designs, like the Winchester or Beretta
  • Helicopter – generates the sounds of a helicopter (engine, blades…), for arbitrary speeds, listener positions and more
  • Propeller – based on our research into aeroacoustics, this models the sound of aircraft propellers, from modern small drones to the World War 2 bomber the Flying Fortress
  • Jet – a powerful jet engine, with subtle control over things like thrust and turbine whine
  • Rocket – capturing the intense, powerful sound of a rocket launch
  • Alarm – based on the same principles for the design of car alarms, fire alarms, motion sensors…
  • Alert – red alert, background emergency sounds and more
  • Siren – did you know that police and ambulance sirens sound different in different states and countries, with different settings for different situations? This sound model matches the range of siren sounds in at least five countries.

The biggest benefit of using these plug-ins, in my opinion, is how easy the sound design becomes. What would have been hours or even days of sourcing sound samples, processing them, loading them and assigning them to game events, now becomes just minutes to link plug-in parameters to game events.
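
As a rough illustration of why this is quick (a toy Web Audio sketch of the general idea, nothing like the plugins’ actual models): once a sound is parametric, a game event handler just calls a function with a couple of arguments.

```typescript
// Toy "explosion" triggered from a game event: a sine thump whose
// pitch drops quickly, plus a burst of noise. Illustrative only.
const ctx = new AudioContext();

function onExplosion(size: number): void {
  const now = ctx.currentTime;
  const duration = 0.5 + size; // bigger explosions ring longer

  // Low-frequency "thump": a sine that sweeps downward.
  const thump = ctx.createOscillator();
  thump.frequency.setValueAtTime(120, now);
  thump.frequency.exponentialRampToValueAtTime(30, now + duration);

  // Noise burst for the blast texture.
  const len = Math.floor(duration * ctx.sampleRate);
  const buf = ctx.createBuffer(1, len, ctx.sampleRate);
  const d = buf.getChannelData(0);
  for (let i = 0; i < len; i++) d[i] = Math.random() * 2 - 1;
  const noise = ctx.createBufferSource();
  noise.buffer = buf;

  // Shared decay envelope, scaled by the event's "size" parameter.
  const env = ctx.createGain();
  env.gain.setValueAtTime(Math.min(1, 0.4 + size), now);
  env.gain.exponentialRampToValueAtTime(0.001, now + duration);

  thump.connect(env);
  noise.connect(env);
  env.connect(ctx.destination);
  thump.start(now);
  noise.start(now);
  thump.stop(now + duration);
}

onExplosion(0.5); // a mid-sized blast
```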

You can get the Action Pack in the Unity Asset Store and in the Unreal Marketplace.

Submit your research paper to the 152nd AES Convention

The next Audio Engineering Society Convention will be in May, in The Hague, the Netherlands. It’s expected to be the first major AES event with an in-person presence (though it has an online component too) since the whole Covid situation began. It will cover the whole field of audio engineering, with workshops, panel discussions, tutorials, keynotes, recording competitions and more. And attendees cover the full range of students, educators, researchers, audiophiles, professional engineers and industry representatives.

I’m always focused on the Technical Program for these events, where lots of new research is published and presented, and I expect this one to be great. Just based on some expected submissions that I know of, there are sure to be some great papers on sound synthesis, game audio, immersive and spatial audio, higher quality and deeper understanding of audio effects, plenty of machine learning and neural networks, novel mixing and mastering tools, and lots of new psychoacoustics research.

And that’s just the ones I’ve heard about!

It’s definitely not too late to submit your own work; see the Call for Submissions. The deadline for full paper submissions (Category 1) or abstract + précis submissions (Category 2) is February 15th. And the deadline for abstract-only submissions (Category 3) is March 1st. In all cases, you submit a full paper for the final version if accepted (though for Category 3 this is optional). So the main difference between the three categories is the depth of reviewing, from full peer review for initial paper submissions to ‘light touch’ reviewing for an initial abstract submission.

For those who aren’t familiar with it, great research has been, and continues to be, presented at AES conventions. The very first music composition on a digital computer was presented at the 9th AES Convention in 1957. Schroeder’s reverberator first appeared there, the invention of the parametric equalizer was announced and explained there in 1972, and Farina’s work on the swept sine technique for room response estimation was unveiled there, going on to receive over 1365 citations. Other famous firsts from the Technical Program include the introduction of feedback delay networks, Gardner’s famous paper on zero delay convolution (now used in almost all fast convolution algorithms), the unveiling of spatial audio object coding, and the Gerzon-Craven noise shaping theorem, which is at the heart of many A-to-D and D-to-A converters.

So please consider submitting your research there, and I hope to see you there too, whether virtually or in person.

AES Presidency – and without a coup attempt

Happy New Year everyone!

As of January 1st, I (that’s Josh Reiss, the main author of this blog) am the president of the Audio Engineering Society. I wrote about being elected to this position before. It’s an incredible honour.

For those who don’t know, the Audio Engineering Society (AES) is the largest professional society in audio engineering and related fields, and the only professional society devoted exclusively to audio technology. It has over 10,000 members, hosts conferences, conventions and lots of other events, publishes a renowned journal, develops standards and so much more. It was founded in 1948 and has grown into an international organisation that unites audio engineers, creative artists, scientists and students worldwide by promoting advances in audio and disseminating new knowledge and research.

Anyway, I expect an exciting and challenging year ahead in this role. And one of my first tasks was to deliver the President’s Message, sort of announcing the start of my term to the AES community and laying down some of my thoughts about it. You can read all about it here.

And I’m looking forward to seeing you all at the next AES Convention in The Hague in May, either in person or online.

Exploiting Game Graphics Rendering For Sound Generation

A fascinating game audio research topic

When rendering their assets, video games often push the processing limits of their host machines, be they personal computers, dedicated gaming consoles or even mobile phones. As modern video games continue to rapidly develop into increasingly rich and varied experiences, both aesthetic and technical considerations, among other factors, become important components of their success.

Graphics in particular often make or break a game from the get-go for many players, which is why developers pay extra attention to them. One technique commonly used to provide variety and realism in game graphics is procedural generation, sometimes also called procedural content generation (PCG for short). The term can be defined in different ways, but I believe the definition from Wikipedia is quite accurate:

In computing, procedural generation is a method of creating data algorithmically as opposed to manually, typically through a combination of human-generated assets and algorithms coupled with computer-generated randomness and processing power.

No Man’s Sky Next Generation (2020) by Hello Games features almost exclusively PCG assets, including audio.

Despite the fact that game graphics frequently enjoy the affordances of PCG techniques, game audio has only begun to take advantage of them in the last decade or so. Why? In his book Designing Sound (2010), procedural audio expert Andy Farnell gives the following explanation:

In fact, game audio started using procedural technology. Early game consoles and personal computers had synthesiser chips that produced sound effects and music in real time, but once sample technology matured it quickly took over because of its perceived realism. Thus synthesised sound was relegated to the scrapheap.

The complexity of modern games has come to challenge this approach. Samples are audio snapshots frozen in time, and creative processing can only take them so far in providing convincing realism, depth and variety. Procedural audio, on the other hand, provides exactly that by definition, and the current state of technology is mature enough to support its successful implementation. Of course, different game development scenarios call for different solutions, and more often than not a mixed approach is preferable.

To get a taste of cutting-edge procedural audio effects implementation, below you can find an example using the latest plugin for the Unreal Engine by Nemisindo, which was released a few days ago.

All sounds are procedurally generated in real-time during gameplay.

So far, we have looked at procedurally generated graphics independently of audio. But how can existing graphics information, available in the game engine, be used to generate the sounds and music produced when objects interact in a game environment?

At this point, it is important to clarify the distinction between data and metadata inside a game engine. From the perspective of graphics, data can be thought of as everything we see, such as the shapes of 2D or 3D models, textures, colours, etc. Conversely, audio data are anything we hear, such as music, sound effects and voiceovers. Metadata, in both cases, are all the parameters stored in memory that carry information about the state of each data asset at a certain point in time.

A really good way to start realizing graphics-based procedural audio is to try to translate and share metadata among PCG algorithms for audio and graphics. What if the varying state of a moving character’s speed could be directly used to produce infinitely unique step variations? What if different shoe or ground materials could drive different parameters of step sound physical models at runtime? What if information about the weather and time of the day could trigger different interactive music cues and set specific audio effect parameters to create unique soundscapes at every moment of gameplay? The possibilities are endless and go beyond games, as game engines are steadily finding their place in other fields, ranging from multimedia art to computational medicine.
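
Here is a hypothetical sketch of what such metadata sharing could look like; every name in it is invented for illustration. The game’s character state drives the parameters of a procedural footstep generator on each frame, in the spirit of the toy playStep sketch from the footsteps post above.

```typescript
// Hypothetical sketch: shared game metadata driving procedural audio.
// All types, values and functions here are invented for illustration.
interface CharacterState {
  speed: number; // metres per second
  surface: "concrete" | "gravel" | "grass";
}

// Assume a toy step synthesiser like the one sketched in the
// footsteps post above.
declare function playStep(surfaceCutoffHz: number, firmness: number): void;

// Illustrative per-surface filter cutoffs, in Hz.
const surfaceCutoff = { concrete: 1500, gravel: 900, grass: 300 };

let nextStepTime = 0;

// Called once per frame by the game loop.
function updateFootsteps(state: CharacterState, timeSec: number): void {
  if (state.speed < 0.1) return; // standing still: no steps

  // Faster movement means a shorter interval between steps.
  const stepInterval = 1 / Math.max(0.5, state.speed);
  if (timeSec >= nextStepTime) {
    // Firmness grows with speed: running hits harder than walking.
    const firmness = Math.min(1, 0.3 + 0.1 * state.speed);
    playStep(surfaceCutoff[state.surface], firmness);
    nextStepTime = timeSec + stepInterval;
  }
}
```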

Thank you for reading this far; we hope you found this article informative. We will leave you with a video demonstration of Homo Informis, a music creation system by new IGGI PhD researcher Dimitris Menexopoulos, which translates still and moving images into sound and exposes the sonic parameters to be controlled by body gestures and EEG waves in real time.

Dimitris Menexopoulos — Homo Informis (2020)