Exploiting Game Graphics Rendering For Sound Generation

A fascinating game audio research topic

When rendering their assets, video games often push the processing limits of their host machines, be they personal computers, dedicated gaming consoles or mobile phones. As modern video games continue to develop into increasingly rich and varied experiences, both aesthetic and technical considerations, among other factors, become important components of their success.

Graphics in particular often make or break a game for many players from the get-go, which is why developers pay extra attention to them. One technique commonly used to provide variety and realism in game graphics is procedural generation, sometimes also called procedural content generation (PCG for short). The term can be defined in different ways, but I believe the definition from Wikipedia is quite accurately descriptive:

In computing, procedural generation is a method of creating data algorithmically as opposed to manually, typically through a combination of human-generated assets and algorithms coupled with computer-generated randomness and processing power.

No Man’s Sky Next Generation (2020) by Hello Games features almost exclusively PCG assets, including audio.

Although game graphics frequently enjoy the affordances of PCG techniques, game audio has only begun to take advantage of them in the last decade or so. Why? In his book Designing Sound (2010), procedural audio expert Andy Farnell gives the following explanation:

In fact, game audio started using procedural technology. Early game consoles and personal computers had synthesiser chips that produced sound effects and music in real time, but once sample technology matured it quickly took over because of its perceived realism. Thus synthesised sound was relegated to the scrapheap.

The complexity of modern games has come to challenge this approach. Samples are frozen snapshots of audio in time, and creative processing can only take them so far in providing convincing realism, depth and variety. Procedural audio, on the other hand, provides variety by definition, and the current state of technology is mature enough to support its successful implementation. Of course, different game development scenarios call for different solutions, and more often than not a mixed approach is preferable.
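To make the contrast concrete, here is a minimal sketch in plain Python, with no game engine or audio library assumed: the "sample" is a fixed buffer that plays back identically every time, while the procedural version re-synthesises the sound with freshly drawn parameters on each call. The toy decaying-sine synthesiser is my own choice for brevity, not a representation of any particular system.

```python
import math
import random

SAMPLE_RATE = 44100

def make_impact(freq=440.0, decay=8.0, dur=0.3):
    """Synthesise a short decaying sine 'impact' as floats in [-1, 1]."""
    n = int(SAMPLE_RATE * dur)
    return [math.sin(2 * math.pi * freq * t / SAMPLE_RATE)
            * math.exp(-decay * t / SAMPLE_RATE)
            for t in range(n)]

# Sample-based playback: one frozen snapshot, identical every time it is heard.
frozen_sample = make_impact()

def play_sample():
    return frozen_sample

# Procedural playback: parameters are re-drawn at call time, so no two hits match.
def play_procedural():
    return make_impact(freq=random.uniform(380, 500), decay=random.uniform(6, 12))
```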

To get a taste of cutting-edge procedural audio effects in practice, below you can find an example using Nemisindo's plugin for the Unreal Engine, which was released only a few days ago.

All sounds are procedurally generated in real time during gameplay.

So far, we have looked at procedurally generated graphics independently of audio. But how can graphics information already available in the game engine be used to generate the sounds and music produced when objects interact in a game environment?

At this point, it is important to clarify the distinction between data and metadata inside a game engine. From the graphics perspective, data is everything we see, such as the shapes of the 2D or 3D models, the textures and the colours. Likewise, audio data is everything we hear, such as the music, sound effects and voiceovers. Metadata, in both cases, comprises the parameters stored in memory that describe the state of each data asset at a given point in time.
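As a hypothetical illustration of that split (the class names and fields below are mine, not taken from any engine), a footstep's audio data might be a rendered buffer or synthesis patch, while the metadata is the character state recorded alongside it at each frame:

```python
from dataclasses import dataclass, field

@dataclass
class FootstepAsset:
    # Data: the audible content itself, e.g. a rendered buffer or synthesis patch.
    samples: list = field(default_factory=list)

@dataclass
class CharacterState:
    # Metadata: parameters describing the character's state at a point in time.
    position: tuple = (0.0, 0.0, 0.0)   # world coordinates
    speed: float = 0.0                  # metres per second
    shoe_material: str = "leather"
    ground_material: str = "gravel"
```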

A really good way to start realizing graphics-based procedural audio is to translate and share metadata between the PCG algorithms for graphics and audio. What if the varying speed of a moving character could be used directly to produce infinitely unique step variations? What if different shoe or ground materials could drive different parameters of step-sound physical models at runtime? What if information about the weather and the time of day could trigger different interactive music cues and set specific audio effect parameters, creating a unique soundscape at every moment of gameplay? The possibilities are endless and go beyond games, as game engines are steadily finding their place in other fields, ranging from multimedia art to computational medicine.
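As a sketch of the first of those ideas, here is a toy footstep synthesiser in plain Python whose parameters are driven directly by character metadata. The material table, the speed-to-amplitude mapping and the noise-burst model are all assumptions made for illustration; they are not taken from any particular engine or plugin.

```python
import math
import random

SAMPLE_RATE = 44100

# Assumed per-material synthesis settings: (noise "brightness", decay rate).
MATERIALS = {
    "gravel": (0.9, 18.0),
    "wood":   (0.5, 30.0),
    "grass":  (0.3, 12.0),
}

def footstep(speed, ground_material):
    """Return one footstep as floats in [-1, 1], shaped by the character metadata."""
    brightness, decay = MATERIALS.get(ground_material, (0.5, 20.0))
    # Faster movement -> louder and slightly shorter steps.
    amplitude = min(1.0, 0.3 + 0.1 * speed)
    duration = max(0.08, 0.25 - 0.02 * speed)
    n = int(SAMPLE_RATE * duration)
    out = []
    lowpass = 0.0
    for t in range(n):
        white = random.uniform(-1.0, 1.0)
        # One-pole low-pass filter: brightness controls how much high end survives.
        lowpass += brightness * (white - lowpass)
        env = math.exp(-decay * t / SAMPLE_RATE)
        out.append(amplitude * env * lowpass)
    return out

# Each step is unique: the metadata changes and the randomness never repeats.
steps = [footstep(speed=2.0 + 0.3 * i, ground_material="gravel") for i in range(4)]
```

Because both the metadata and the random component change on every step, no two footsteps are ever identical, which is exactly the variety that sample playback struggles to provide.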

Thank you for reading this far; we hope you found this article informative. We will leave you with a video demonstration of Homo Informis, a music creation system by new IGGI PhD researcher Dimitris Menexopoulos, which translates still and moving images into sound and exposes the sonic parameters to be controlled by body gestures and EEG waves in real time.

Dimitris Menexopoulos — Homo Informis (2020)