Greatest JAES papers of all time, Part 1

The Journal of the Audio Engineering Society (JAES) is the premier publication of the AES, and is the only peer-reviewed journal devoted exclusively to audio technology. The first issue was published in 1949, though volume 1 began in 1953. For the past 70 years, it has had a major impact on the science, education and practice of audio engineering and related fields.

I was curious which were the most important JAES papers, so I had a look at Google Scholar to see which had the most citations. This approach has lots of issues, not just because Scholar won’t find everything, but because a lot of the impact is in products and practice, which doesn’t usually lead to citing the papers. Nevertheless, I looked over the list, picked out some of the most interesting ones and, following no rules except my own biases, selected the Greatest Papers of All Time Published in the Journal of the Audio Engineering Society. Not surprisingly, the list is much longer than a single blog entry, so this is just part 1.

All of the papers below are available from the Audio Engineering Society (AES) E-library, the world’s most comprehensive collection of audio information. It contains over 16,000 fully searchable PDF files documenting the progression of audio research from 1953 to the present day. It includes every AES paper published at a convention, conference or in the Journal. Members of the AES get free access to the E-library. To arrange for an institutional license, giving full access to all members of an institution, contact Lori Jackson directly, or go to http://www.aes.org/e-lib/subscribe/ .

Selected greatest JAES papers

This is the main ambisonics paper by one* of its originators, Michael Gerzon, and perhaps the first place the theory was described in detail (and very clearly too). Ambisonics is incredibly flexible and elegant. It is now used in a lot of games and has become the preferred audio format for virtual reality. Two other JAES ambisonics papers are also very highly cited. In 1985, Michael Gerzon’s Ambisonics in multichannel broadcasting and video (368 citations) described the high potential of ambisonics for broadcast audio, which is now being realised due to the emergence of object-based audio production. And 2005 saw Mark Poletti’s Three-dimensional surround sound systems based on spherical harmonics (348 citations), which rigorously laid out and generalised all the mathematical theory of ambisonics.

*See the comment on this entry. Jerry Bauck correctly pointed out that Duane H. Cooper was the first to describe ambisonics in some form, and Michael Gerzon credited him for it too. Cooper’s work was also published in JAES. Thanks Jerry.

James Moorer

This isn’t one of the most highly cited papers, but it still had a huge impact, and James Moorer is a legend in the field of audio engineering (see his prescient ‘Audio in the New Millennium’). The paper popularised the phase vocoder, now one of the most important building blocks of modern audio effects. Auto-tune, anyone?

Richard Heyser’s Time Delay Spectrometry technique allowed one to make high quality anechoic spectral measurements in the presence of a reverberant environment. It was ahead of its time: despite its efficiency and elegance, the computing power of the day was not up to employing the method. But by the 1980s, it was possible to perform complex on-site measurements of systems and spaces using Time Delay Spectrometry. The AES now organises Heyser Memorial Lectures in his honor.


Together, these two papers by Henrik Møller et al. completely transformed the world of binaural audio. The first paper described the first major dataset of detailed HRTFs, and how they vary from subject to subject. The second studied localization performance when subjects listened to a soundfield, the same soundfield using binaural recordings with their own HRTFs, and those soundfields using the HRTFs of others. It nailed down the state of the art and the challenges for future research.

The early MPEG audio standards. MPEG-1 unveiled the MP3, followed by the improved MPEG-2 AAC. They not only changed the face of audio encoding, but completely revolutionised music consumption and the music industry.

John Chowning was a pioneer and visionary in computer music. This seminal work described FM synthesis, where the timbre of a simple waveform is changed by frequency modulating it with another waveform whose frequency is also in the audio range, resulting in a surprisingly rich control of audio spectra and their evolution in time. In 1971, Chowning also published The simulation of moving sound sources (278 citations), perhaps the first system (and one using digital technology) for synthesising an evolving sound scene.
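To give a feel for how little is needed to get those rich spectra, here is a minimal Python sketch of two-oscillator FM, using the familiar form sin(2π·fc·t + I·sin(2π·fm·t)). The function name `fm_tone`, its parameters and the default sample rate are my own choices for illustration, not anything taken from the paper.

```python
import math

def fm_tone(carrier_hz, modulator_hz, index, duration_s=1.0, sample_rate=44100):
    """Return a list of samples of a simple FM tone (hypothetical helper).

    carrier_hz   -- frequency of the carrier sine wave
    modulator_hz -- frequency of the modulating sine wave (also audio rate)
    index        -- modulation index; 0 gives a plain sine, larger values
                    push more energy into the sidebands
    """
    n = int(duration_s * sample_rate)
    samples = []
    for i in range(n):
        t = i / sample_rate
        # The modulator is added to the carrier's phase; this is what
        # produces sidebands around the carrier frequency.
        phase = (2 * math.pi * carrier_hz * t
                 + index * math.sin(2 * math.pi * modulator_hz * t))
        samples.append(math.sin(phase))
    return samples

# Example: a 440 Hz carrier modulated at 110 Hz with a moderate index.
tone = fm_tone(440.0, 110.0, index=2.0, duration_s=0.5, sample_rate=8000)
```

With an index of 0 the output is just the carrier sine wave. Raising the index spreads energy into sidebands spaced at multiples of the modulator frequency either side of the carrier, so sweeping the index over time changes the timbre with a single control.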

The famous Glasberg and Moore loudness model is perhaps the most widely used auditory model for loudness and masking estimation. Other aspects of it have appeared in other papers (including A model of loudness applicable to time-varying sounds, 487 citations, 2002).

More greatest papers in the next blog entry.
