The Audio Engineering research team here submit a lot of conference papers. In our internal reviewing and when we review submissions by others, certain things come up again and again. I’ve compiled all this together as some general advice for putting together a research paper for an academic conference, especially in engineering or computer science. Of course, there are always exceptions, and the advice below doesn’t always apply. But its worth thinking of this as a checklist to catch errors and issues in an early draft.
Make sure the abstract is self-contained. Don’t assume the person reading the abstract will read the paper, or vice-versa. Avoid acronyms. Be sure to actually say what the results were and what you found out, rather than just saying you applied the techniques and analysed the data that came out.
The abstract is part summary of the paper, and part an advertisement for why someone should read the paper. And keep in mind that far more people read the abstract than read the paper itself.
Make clear what the problem is and why it is important. Why is this paper needed, and what is going to distinguish this paper from the others?
In the last paragraph, outline the structure of the rest of the paper. But make sure that it is specific to the structure of the paper.
Background/state of the art/prior work – this could be a subsection of introduction, text within the introduction, or its own section right after the introduction. What have others done, what is the most closely related work? Don’t just list a lot of references. Have something to say about each reference, and relate them to the paper. If a lot of people have approached the same or similar problems, consider putting the methods into a table, where for each method, you have columns for short description, the reference(s), their various properties and their assumptions. If you think no one has dealt with your topic before, you probably just haven’t looked deep enough 😉 . Regardless, you should still explain what is the closest work, perhaps highlighting how they’ve overlooked your specific problem.
Problem formulation – after describing state of the art, this could be a subsection of introduction, text within the introduction, or its own section. Give a clear and unambiguous statement of the problem, as you define it and as it is addressed herein. The aim here is to be rigorous, and remove any doubt about what you are doing. It also allows other work to be framed in the same way. When appropriate, this is described mathematically, e.g., we define these terms, assume this and that, and we attempt to find an optimal solution to the following equation.
The structure of this, the core of the paper, is highly dependent on the specific work. One good approach is to have quite a lot of figures and tables. Then most of the writing is mainly just explaining and discusses these figures and tables, and the ordering of these should be mostly clear.
A typical ordering is
Describe the method, giving block diagrams where appropriate
Give any plots that analyse and illustrate the method, but aren’t using the method to produce results that address the problem
Present the results of using your method to address the problem. Keep the interpretation of the results here short, unless detailed explanation of a result it is needed to justify the next result that is presented. If there is lengthy discussion or interpretation, then leave that to a discussion section.
Equations and notation
For most papers in signal processing and related fields, at least a few equations are expected. The aim with equations is always to make the paper more understandable and less ambiguous. So avoid including equations just for the sake of it, avoid equations if they are just an obvious intermediate step, or if they aren’t really used in any way (e.g. ‘we use the Fourier transform, which by the way, can be given in this equation. Moving on…’), do use equations if they clear up any confusion when a technical concept is explained just with text.
Make sure every equation can be fully understood. All terms and notation should be defined, right before or right after they are used in the text. The logic or process of going from one equation to the next should be made clear.
Tables and figures
Where possible, these should be somewhat self-contained. So one should be able to look at a figure and understand it without reading the paper. If that isn’t possible, then it should be understood just by looking at the figure and figure caption. If not, then by just looking at the figure, caption and a small amount of text where the figure is described.
Figure captions typically go immediately below figures, but table captions typically above tables.
Label axes in figures wherever possible, and give units. If units are not appropriate, make clear that an axis is unitless. For any text within a figure, make sure that the font size used is close to the font size of the main text in the paper. Often, if you import a figure from software intending for viewing on a screen (like matlab), then the font can appear miniscule when the figure is imported into a paper.
Make sure all figures and tables are numbered and are all referenced, by their number, in the main text. Position them close to where they are first mentioned in the text. Don’t use phrasing that refers to their location, like ‘the figure below’ or ‘the table on the next page’, partly because their location may change in the final version.
Make sure all figures are high quality. Print out the paper before submitting and check that it all looks good, is high resolution, and nicely formatted.
Discussion and future work may be separate sections or part of the conclusion. Discussion is useful if the results need to be interpreted, but is often kept very brief in short papers where the results may speak for themselves.
Future work is not about what the author plans to do next. Its about research questions that arose or were not addressed, and research directions that are worth pursuing. The answers to these research questions may be pursued by the author or others. Here, you are encouraging others to build on the work in this paper, and suggesting to them the most promising directions and approaches. Future work is usually just a couple sentences or couple paragraphs at the end of conclusion, unless there is something particularly special about it.
The conclusion should not simply repeat the abstract or summarise the paper, though there may be an element of that. Its about getting across what were the main things that the reader should take away and remember. What was found out? What was surprising? What are the main insights that arose? If the research question is straightforward and directly addressed, what was the answer?
The most important criterion for references is to cite wherever it justifies a claim, clarifies a point, identifies where an idea is coming from someone else, or helps the reader find pertinent additional material. If you’re dealing with a very niche or underexplored topic, you may wish to give a full review of all existing literature on the subject.
Aim for references to come from high impact, recent peer reviewed journal articles, or as close to that as possible. So for instance, choose a journal over a conference article if you can, but maybe a highly cited conference paper over an obscure journal paper.
Avoid using web site references. If the reference is essentially just a URL, then put that directly in the text or as a footnote, but not as a citation. And no one cares when you accessed the website so no need to say ‘accessed on [date]’. If it’s a temporary record that may have only been there for a short period of time before the paper submission date, its probably not a reliable reference, won’t help the reader and you should probably find an alternative citation.
Check your reference formatting, especially if you use someone else’s reference library or some automatically generated citations. For instance some citations will have a publisher and a conference name, so it reads as ‘the X Society Conference, published by the X Society.
Be consistent. So for instance, have all references use author initials, or none of them. Always use journal abbreviations, or never use them. Always include the city of a conference, or never do it. And so on.
Headphones have been around for over a hundred years, but recently there has been a surge in new technologies, spurred on in part by the explosive popularity of Beats headphones. In this blog, we will look at three advances in headphones arising from high tech start-ups. I’ve been introduced to each of these companies recently, but don’t have any affiliation with them.
EAVE (formerly Eartex) are a London-based company, who have developed headphones aimed at the industrial workplace; construction sites, the maritime industry… Typical ear defenders do a good job of blocking out noise, but make communication extremely difficult. EAVE’s headphones are designed to protect from excessive noise, yet still allow effective communication with others. One of the founders, David Greenberg, has a background in auditory neuroscience, focusing on hearing disorders. He brought that knowledge to the company. He used his knowledge of hearing aids to design headphones that amplify speech while attenuating noise sources. They are designed for use in existing communication networks, and use beam forming microphones to focus the microphone on the speaker’s voice. They also have sensors to monitor noise levels so that noise maps can be created and personal noise exposure data can be gathered.
This use of additional sensors in the headset opens up lots of opportunities. Ossic are a company that emerged from Abbey Road Red, the start-up incubator established by the legendary Abbey Road Studios. Their headphone is packed with sensors, measuring the shape of your ears, head and torso. This allows them to estimate your own head-related transfer function, or HRTF, which describes how sounds are filtered as they travel from to your ear canal. They can then apply this filtering to the headphone output, allowing sounds to be far more accurately placed around you. Without HRTF filtering, sources always appear to be coming from inside your head.
Its not as simple as that of course. For instance, when you move your head, you can still identify the direction of arrival of different sound sources. So the Ossic headphones also incorporate head tracking. And a well-measured HRTF is essential for accurate localization, but calibration to the ear is not perfect. So their headphones also have eight drivers rather than the usual two, allowing more careful positioning of sounds over a wide range of frequencies.
Ossic was funded by a Kickstarter campaign. Another headphone start-up, Ora, currently has a Kickstarter campaign. Ora is a venture that was founded at Tandem Launch, who create companies often arising from academic research, and have previously invested in research arising from the audio engineering research team behind this blog.
Ora aim to release ‘the world’s first graphene headphones.’ Graphene is a form of carbon, shaped in a one atom thick lattice of hexagons. In 2004, Andre Geim and Konstantin Novoselov of the University of Manchester, isolated the material, analysed its properties, and showed how it could be easily fabricated, for which they won the Nobel prize in 2010. Andre Geim, by the way, is a colourful character, and the only person to have won both the Nobel and Ig Nobel prizes, the latter awarded for experiments involving levitating frogs.
Graphene has some amazing properties. Its 200 times stronger than the strongest steel, efficiently conducts heat and electricity and is nearly transparent. In 2013, Zhou and Zettl published early results on a graphene-based loudspeaker. In 2014, Dejan Todorovic and colleagues investigated the feasibility of graphene as a microphone membrane, and simulations suggested that it could have high sensitivity (the voltage generated in response to a pressure input) over a wide frequency range, far better than conventional microphones. Later that year, Peter Gaskell and others from McGill University performed physical and acoustical measurements of graphene oxide which confirmed Todorovic’s simulation results. Interestingly, they seemed unaware of Todorovic’s work.
Graphene loudspeaker, courtesy Zettl Research Group, Lawrence Berkeley National Laboratory and University of California at Berkeley
Ora’s founders include some of the graphene microphone researchers from McGill University. Ora’s headphone uses a Graphene-based composite material optimized for use in acoustic transducers. One of the many benefits is the very wide frequency range, making it an appealing choice for high resolution audio reproduction.
I should be clear. This blog is not meant as an endorsement of any of the mentioned companies. I haven’t tried their products. They are a sample of what is going on at the frontiers of headphone technology, but by no means cover the full range of exciting developments. Still, one thing is clear. High-end headphones in the near future will sound very different from the typical consumer headphones around today.
The Audio Engineering team (C4DM) was present in this year’s edition of Sónar+D in Barcelona. Sónar+D is an international conference integrated to Sónar festival that focus on the interdisciplinary approach between creativity and technology.
The Sónar Innovation Challenge (SIC), co-organized by the MTG, <<is an online and on site platform for the creative minds that want to be one step ahead and experiment with the future of technology. It brings together innovative tech companies and creators, collaborating to solve challenges that will lead to disruptive prototypes showcased in Sónar+D.>>
In this year’s challenge, Marco Martínez was part of the enhanced dj assistant by the Music Technology Group at Universitat Pompeu Fabra, which challenged participants to create a user-friendly, visually appealing and musically motivated system that DJs can use to remix music collections in exciting new ways.
Thus, after nearly one month of online meetings, the challengers and mentors finally met at Sónar, and during 4 days of intensive brain-storming-programming-prototyping at more than 30°C the team came with ATOMIX:
From multitrack recording (stems) and using advanced algorithms and cutting edge technologies in feature extraction, clustering, synthesis and visualisation. It segments a collection of stems into atoms of sound and groups them by timbre similarity. Thus, through concatenative synthesis, ATOMIX allows you to manipulate and exchange atoms of sound in real-time with professional DAW controls, achieving a one-of-a-kind live music exploration.
The project is still in a prototype stage and we hope to hear news of development very soon.
At the recent Audio Engineering Society Convention, one of the most interesting talks was in the E-Briefs sessions. These are usually short presentations, dealing with late-breaking research results, work in progress, or engineering reports. The work, by Charalampos Papadokos presented an e-brief titled ‘Power Out of Thin Air: Harvesting of Acoustic Energy’.
Ambient energy sources are those sources all around us, like solar and kinetic energy. Energy harvesting is the capture and storage of ambient energy. It’s not a new concept at all, and dates back to the windmill and the waterwheel. Ambient power has been collected from electromagnetic radiation since the invention of crystal radios by Sir Jagadish Chandra Bose, a true renaissance man who made important contributions to many fields. But nowadays, people are looking for energy harvesting from many more possible sources, often for powering small devices, like wearable electronics and wireless sensor networks. The big advantages, of course, is that energy harvesters do not consume resources like oil or coal, and energy harvesting might enable some devices to operate almost indefinitely.
But two of the main challenges is that many ambient energy sources are very low power, and the harvesting may be difficult.
Typical power densities from energy harvesting can vary over orders of magnitude. Here’s the energy densities for various ambient sources, taken from the Open Access book chapter ‘Electrostatic Conversion for Vibration Energy Harvesting‘ by S. Boisseau, G. Despesse and B. Ahmed Seddik ‘.
You can see that vibration, which includes acoustic vibrations, has about 1/100th the energy density of solar power, or even less. The numbers are arguable, but at first glance it looks like it will be exceedingly difficult to get any significant energy from acoustic sources unless one can harvest over a very large area.
That’s where this e-brief paper comes in. Papadokos and his co-author, John Mourjopoulos, have a patented approach to harvesting the acoustic energy inside a loudspeaker enclosure. Others had considered harvesting the sound energy from loudspeakers before (see the work of Matsuda, for instance), but mainly just as a way of testing their harvesting approach, and not really exploiting the properties of loudspeakers. Papadokos and Mourjopoulos had the insight to realise that many loudspeakers are enclosed and the enclosure has abundant acoustic energy that might be harvested without interfering with the external design and without interfering with the sound presented to the listener. In earlier work, Papadokos and Mourjopoulos found that sound pressure within the enclosure often exceeds 130 dBs within a loudspeaker enclosure. Here, they simulated the effect of a piezoelectric plate in the enclosure, to convert the acoustic energy to electrical energy. Results showed that it might be possible to generate 2.6 volts under regular operating conditions, thus proving the concept of harvesting acoustic energy from loudspeaker enclosures, at least in simulation.
The 142nd AES Convention was held last month in the creative heart of Berlin. The four-day program and its more than 2000 attendees covered several workshops, tutorials, technical tours and special events, all related to the latest trends and developments in audio research. But as much as scale, it’s attention to detail that makes AES special. There’s an emphasis on the research side of audio topics as much as the side of panels of experts discussing a range of provocative and practical topics.
It can be said that 3D Audio: Recording and Reproduction, Binaural Listening and Audio for VR were the most popular topics among workshops, tutorial, papers and engineering briefs. However, a significant portion of the program was also devoted to common audio topics such as digital filter design, live audio, loudspeaker design, recording, audio encoding, microphones, and music production techniques just to name a few.
For this reason, here at the Audio Engineering research team within C4DM, we bring you what we believe were the highlights, the key talks or the most relevant topics that took place during the convention.
What better way to start AES than with a mastering experts’ workshop discussing about the future of the field? Jonathan Wyner (iZotope) introduced us to the current challenges that this discipline faces. This related to the demographic, economic and target formatting issues that are constantly evolving and changing due to advances in the music technology industry and its consumers.
When discussing the future of mastering, the panel was reluctant to a fully automated future. But pointed out that the main challenge of assistive tools is to understand artistry intentions and genre-based decisions without the need of the expert knowledge of the mastering engineer. Concluding that research efforts should go towards the development of an intelligent assistant, able to function as an smart preset that provides master engineers a starting point.
This paper described a method to digitally model an analogue dynamic range compression. Based on the analysis of processed and unprocessed audio waveforms, a generic model of dynamic range compression is proposed and its parameters are derived from iterative optimization techniques.
Audio samples were reproduced and the quality of the audio produced by the digital model was demonstrated. However, it should be noted that the parameters of the digital compressor can not be changed, thus, this could be an interesting future work path, as well as the inclusion of other audio effects such as equalizers or delay lines.
In the paper ‘Formal Usability Evaluation of Audio Track Widget Graphical Representation for Two-Dimensional Stage Audio Mixing Interface‘ an evaluation of different graphical track visualization styles is proposed. Multitrack visualizations included text only, different colour conventions for circles containing text or icons related to the type of instruments, circles with opacity assigned to audio features and also a traditional channel strip mixing interface.
Efficiency was tested and it was concluded that subjects preferred instrument icons as well as the traditional mixing interface. In this way, taking into account several works and proposals on alternative mixing interfaces (2D and 3D), there is still a lot of scope to explore on how to build an intuitive, efficient and simple interface capable of replacing the good known channel strip.
This tutorial, was based on the engineering brief ‘Quantization Noise of Warped and Parallel Filters Using Floating Point Arithmetic’ where warped parallel filters are proposed, which aim to have the frequency resolution of the human ear.
Thus, via Matlab, we explored various approaches for achieving this goal, including warped FIR and IIR, Kautz, and fixed-pole parallel filters. Providing in this way a very useful tool that can be used for various applications such as room EQ, physical modelling synthesis and perhaps to improve existing intelligent music production systems.
Abbey Road’s James Clarke presented a great poster with the actual algorithm that was used for the remixed, remastered and expanded version of The Beatles’ album Live at the Hollywood Bowl. The method achieved to isolate the crowd noise, allowing to separate into clean tracks everything that Paul McCartney, John Lennon, Ringo Starr and George Harrison played live in 1964.
The results speak for themselves (audio comparison). Thus, based on a Non-negative Matrix Factorization (NMF) algorithm, this work provides a great research tool for source separation and reverse-engineer of mixes.
Other keynotes worth to mention:
The rest of the paper proceedings are available in the AES E-library.