How does this sound? Evaluating audio technologies

The audio engineering team here have done a lot of work on audio evaluation, both in collaboration with companies and as an essential part of our research. Some challenges come up time and time again, not just in terms of formal approaches, but also in terms of just establishing a methodology that works. I’m aware of cases where a company has put a lot of effort into evaluating the technologies that they create, only for it to make absolutely no difference in the product. So here are some ideas about how to do it, especially from an informal industry perspective.

– When you are tasked with evaluating a technology, you should always maintain a dialogue with the developer. More than anyone else, he or she knows what the tool is supposed to do, how it all works, what content might be best to use and has suggestions on how to evaluate it.

subjective evaluation details

– Developers should always have some test audio content that they use during development. They work with this content all the time to check that the algorithm is modifying or analysing the audio correctly. We’ll come back to this.

– The first stage of evaluation is documentation. Each tool should have some form of user guide, tester guide and developer guide. The idea is that if the technology remains unused for a period of time and those who worked on it have moved on, a new person can read the guides and have a good idea how to use it and test it, and a new developer should be able to understand the algorithm and the source code. Documentation should also include test audio content, preferably both input and output files with information on how the tool should be used with this content.

– The next stage of evaluation is duplication. You should be able run the tool as suggested in the guide and get the expected results with the test audio. If anything in the documentation is incorrect or incomplete, get in touch with the developers for more information.

– Then we have the collection stage. You need test content to evaluate the tool. The most important content is that which shows off exactly what the tool is intended to do. You should also gather content that tests challenging cases, or content where you need to ensure that the effect doesn’t make things worse.

– The preparation stage is next, though this may be performed in tandem with collection. With the test content, you may need to edit it, in order that its ready to use in testing. You may also want to create manually create target content, demonstrating ideal results, or at least of similar sound quality to expected results.

– Next is informal perceptual evaluation. This is lots of listening and playing around with the tool. The goal is to identify problems, find out when it works best, identify interesting cases, problematic or preferred parameter settings.


– Now on to semi-formal evaluation. Have focused questions that you need to find the answer to and procedures and methodologies to answer them. Be sure to document your findings, so that you can say what content causes what problem, how and why, etc. This needs to be done so that the problem can be exactly replicated by developers, and so that you can see if the problem still exists in the next iteration.

– Now comes the all-important listening tests. Be sure that the technology is at a level such that the test will give meaningful results. You don’t want to ask a bunch of people to listen and evaluate if the tool still has major known bugs. You also want to make sure that the test is structured in such a way so that it gives really useful information. This is very important, and often overlooked. Finding out that people preferred implementation A over implementation B is nice, but its much better to find out why, and how much, and if listeners would have preferred something else. You also want to do this test with lots of content. If, for instance only one piece of content is used in a listening test, then you’ve only found out that people prefer A over B for one example. So, generally, listening tests should involve lots of questions, lots of content, and everything should be randomised to prevent bias. You may not have time to do everything, but its definitely worth putting significant time and effort into listening test design.

Keeping Score for the Team

We’ve developed the Web Audio Evaluation Toolbox, designed to make listening test design and implementation straightforward and high quality.

– And there is the feedback stage. Evaluation counts for very little unless all the useful information gets back to developers (and possibly others), and influences further development. All this feedback needs to be prepared and stored, so that people can always refer back to it.

– Finally, there is revisiting and reiteration. If we identify a problem, or a place for improvement, we need to perform the same evaluation on the next iteration of the tool to ensure that the problem has indeed been fixed. Otherwise, issues perpetuate and we never actually know if the tool is improving and problems are resolved and closed.

By the way, I highly recommend the book Perceptual Audio Evaluation by Bech and Zacharov, which is the bible on this subject.


So you want to write a research paper

The Audio Engineering research team here submit a lot of conference papers. In our internal reviewing and when we review submissions by others, certain things come up again and again. I’ve compiled all this together as some general advice for putting together a research paper for an academic conference, especially in engineering or computer science. Of course, there are always exceptions, and the advice below doesn’t always apply. But its worth thinking of this as a checklist to catch errors and issues in an early draft.

Make sure the abstract is self-contained. Don’t assume the person reading the abstract will read the paper, or vice-versa. Avoid acronyms. Be sure to actually say what the results were and what you found out, rather than just saying you applied the techniques and analysed the data that came out.
The abstract is part summary of the paper, and part an advertisement for why someone should read the paper. And keep in mind that far more people read the abstract than read the paper itself.
Make clear what the problem is and why it is important. Why is this paper needed, and what is going to distinguish this paper from the others?
In the last paragraph, outline the structure of the rest of the paper. But make sure that it is specific to the structure of the paper.

Background/state of the art/prior work – this could be a subsection of introduction, text within the introduction, or its own section right after the introduction. What have others done, what is the most closely related work? Don’t just list a lot of references. Have something to say about each reference, and relate them to the paper. If a lot of people have approached the same or similar problems, consider putting the methods into a table, where for each method, you have columns for short description, the reference(s), their various properties and their assumptions. If you think no one has dealt with your topic before, you probably just haven’t looked deep enough 😉 . Regardless, you should still explain what is the closest work, perhaps highlighting how they’ve overlooked your specific problem.

Problem formulation – after describing state of the art, this could be a subsection of introduction, text within the introduction, or its own section. Give a clear and unambiguous statement of the problem, as you define it and as it is addressed herein. The aim here is to be rigorous, and remove any doubt about what you are doing. It also allows other work to be framed in the same way. When appropriate, this is described mathematically, e.g., we define these terms, assume this and that, and we attempt to find an optimal solution to the following equation.

The structure of this, the core of the paper, is highly dependent on the specific work. One good approach is to have quite a lot of figures and tables. Then most of the writing is mainly just explaining and discusses these figures and tables, and the ordering of these should be mostly clear.
A typical ordering is
Describe the method, giving block diagrams where appropriate
Give any plots that analyse and illustrate the method, but aren’t using the method to produce results that address the problem
Present the results of using your method to address the problem. Keep the interpretation of the results here short, unless detailed explanation of a result it is needed to justify the next result that is presented. If there is lengthy discussion or interpretation, then leave that to a discussion section.

Equations and notation
For most papers in signal processing and related fields, at least a few equations are expected. The aim with equations is always to make the paper more understandable and less ambiguous. So avoid including equations just for the sake of it, avoid equations if they are just an obvious intermediate step, or if they aren’t really used in any way (e.g. ‘we use the Fourier transform, which by the way, can be given in this equation. Moving on…’), do use equations if they clear up any confusion when a technical concept is explained just with text.
Make sure every equation can be fully understood. All terms and notation should be defined, right before or right after they are used in the text. The logic or process of going from one equation to the next should be made clear.
Tables and figures
Where possible, these should be somewhat self-contained. So one should be able to look at a figure and understand it without reading the paper. If that isn’t possible, then it should be understood just by looking at the figure and figure caption. If not, then by just looking at the figure, caption and a small amount of text where the figure is described.
Figure captions typically go immediately below figures, but table captions typically above tables.
Label axes in figures wherever possible, and give units. If units are not appropriate, make clear that an axis is unitless. For any text within a figure, make sure that the font size used is close to the font size of the main text in the paper. Often, if you import a figure from software intending for viewing on a screen (like matlab), then the font can appear miniscule when the figure is imported into a paper.
Make sure all figures and tables are numbered and are all referenced, by their number, in the main text. Position them close to where they are first mentioned in the text. Don’t use phrasing that refers to their location, like ‘the figure below’ or ‘the table on the next page’, partly because their location may change in the final version.
Make sure all figures are high quality. Print out the paper before submitting and check that it all looks good, is high resolution, and nicely formatted.


Discussion/Future work/conclusion
Discussion and future work may be separate sections or part of the conclusion. Discussion is useful if the results need to be interpreted, but is often kept very brief in short papers where the results may speak for themselves.
Future work is not about what the author plans to do next. Its about research questions that arose or were not addressed, and research directions that are worth pursuing. The answers to these research questions may be pursued by the author or others. Here, you are encouraging others to build on the work in this paper, and suggesting to them the most promising directions and approaches. Future work is usually just a couple sentences or couple paragraphs at the end of conclusion, unless there is something particularly special about it.
The conclusion should not simply repeat the abstract or summarise the paper, though there may be an element of that. Its about getting across what were the main things that the reader should take away and remember. What was found out? What was surprising? What are the main insights that arose? If the research question is straightforward and directly addressed, what was the answer?


The most important criterion for references is to cite wherever it justifies a claim, clarifies a point, identifies where an idea is coming from someone else, or helps the reader find pertinent additional material. If you’re dealing with a very niche or underexplored topic, you may wish to give a full review of all existing literature on the subject.
Aim for references to come from high impact, recent peer reviewed journal articles, or as close to that as possible. So for instance, choose a journal over a conference article if you can, but maybe a highly cited conference paper over an obscure journal paper.
Avoid using web site references. If the reference is essentially just a URL, then put that directly in the text or as a footnote, but not as a citation. And no one cares when you accessed the website so no need to say ‘accessed on [date]’. If it’s a temporary record that may have only been there for a short period of time before the paper submission date, its probably not a reliable reference, won’t help the reader and you should probably find an alternative citation.
Check your reference formatting, especially if you use someone else’s reference library or some automatically generated citations. For instance some citations will have a publisher and a conference name, so it reads as ‘the X Society Conference, published by the X Society.
Be consistent. So for instance, have all references use author initials, or none of them. Always use journal abbreviations, or never use them. Always include the city of a conference, or never do it. And so on.

What a PhD thesis is really about… really!

I was recently pointed to a blog post about doing a PhD. It had lots of interesting advice, mainly along the lines of ‘if you are finding it difficult, don’t worry, that probably means you’re doing it right.’ True, and good advice to keep in mind for PhD researchers who might be feeling lost in the wilderness. But it reminded me that I’d recently given a talk about PhD research, based on experience I have either examining or supervising dozens of theses, and some of the main points that I made are worth sharing. And I think they are applicable to research-based PhDs across lots of different disciplines.

First off, lets think of a few things that a PhD thesis is not supposed to be;


  • A thesis isn’t easy

See the blog I mentioned above. Easy research may still be publishable, but its not going to make a thesis. If you’re finding it easy, you’re probably missing the point.

  • A thesis is not only what you already know

I’ve known researchers unwilling to learn a bit of new maths, or learn what’s going on under the hood in the software they use. Expect the research to lead you out of your comfort zone.

  • A thesis isn’t just something you do to get a phd

It’s not simply a box that needs to be checked off so that you can get ‘Doctor’ next to your name.

  • A thesis isn’t obvious

If you and most others can predict the outcome in advance based on common sense, then why do it?

  • A thesis isn’t just several years of hard work

It may take years of hard work to achieve, but that’s not the point. You don’t get a PhD just for time and effort.

  • A thesis isn’t about building a system

that’s challenging and technical, and may be a byproduct of the research, but its not the research result.

  • A thesis isn’t a lot of little achievements

I’ve seen theses that read a bit like ‘I did this little interesting thing, then this other one, then another one…’ That doesn’t look good. If no one contribution is strong enough to be a thesis, then just putting them all into one document still isn’t a strong contribution. Note that in some cases, you can do a ‘thesis by publication’, which is a collection of papers, usually with an introduction and some wrapper information. But in which case it should still tie together with an overall contribution.

So with that in mind, lets now think about what a thesis is, with a few highlighted aspects that are often neglected.


  • A thesis advances knowledge

That’s the key. Some new understanding, new insights, backed up by evidence and critical thinking. But this also suggests that it needs to actually be an advance, so you really need to know the prior art. How much reading of the literature one should do is a different question, and depends on the topic, the field, and the researcher. But in my experience, researchers generally don’t explore the literature deep enough. One thing is for sure though; if the researcher ever makes the claim that no one has done this before, they better have the evidence to back that up.

  • A thesis is an argument

The word ‘thesis’ comes from Greek, and means an argument in the sense of putting forth a position. That means that there needs to be some element of controversy in the topic, and the thesis provides strong evidence supporting a particular position. That is, someone knowledgeable in the field could read the abstract and think, ‘no, I don’t believe it,’ but then change his or her mind after reading the whole thesis.

  • A thesis tells a story

People tend to forget that it’s a book. Its meant to be read, and in some sense, enjoyed. So the researcher should think about the reader. I don’t mean it should be silly or salacious, but it should be engaging, and the researcher should always consider whether they (or at least some people in the field) would want to read what they’d written.