Transcribing the spoken word

The advent and subsequent mass adoption of embedded Flash audio and video content, over the last five years or so, has brought with it many challenges, but none more important than the issue of universal access.

Take audio for instance. A person with impaired hearing may find getting the information they need from embedded audio difficult or impossible. How can we ensure that the audio content we publish is as non-restrictive as possible for our users?

This is where the guidance of leading advocacy groups can help. WCAG 2.0, from the W3C, requires the provision of alternatives for pre-recorded audio. It states that:

Unless the pre-recorded audio provides no more information than is already presented in text, you should provide an alternative that presents equivalent information.

WebAIM’s helpful WCAG 2.0 Checklist describes this ‘alternative’ as:

A descriptive text transcript (including all relevant auditory clues and indicators).

This need for text-based equivalents for non-textual web content is reinforced by legislation. In the US, Section 508 states that all Federal agencies shall:

Provide a text equivalent for every non-text element.

In the UK we have PAS 78 (and later this year BS 8878). Describing what’s expected from websites to comply with the Disability Discrimination Act, it states that:

Developers should consider the accessibility of any audio content on the website.

And goes on to say:

The inclusion of transcripts should also be considered.

So we’ve ascertained the need for word-for-word transcripts for both independent audio content and significant audio-based additions to related textual content. But what does a typical audio transcript look like, and how easy is one to produce?

Writing an audio transcript

Listen to the following sample of open source audio (duration: less than a minute). Its narrator briefly talks about the post-Gutenberg impact of the printed book.

A transcript for this audio would look and read like the following:

NARRATOR: After Gutenberg, realms of everyday life once ruled and served by memory would be governed by the printed page. In the late middle ages, for the small literate class, hand-written books had provided an aid, and sometimes a substitute, for memory. But the printed book was far more portable, more accurate, more convenient to refer to, and of course: more public. Whatever was in print after being written by an author was also known to printers, proofreaders and anyone reached by the printed page. A man could now refer to the rules of grammar, the speeches of Cicero, and the text of theology, canon law, and morality without storing them in himself.

The narrator’s clear delivery and considered pauses made for an easy and enjoyable transcription. But this isn’t always the case. You can encounter anything from poor audio quality and loud background noise to voices talking over one another and unfamiliar accents and words. All of these can make for a frustrating, drawn-out process of pause…rewind…play…listen…type…repeat.

You mustn’t lose heart though. If you cannot decipher a word or phrase then just remember that anyone listening to the audio itself will, most likely, encounter the same problem. I like to insert a simple [inaudible word] or [inaudible phrase] in this instance, but I’d also suggest offering a means of contact at the foot of your transcript so that anyone is free to provide suggestions if they so wish.
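If you want to see how many gaps a finished transcript still has before you publish it, a few lines of script can list them. What follows is only a minimal sketch in Python: the function name find_inaudible and the sample dialogue are made up for illustration, and it assumes you have used exactly the two markers suggested above.

```python
import re

# Matches the two markers suggested above: [inaudible word] and [inaudible phrase].
INAUDIBLE = re.compile(r"\[inaudible (?:word|phrase)\]")


def find_inaudible(transcript: str) -> list[tuple[int, str]]:
    """Return (line number, line text) for every line that contains a marker."""
    hits = []
    for number, line in enumerate(transcript.splitlines(), start=1):
        if INAUDIBLE.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical sample transcript, invented purely for this example.
sample = """HOST: Welcome back to the show.
GUEST: We launched the site in [inaudible word] last year.
HOST: And the response was [inaudible phrase], I gather."""

for number, line in find_inaudible(sample):
    print(f"line {number}: {line}")
```

A list like this is also a handy thing to include when you invite listeners to suggest the missing words.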

Transcribing a podcast offers a variety of challenges. Roundtable discussions or Skype call-ins will often act as a breeding ground for people cutting in mid-sentence to offer their opinion – causing voices to momentarily overlap. Using a combination of ellipses (…) and the marker (over), we can help to re-create some of the chaos. Take these extracts from a copy of BBC Radio 4’s Money Box Live transcript (PDF, 53k):

LEWIS: It’s legally possible now, but the scheme has to agree to it as well…
McLEAN: Absolutely.
LEWIS: …which they don’t all do.

McPHAIL: It still leaves decades of existing workers who are going to be retiring at 62 for years to come.
LEWIS: (over) I thought you might say that Tom.
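If your transcripts are assembled from notes rather than typed straight out, the ellipsis and (over) conventions shown in these extracts can be applied consistently with a small script. This is a rough sketch only, written in Python: the Turn structure, its field names and the render function are my own inventions for illustration, not part of any transcription tool, and the text is taken from the extracts above.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str
    text: str
    interrupted: bool = False  # speaker is cut off and picks the sentence up later
    resumes: bool = False      # speaker continues a sentence that was cut off
    overlaps: bool = False     # speaker talks over the previous speaker


def render(turns: list[Turn]) -> str:
    """Render turns as 'SPEAKER: text' lines using the ellipsis and (over) markers."""
    lines = []
    for turn in turns:
        text = turn.text
        if turn.resumes:
            text = "…" + text
        if turn.interrupted:
            text = text + "…"
        if turn.overlaps:
            text = "(over) " + text
        lines.append(f"{turn.speaker}: {text}")
    return "\n".join(lines)


turns = [
    Turn("LEWIS", "It’s legally possible now, but the scheme has to agree to it as well", interrupted=True),
    Turn("McLEAN", "Absolutely."),
    Turn("LEWIS", "which they don’t all do.", resumes=True),
]

print(render(turns))
```

Keeping the flags separate from the text means the spoken words themselves stay untouched, which fits the aim of a faithful transcript.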

We’re not looking to produce a literary classic, so try to resist the temptation to tidy things up too much. By all means cut out the inevitable ums and ers, but remember that a person with a language-based learning disability, such as dyslexia, may find it helps to listen to an audio file whilst reading the alternative text content. So you should always look to produce a faithful impression of the original recording.

But what about those instances when someone has stated something that is factually incorrect? Do you uphold the stance that whatever has been said must be written word-for-word? Not always. If the error is small – say someone’s name has been mispronounced – I’d correct it. But if it’s a personal opinion then, clearly, that has to be included – regardless of severity. If you’re at all worried about printing a statement you know to be incorrect, and don’t want to give the impression you’re endorsing it, then either insert a traditional [sic] beside the word or phrase in question or add a simple disclaimer at the beginning of the transcript. Something like:

The views expressed here do not necessarily reflect our own. Neither we nor the participants can guarantee the accuracy of this information.

If all this feels too much like hard work – maybe the thought of transcribing a twice-weekly podcast fills you with dread – then, thankfully, there are services that will help you. But always remember to check that each transcript actually matches the audio it represents.

When the spoken word is the message there will always be a need for text-based equivalents. Transcribing audio, where appropriate, should form part of the content planning and publishing process we go through. When we talk about our content having the ability to influence, educate, inform, and entertain, there has to be universal access. You owe it to your avid audience.

4 thoughts on “Transcribing the spoken word”

  1. Destry Wion

    Another great article, Richard. Nice to see someone making the bridge between CS and accessibility. However, this would seem to make clear a gap in your positioning model from “They’ll thank you later”.

    What seems to be missing now is the “Integrateur” role on which things like graphic charter, prototypes, markup, accessibility (transcription?), scripting, etc would be attached. I would say HTML = page templates if you’re talking about the markup kind, so that would be something to clear up with the Interaction Designer role too, I would guess.

    The Integrateur role would certainly be linked between the Interaction Designer and the Programmer, and the Content Strategist would be linked to the Integrateur via the prototypes and accessibility halos, at least.

    “Translation” (or Localisation/Internationalisation) is another halo that might at least have a dotted halo as many web projects are international these days. Whether that’s added to the Information Architect role or not, I’m not sure, but I’m going to have some translators have a look and let me know because I’m quite curious now what they would think. They’ll probably claim needing their own role in red, of course. LOL!


  2. Destry Wion

    Now that I think about it. There’s probably two levels of accessibility that suggest two kinds of roles. The accessibility embedded in markup is what I was talking about earlier and suited to a skilled integrateur these days. But transcription is a whole bigger thing outside of the Integrateur’s domain, and thus, like the Translator, could be another required role in the model. Question, I guess, is when do you stop for the sake of simplicity?

    I love your model, by the way, and enjoy playing with it. :)

  3. Richard Post author

    Thanks Destry. Simplicity was certainly a consideration! It’s all food for thought though.

    There are many roles and responsibilities I toyed with promoting on the model; one being the ‘Marketer’. They would have lightened the branding/advertising load on the copywriter and could have provided another helpful link to the marketing elements of SEO.

    A ‘Translator’ would indeed be an interesting addition. I too would like to hear how they see their position within this and other models of this ilk.

  4. Pingback: Friday content strategy: installment 3 « new media mentality