Image courtesy of Gerard Siderius Unsplash.com

The Year of AI: the Good, the Bad & the Ugly

In 2015, I had a subscription with Reader’s Digest that my mum had got me for a birthday present, and I was surprised when one issue had the headline: 2015: The Year of Podcasting. This rang true, as many people in my workplace (a radio station at the time) were becoming interested in it, as well as the general public becoming aware of it — not quite the critical mass that hit during COVID — but it was getting there.

Well, I sort of feel that this year, 2024, has been the Year of AI. Even though I first started hearing about ChatGPT last year, the volume of AI discussion seems to have gone through the roof, and awareness of AI among the general public is at an all-time high.

One of the reasons many of us adults are thinking about AI is that we are worried about how it will affect our job prospects. ‘Robots got my job’ has been a kind of cynical joke since last century, with industrial jobs often being made more efficient through technological automation, moving people away from making things by hand. But this has meant a persistent fear among all of us in the workforce about whether our jobs have any kind of longevity.

A case in point: the company that supplies my editing software, which I pay a yearly subscription for, has also just started putting out free AI tools for cleaning up audio. When I was first told about this, I felt angry, and thought: they are doing me out of a job, but are still happy to take my money!

But the truth is that many of the editing processes I use already have a degree of AI. The manual labour aspect is still there (it takes about four hours to edit one hour of raw audio), but there are also automation tools that help me along the way, like a plugin that removes ‘mouth clicks’ (the noise that saliva makes) and a compressor to duck down sudden peals of laughter that might otherwise cause ‘clipping’ and a little bit of distortion.

There is not yet a button that I can press to have everything done for me, though, which is what some people suspect happens with technology-type jobs. Editing is a part of podcasting that is a bit of a grind, and podcasters often find they don’t like doing it, as it involves listening to people say the same thing over and over again, to hear if you have successfully removed an ‘umm’ without leaving behind an audible ‘edit point’.

But if it wasn’t for sophisticated editing software that barely existed thirty years ago, I would not be able to do so much with sound. Back in the 70s, there was a mother and daughter dialogue editor team who could clean up noisy audio, just with recorded magnetic tape, removing mouth clicks and background industrial noise — and they won a special Oscar for it! That’s how specialised a skill this kind of work was previously.

Anyway, this year has been and still is a reckoning year for me with technology, as I’ve become friends with a guy from Germany who runs an AI company specifically focused on podcasting.

I had actually started a new job at TAFE, which is a state-run technical college in Australia, teaching podcast recording and editing to students. I was about halfway through the semester when I found a tool on LinkedIn that this guy had posted: you could upload your finished podcast, and the tool would tell you whether it was well mixed or not, and suggest what you could do to get it up to an optimum level.

The tool was checking four things: 1) whether the podcast met the overall loudness target that podcasting platforms ask for; 2) whether there were any peaks, like sudden laughs, that might cause distortion; 3) whether there was any background noise that might be distracting for the listener (like aircon, construction work outside, etc.); and 4) whether there was a wide dynamic range that might present to the listener as fluctuating audio levels.
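For anyone curious what checks like these might look like in code, here is a minimal Python sketch. It is not how Patrick’s tool works, which I have no inside knowledge of. The function names and every threshold are assumptions made up for illustration, and the loudness figure is plain RMS level in dBFS, which is only a rough stand-in for the LUFS measurement that platforms actually specify.

```python
import math


def dbfs(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to decibels."""
    return 20 * math.log10(max(amplitude, 1e-9))  # floor avoids log(0)


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def tone(amplitude, n, period=100):
    """Hypothetical helper: a sine test tone standing in for real audio."""
    return [amplitude * math.sin(2 * math.pi * i / period) for i in range(n)]


def grade_mix(samples, window=1000, target_db=-16.0, tolerance_db=2.0,
              peak_limit_db=-1.0, noise_limit_db=-50.0, range_limit_db=15.0):
    """Run four rough checks on mono samples in the range -1.0 to 1.0.

    All default thresholds are illustrative assumptions, not official targets.
    """
    if len(samples) < window:
        raise ValueError("need at least one full window of audio")

    # Level of each consecutive, non-overlapping window.
    levels = [dbfs(rms(samples[i:i + window]))
              for i in range(0, len(samples) - window + 1, window)]

    overall = dbfs(rms(samples))                # 1) overall loudness (RMS, not LUFS)
    peak = dbfs(max(abs(s) for s in samples))   # 2) highest single sample
    noise_floor = min(levels)                   # 3) quietest window, e.g. a pause

    # 4) spread between the loudest and quietest "active" windows,
    #    ignoring pauses more than 20 dB below the overall level.
    active = [lv for lv in levels if lv > overall - 20.0] or levels
    dynamic_range = max(active) - min(active)

    return {
        "loudness_ok": abs(overall - target_db) <= tolerance_db,
        "peaks_ok": peak <= peak_limit_db,
        "noise_ok": noise_floor <= noise_limit_db,
        "range_ok": dynamic_range <= range_limit_db,
    }


if __name__ == "__main__":
    # Nine windows of steady "speech" followed by one very quiet pause.
    demo = tone(0.2241, 9000) + tone(0.001, 1000)
    print(grade_mix(demo))
```

The quietest window is used as the noise estimate on the assumption that a recording contains at least one pause where only the room can be heard. A recording with no pauses would fail that check for the wrong reason, which is one of many ways a real tool would need to be smarter than this.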

I immediately thought this would be a great tool to share with the students, and I left a comment for the guy who had originally shared it, whose name was Patrick.

Patrick replied to me, and then we did a bit of back and forth on LinkedIn, and I discovered that although he was based in Germany, he had also studied Computer Science in Australia, in the state next to mine. Also, my Dad had travelled as a backpacker in the 1970s through the town Patrick is from, near the Swiss border, and had written about it in a diary of his that I have. I didn’t realise this until after Patrick and I started talking, when the name of his town sounded familiar to me!

Anyway, after talking to him, I learnt that the podcast grader tool was not the main thing he had worked on. His main platform, LemonSpeak, was for creating content to help you market your podcast. You upload your podcast episode or interview, and it creates a transcript, which is immediately useful for podcasters, particularly when it comes to editing.

But it also creates SEO material, based on an analysis of the interview. It suggests headings to use as a podcast title that might improve its searchability in Apple Podcasts, and it also gives a summary of the episode, creates show notes, and writes social media posts that you can copy and paste when sharing your podcast.

Although I had a great aversion to AI, particularly ChatGPT, after seeing it in action, I also wanted to learn a bit more about it, and at least understand it. I know that, deep down, part of it was fuelled by the age-old fear of not wanting to be left behind. But on a more positive, altruistic side, I thought that it was something I could teach to my class, if I could see how it would fit in with a normal podcast workflow.

But I was also afraid of betraying my principles if I went over to the dark side of AI. Already, I had a feeling that the big tech companies were an enemy of creative people, as they had a track record of not really treating creative work, and the people who make it, with great integrity. There was an expectation that the little person would give their work away for free with the promise of wider exposure by the tech platform. But if the same thing was asked of the tech companies (share your source code with us and we’ll help give you exposure), it would be considered an outrageous ask!

But there was one more fear . . . a fear of my own ideas becoming ‘contaminated’ by someone else’s. Even worse than that, with AI I will not even know whose idea it originally was; there is no paper trail of influence. When you write something original yourself, like a short story, you are often aware of who you are mimicking in a particular section of your story (oh, that’s like a Charles Bukowski line, and that syntax is similar to Salman Rushdie, etc.). But if I was to feed my story into ChatGPT and get a better version of it, the polished version comes back, but I have no idea whose polish it actually is! What ingested material has contributed to improving my own work?

These are maybe extreme views about AI, and I think when Patrick reads this, he won’t be happy!

But there was another thing I discovered when I used his software that has sort of stayed with me. There was a project that a friend and I had been working on about the horror comics of the 1950s. It is a fictionalised story about two young guys working at a comics company in New York that has just started a new line of horror comics. The story opens with the narrator describing them hard at work in their run-down office, with their supervisor Walt watching over them, until they put down their pencils and go for lunch at a nearby deli. Over a sandwich, they talk about how they feel about the work they are doing: how it’s giving one of them nightmares, and how the other is too ashamed to tell his parents where he is working and so has been lying to them. They also talk about some of the discrimination they feel, as one of them is Italian. They get so immersed in their conversation that they suddenly see the time and run back to their office, and that’s where the story, as it is so far, ends.

The audio of the story is a single-person narration, mixed with some music and sound design, which I did over a Christmas break a few years ago. Just to test out Patrick’s software, I uploaded this audio story, and first had a read of the transcript it created, which turned out alright.

But it was when I read the one-paragraph summary that I was completely shocked.

I just could not believe how well the AI had analysed the entire story, and was able to pick out what the most significant themes were.

Not only that, it had been able to draw them from separate sections of the story and connect them up, so that the fears of the two boys were linked together in the one sentence:

“Our friend Leonard’s having nightmares from the horror stories we’re drawing, and Mike’s worried about his folks finding out what he really does for a living.”

This wasn’t stealing from some other greater-than-me writer’s book or website. This was a deep analytical moment, where it was using my content to actually analyse it and summarise it in a succinct way, and reflect it back to me the way a friend who has listened to a story I’ve just told might in conversation. It objectively picked out the most salient aspects, in a way that I might not have noticed or been able to do on my own, as you lose some clarity when you are emotionally close to a piece of work.

I know lots of people reading this will go, duh! I’m making it seem more precious than it is — lots of people have had this experience using AI, which is why they use it — and it’s an obvious reason. The AI is acting as a ‘sounding board’ — not creating the content for you, but giving you an educated reflection of your own work, and offering you suggestions or ways of improving it — or even just the highlights you should focus on.

So, maybe in a way, it’s working as an ‘editor’ with you. And you can determine at what level you create your own work — and at what point, you want some editorial reflection.

And going back to podcasting now, when we make a podcast, usually, all of our effort goes into recording and producing it — and by the time we finish three or four episodes, we are burnt out, and just want to get it out there.

I have actually had this problem in the past: I co-produced a podcast on a specific topic that was then uploaded to Apple Podcasts, and because we did not use a good keyword in the title or in the metadata, the podcast did not turn up when we searched for it by topic. And although I tried to fix this later by changing the episode headings and description, it did not improve the situation.

So it is very important to carefully plan how you release your podcast, choosing the right name, titles, and sub-titles — as well as show notes and metadata — which all go towards making it searchable in podcast databases. This would be the final stage of making a podcast which I would think of as “marketing and distribution.”

Often, people spend so much time making a good podcast — or actually, any kind of creative work — just to release it into the world, and watch it sink into the sea of noise without a trace. It’s very disheartening! So, it’s important to try and learn and understand a little bit about this last stage of podcast production, and see if anything can help you with it.

I am still testing out Patrick’s software and working out how it fits in with a solid podcasting workflow.

But more than that: when I started talking to him, I thought that he was someone genuine, who was also smart enough to come up with something that would help people in the long run. If you are interested, have a look at his software and see for yourself.

Anyway, let’s see if this really is the Year of AI, or if, like podcasting, it’s a hot topic for a while that then simmers down to a slow boil.

