Recording and publishing are the first steps in creating a podcast. Here’s why you should transcribe the audio into text and how to do it.

Vector illustration of graph and chart marketing growth
Image via emojoez.

It’s impossible, these days, to overlook the value and importance of SEO. Search Engine Optimization is all about growing the quality and quantity of website traffic by increasing the visibility of a website on search engines.

SEO means catering to search engine algorithms to help you push your content toward people who may be looking for it. It’s such an incredible asset to us here at PremiumBeat, and it could help you take your podcast to greater heights.

There are some excellent ways to improve the SEO of your podcast. Use a searchable keyword to create a site that will host your podcast (some services like Transistor will help you create a website and publish your content).

Create descriptions of each episode or even write some blog posts to lay out everything discussed on the episode.

Also, be sure to include the keyword a few times in posts, link to relevant articles that deal with your topic, create some metadata, and include relatable images. All of these factors impact how search engines will rank your episode.

Importantly, however, transcribing your podcast and attaching the transcription to your episode is one of the most influential SEO factors you shouldn’t overlook. This will help your SEO ranking by giving sites like Google a good idea of everything discussed within the episode, expanding your searchability beyond something like a keyword.


What Is Transcription?

To put it simply, transcription means taking the audio recording from a podcast or a video and having a program (or a person) write out every word it hears, so you end up with a written transcript of the content you recorded.

Seems simple enough, right?


Options for Your Audience

I know “listens” are the most critical metric for your podcast. But, some people may want to skim over your transcript before downloading or playing.

A transcript is also invaluable to deaf and hard-of-hearing people. Listening may not be a viable option for them, and that shouldn’t count them out from enjoying your content.

I find that the more options you give your audience to engage in your material, the happier and more involved they’ll be across the board.


Critical Review

Take the transcription and read back over it. A transcription of your podcast, especially if you have timestamps included, is incredibly useful for reviewing your podcast.

It can give you an idea of where to insert chapter breaks within your podcast. And, honestly, it’s a good idea to have something in text form that you can review if you can’t go back and physically listen to your show.


Marketing Options

Vector illustration of likes, shares, and comments popping up on a mobile screen
Image via MaDedee.

These days, who decides to pick up your content and share it can make the difference between a podcast for your friends and a podcast for the entire world.

When you have a show’s transcription, media outlets or blogs might pick up the text version of the interview for quotes or share it as content on their site for an exciting discussion.

They may not want to share the entire podcast or spend the time grabbing clips from your show, but if they can easily access the transcription, media members might help take your podcast to audiences you never had access to in the first place.


What Makes for a Good Transcription

Illustration of audio files conversion into text format
Image via BSD.

As far as things you can control, most transcription services state that you’ll want as little background noise as possible for the best results, and the fewer guests, the better.

Accents tend to be difficult both for the software producing automated transcriptions and for the humans working on your project. Languages other than English tend to come at a higher price point, and even thick accents may cost more.

As for the companies you might be enlisting, the best things to look for are accuracy, how they’ll export your transcription, turnaround times, and identifying speakers (for example, if you’re in an interview, it’ll separate the guest and host).

You also want to check whether they’re automated or employ workers to verify quality, whether they provide timestamps within your transcription, what they charge per minute, and whether they have mobile apps that give you access to your projects across multiple devices.

Keep an eye out for any hidden charges or extra fees that might rear their ugly heads on the backend, too.


How to Transcribe Your Podcast

Of course, you can always roll up your sleeves and handle transcribing your audio yourself. There’s a massive advantage here—you’ll know better than any other human or machine what you said in the podcast. The DIY method can, however, be highly laborious and time-consuming.

One time, I transcribed an hour-long interview that somehow ended up taking over five hours to transcribe due to bad audio.

Before heading down this route, you should at least check out a free option like Otter (listed below). This service offers a playback feature so that the AI can set a road map of the transcription out for you. Then, you can go back and make finalized corrections later.

There are a ton of options out there for online transcription services. So, I narrowed it down to the best options that are still different enough from one another to warrant their specific category.

Online transcription companies will offer a few of the services listed below, and some offer all of them.

  1. Purely Automated: The AI listens to the audio you submitted and does the best it can to transcribe the contents. Most seem to average between 80-95% accuracy, depending on the outliers of background noises and how clearly the voices are speaking.
  2. Manual Transcription: These are transcribed and overseen by an actual person. Almost every company that offers this lists an accuracy rate of 99%, and the approach is highly effective. The common turnaround time for these projects is 12-36 hours for a written copy of your show.
  3. Strict Verbatim: This is where a person makes multiple passes at your audio to triple check that the transcription is perfect. These come with a 100% accuracy listing from the companies who offer it.

1. Rev: The Professional Choice

Logo with the word "Rev" in red letters on a white background
Image via rev.com.

Rev is becoming more and more of a household name for transcription services. They offer a multitude of services that are not only beneficial for your podcast, but could also prove to be an excellent asset for your video productions.

Rev offers affordable automated transcriptions for $.25/minute. The transcriptions aren’t guaranteed to be completely accurate, but they have an average turnaround of five minutes.

Rev also offers professional transcriptions for audio or video with human oversight for $1.25/minute that are guaranteed to be 99% accurate and have a turnaround of usually twelve hours.

Additionally, they have FCC- and ADA-compliant English captions and subtitles for $1.25/minute, foreign subtitles of over eight languages for $3-$7/minute, and foreign language written translations in over 35 languages for $.10/word.

This is a speedy and reliable service that many industry professionals currently rely on.


2. Scribie

Their automated service is more than 50% cheaper than Rev, with a cost of only $.10/minute. Scribie’s accuracy ranges from 80-95%, and the service offers a thirty-minute turnaround to give you a Word document of whatever you submit.

You’re looking at a $.80/min price tag for their manual transcription service with a 36-hour turnaround and 99% accuracy. Their service also includes audio time coding and speaker tracking within that price.

Plus, for an up-charge of just $.50/min, you can get strict verbatim transcripts that should be completely accurate. That brings the total to $1.30/min, so if you select the strict verbatim option, you’re only looking at $.05/min more than Rev’s $1.25/min for an even more reliable transcription.
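To make the price gap concrete, here is a minimal sketch that totals up what an episode would cost at the per-minute rates quoted in this article. The 60-minute episode length and the function name are assumptions for illustration, and the rates may not match what either company charges today.

```python
# Per-minute rates in US cents, copied from the prices quoted in this article.
# Treat them as a snapshot; current pricing may differ.
RATES_CENTS_PER_MIN = {
    "Rev automated": 25,
    "Rev manual": 125,
    "Scribie automated": 10,
    "Scribie manual": 80,
    "Scribie strict verbatim": 80 + 50,  # manual rate plus the $.50 up-charge
}


def episode_cost(service: str, minutes: int) -> float:
    """Return the cost in dollars of transcribing `minutes` of audio."""
    return RATES_CENTS_PER_MIN[service] * minutes / 100


if __name__ == "__main__":
    episode_minutes = 60  # hypothetical episode length
    for service in RATES_CENTS_PER_MIN:
        print(f"{service}: ${episode_cost(service, episode_minutes):.2f}")
```

The sketch ignores minimum fees and any extra charges, so treat the totals as a rough comparison rather than a quote.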


3. Otter – Best Free Option

Otter logo with blue letters on a white background
Image via Otter.

Otter is entirely automated, and the primary option gets you up to 600 minutes of transcription absolutely free. The free service will likely include everything you could want for transcribing your podcast.

Download the app so you can access your transcriptions across multiple machines. The app even has a function to play the audio against what it has transcribed, so you can listen to the audio and double-check the service’s AI.

I used Otter in my interview with Watchmen DP Gregory Middleton. To be honest, I had to go back multiple times to triple-check Otter’s work on our interview, which was only two voices and no background noise. This ended up being very time-consuming and a chore, but it did get me the basic outline for absolutely no charge.

So, I learned that if something is going to print, I’d much rather pay for a professional service, but if I need a transcript of something for review or a simple hard copy, Otter is a good option.

Also, if you were about to painstakingly transcribe your podcast by hand, then definitely use this service first, and then go back to correct any mistakes. You will save yourself hours.


4. Simon Says

One of the newer entries into the transcription world is Simon Says. Simon Says is all AI-focused in its transcription methods with a heavy emphasis on video.

While most transcription services are “pay as you go,” Simon Says’ pricing structure is monthly due to their plugins offered for several editing programs and collaborative tools provided within the software.

That being said, if you’re interested at all in adding video to your podcast (I also recommend doing this as the popularity of video-form podcasts has skyrocketed in the past five years), I think Simon Says might be an excellent option. My only hesitation is the price, so be ready to pay a little bit more.

The pricing is as follows:

Screenshot of Simon Says price options
Image via Simon Says.

5. Google Docs

One weird method that I’ll admit I was skeptical of is using Google Docs as a voice-to-text transcriber. The benefit is that you can have it running while you record your podcast.

To set it up, go to Tools>Voice Typing.

Screenshot of how to select Voice Typing in Google Docs

I included this on the list because it’s the easiest service for people to access; however, that doesn’t mean it’s accurate or a good fit for your business/podcast.

If I had to compare this to the accuracy of another transcription service, I don’t think it would be a fair comparison by a long shot. But, if you want to save some money, why not just try Google Docs?


6. Happy Scribe

Like most of the services listed in this article, Happy Scribe offers two ways of transcribing: by AI or by humans.

Their pricing is listed as such:

  • A.I. transcription is 85% accurate at $0.20/minute with a five-minute turnaround
  • Human transcription is 99% accurate at $1.95/minute with a twenty-four-hour turnaround

So, really, it comes down to you testing and playing with some of these AI transcription services to see which one works the best. You’ll also want to be sure and keep trying to capture the best possible audio quality you can get.

If you want some tips on recording good audio, check out our tutorials below.


Put ’Em to the Test!

We decided to pit three of these services (Rev, Scribie, and Otter) against each other in a head-to-head battle by having them all transcribe the same piece of audio. I purchased both the automated and the human-made manual transcriptions from Rev and Scribie. I also used Otter’s free automatic transcription.

The excerpt I chose was a selection from Darin Bradley’s novel Chimpanzee. The passage was two minutes long, and I voiced it personally in a quiet room, directly into a microphone. I would say this is a high-quality audio recording, probably much cleaner than what these services are used to receiving.


The Results: Automated Transcription

Closeup of a robot typing on a computer keyboard
Image via Mopic.

Every one of these companies was able to get me an automated transcription within three minutes, which is fantastic. Some of the main challenges each automated system faced were proper nouns, such as the name of the park, the character Sireen, and the spelling of the author’s name, Darin. They also struggled to interpret the audio grammatically, for instance with pauses and sentence breaks.

The winner here is Rev. I estimated that to fix the grammatical errors and words the AI got wrong would take me a total of four minutes, and their transcription cost me $1.00 (they have a minimum set fee of $1.00).

Now, this means that it’d take me four minutes to fix two minutes worth of an audio transcription, so please bear that in mind if you intend to submit something that could be hours long. But, it was able to lay down the foundation in a way, for just a buck, that probably saved me thirty minutes, which I deem well worth the cost.

The completely free Otter came exceptionally close behind. It churned out the transcription as one giant block of text, with slightly more errors than the Rev transcription. I should note that it was the only automated service to actually nail “excerpt” instead of typing “exert.”

I’d estimate it took me about five minutes to edit this transcription. Since the service is free and competitively accurate with a paid service, I recommend using Otter for automated transcriptions.
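If you want to budget editing time for a longer episode, the estimates above can be scaled up. This is a rough sketch that assumes editing effort grows linearly with audio length, which is my assumption here and not something the two-minute test proves. The ratios come from the four-minute (Rev) and five-minute (Otter) edits of the two-minute sample, and the function name is made up for illustration.

```python
# Editing minutes needed per minute of audio, based on the two-minute test clip.
REV_RATIO = 4 / 2    # four minutes of fixes for two minutes of audio
OTTER_RATIO = 5 / 2  # five minutes of fixes for two minutes of audio


def editing_minutes(audio_minutes: float, ratio: float) -> float:
    """Estimate cleanup time, assuming effort scales linearly with length."""
    return audio_minutes * ratio


if __name__ == "__main__":
    episode_minutes = 60  # hypothetical episode length
    print("Rev:", editing_minutes(episode_minutes, REV_RATIO), "minutes of editing")
    print("Otter:", editing_minutes(episode_minutes, OTTER_RATIO), "minutes of editing")
```

Keep in mind the test clip was one clear voice in a quiet room, so a real episode with guests or background noise would likely take longer than these numbers suggest.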

Coming in last was the paid automated service from Scribie. It did seem to cope with breaking up sentences and following the grammar of a spoken voice. But, compared to the other two, Scribie struggled the most.


The Results: Manual Transcription

Otter doesn’t offer a manual transcription service, so we have a direct head-to-head competition between Rev and Scribie for these paid services.

First off, I feel like they let the automated service tackle the transcription before sending over the final files to a human to work on. (This tactic makes sense, and it’s precisely what I recommend you do if you’re trying to transcribe a podcast by yourself.)

I say this because it’s one of the only ways I can wrap my head around one of Rev’s only mistakes. They nailed the spelling of the author’s name, Darin Bradley, the first time, but then later in the same line, spelled it differently.

Screenshot of an error made from a transcription
From the manual transcription by Rev.

Both companies list a 99% accuracy rate on their manual transcriptions, and they did, in fact, deliver, pretty much nailing the transcription. Rev spelled the character Sireen’s name correctly while Scribie still thought the name was Irene. This type of mistake is common to any transcription service. But, really, we are splitting (fictional character) hairs because both services were almost flawless.

So, the best way to compare these two services and know which one to recommend lies beyond the transcription. The Scribie service costs roughly two-thirds of what Rev costs ($.80/min versus $1.25/min), so if you’re submitting lengthy podcasts, this can save you quite a bit.

Scribie also included timestamps, which I find extremely valuable. Timestamps are especially useful if you add chapter breaks within your podcast episodes. I recommend doing this if you haven’t before. It’s much quicker than scrubbing through the edited audio to place your chapter. You search the transcription for the keyword for what you were talking about, and you have a timestamp for the exact spot you’re looking for.
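The keyword search described above can also be scripted if you handle a lot of episodes. Here is a minimal sketch that assumes a hypothetical transcript layout where every line starts with an [HH:MM:SS] stamp. That layout, the sample text, and the function name are all assumptions for illustration, so adjust the pattern to match whatever your service actually exports.

```python
import re

# Hypothetical transcript format: each line begins with an [HH:MM:SS] stamp.
LINE_PATTERN = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s*(.*)$")


def find_timestamps(transcript: str, keyword: str) -> list[str]:
    """Return the timestamp of every line that mentions `keyword`."""
    hits = []
    for line in transcript.splitlines():
        match = LINE_PATTERN.match(line.strip())
        # Case-insensitive search of the spoken text that follows the stamp.
        if match and keyword.lower() in match.group(4).lower():
            hours, minutes, seconds = match.group(1, 2, 3)
            hits.append(f"{hours}:{minutes}:{seconds}")
    return hits


# Made-up sample transcript, used only to show the idea.
SAMPLE = """\
[00:00:05] Welcome back to the show.
[00:03:40] Let's talk about microphones for a minute.
[00:17:12] Now, on to editing software.
"""

if __name__ == "__main__":
    print(find_timestamps(SAMPLE, "microphones"))  # prints ['00:03:40']
```

Each timestamp it returns is a candidate spot for a chapter marker, which you would still want to confirm by ear in your editor.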

Rev, however, does not include timestamps. And, despite only having one speaker, they broke the audio into nine paragraphs. This is unnecessary and actually involves more editing just to remove those breaks.

What Rev does offer (and this is big) is the ability, like Otter, to play back the audio at different speeds while the service highlights the exact word it transcribed from the audio. Playback makes the editing process so much easier.

Rev also allows you to take notes in the margin beside the text of the transcription and offers the ability to download the document how you please. Rev’s overall UX and design are much better than Scribie’s and, honestly, more aesthetically pleasing.

Scribie offers document downloads only, but in every kind of file you could want.

Gradient style sound wave in form of piles
Image via Sky vectors.

As far as turnaround time goes, Rev dominated with an incredible thirty-three minutes for the transcription. Meanwhile, Scribie took about thirty hours to transcribe the same piece of audio.

This is a massive difference and something to note if you’re in a crunch and can’t decide between the services.


The Winner: Scribie!

Ultimately, after testing the two, I recommend Scribie for strictly transcribing a podcast. It’s more affordable, and since the transcription is 99% correct from both services, you don’t need to do much editing. Just paste the text directly to your site.

Rev’s more advanced UX and editing features would win out if I needed audio transcribed for a published article. But, strictly speaking about podcast transcriptions, I’m going to pocket the extra cash I’d save and call it a day.


Here is the written excerpt that I read:

An excerpt from Chimpanzee by Darin Bradley, copyright 2014 Darin Bradley. (Reprinted with permission.)

They didn’t always shoot people.

In the beginning, when civic offenders were conscripted into the Homeland Renewal Project, they were monitored only by crew chiefs. Hourly employees with managerial experience. People used time sheets. Signatures. They carried their meals with them in paper bags. But when the crews organized, when they started collecting protection money, to keep you from harm at the hands of other people on the crew—gang affiliations, race riots—workers disappeared. The crews became micro-politics. They followed the examples of the mobs in the larger cities, looking for someone to blame. They carried weapons in their lunch bags. Renewal became a safe opportunity to sell your contraband, in your standard-issue, reflective red jumpsuit.

They deputized the crew chiefs. Gave them shotguns. At first, they tried non-lethal rounds, but those caused an uprising. So, they killed a few. It no longer makes the news.

The lien against my education is twenty-three pages long. It contains abbreviated transcripts of my yearly audits, when I, like every other student borrower, sat down in the loan therapist’s office on campus and let him index my cognitive chemical tendencies, my entrained associations, my affective self-models, which source most of my intellect.

It’s important to remember that we are not “in charge.” You don’t own your body, it owns you. It’s the same thing.

You don’t own your education. It’s on loan until you pay it off.

I am good at being unemployed. I can act interested and positive when Sireen, my wife, calls to check on me in the middle of the day. She stays concerned about my moods. About all of this. I am good at walking downtown—from our borough at the other end of the city because Sireen and I lease only one car, which she needs for the job she still has. I know which blocks are the most vacant, so to avoid them. I know whom to talk to. I know which times of day are safe for spending an hour in Sentinel Park, in the heart of downtown, doing nothing but being a guy with a coffee sitting in a park.



Cover image via Anselia.