Why transcribe things by hand when you can use machine-learning algorithms to get the job done? Let’s explore some options.

In my last video, I tested out a tool that uses AI tech to perform basic object removals. The results were quite impressive.

Today I’m testing another tool from the same company (RunwayML) that’s all about creating subtitles, which is a bit more beneficial for the masses. And, just like last time, I’m going to be pushing the tech to the limit to see just how well it performs. 


Why Use Subtitles?

While not everyone needs to perform object removals, almost everyone can benefit from adding subtitles to their videos. Don’t believe me? The numbers say otherwise.

A recent survey of US consumers shows that 92% watch videos with the sound off on mobile. It makes sense when you think of all those people waiting in a line somewhere, mindlessly scrolling their social feeds.

Imagine the millions of people in bed, ignoring their significant other via a mobile screen. Or, the number of teenagers in a classroom ignoring a teacher, vegging out to a video.

[Image: Man explaining the importance of subtitles]

There are plenty of scenarios where having the sound on is inappropriate, and subtitles are here to help. The survey also shows that 80% of consumers are more likely to watch a video all the way through when subtitles are available. Those are pretty huge numbers.

Text can help grab a viewer’s attention and keep them interested. It can add context and help get a viewer locked into your message quickly. Most folks might swipe on by without text, never to return.

So, how can you add subtitles to a video?


The Technology is . . . There? 

Luckily, technology has finally reached a place where creators are no longer required to pay considerable fees to get a transcription of a video made—or, even worse, attempt to transcribe everything by hand themselves. Good luck with that. 

[Image: How to drag and drop video files]

These days, YouTube will automatically caption a video soon after it’s uploaded. What’s more, this feature is available in over ten languages.

As if that wasn’t cool enough, auto-captioning for live streams is currently being rolled out to English channels.

[Image: Caption speaking to what is visually happening on screen]

How is this possible? Well, Google uses machine-learning algorithms for their auto-captioning feature. It’s still imperfect, however, as Google writes in their documentation:

…the quality of the captions may vary. We encourage creators to add professional captions first. YouTube is constantly improving its speech recognition technology. However, automatic captions might misrepresent the spoken content due to mispronunciations, accents, dialects, or background noise. You should always review automatic captions and edit any parts that haven’t been properly transcribed.

– Google

Now that you’ve learned a bit about subtitles, let’s see how to create some. 


Subtitles vs. Captions

Before people start sending angry letters, let me get one thing straight: captions and subtitles are not the same thing.

Captions are designed primarily for viewers who are deaf or hard of hearing. They transcribe the full audio track, describing sound effects, music, and other non-speech elements in text.

Subtitles, on the other hand, transcribe or translate only the spoken word. Captions give much more context to a scene than subtitles do.

[Image: Screenshot of how to edit captions]

Burned-In Subtitles vs. SRT Files

Before actually creating the subtitles, you’ll first want to think about the viewer’s experience.

How will they see the subtitles? Will they have to click a button to activate them, or can you make them appear on-screen for every viewer, every time?

This can be challenging if you share a video across multiple social media platforms. 

[Image: Screenshot of how to select closed captions]

While there are various subtitle formats, the main distinction for the layman is whether the subtitles are “burned” into the video. Burned-in text is rendered onto the frames themselves, so it’s always on screen. It’s permanent. This differs from sidecar subtitles, like SRT files, which viewers can toggle on and off, as you see on YouTube.
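For reference, a toggleable “sidecar” subtitle file is usually just plain text. A minimal SRT file (with made-up cues) looks like this: each cue is a sequence number, a start and end timecode, and the text to display.

```
1
00:00:01,000 --> 00:00:03,500
Why transcribe by hand?

2
00:00:03,500 --> 00:00:06,000
Let the algorithm do it.
```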

With such a vast number of people watching videos sans sound, creators will benefit from the burn-in method. Let’s jump into Runway and see how the burn-in process works.
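If you’re working outside Runway, you can also burn an SRT file into a video with a tool like FFmpeg. A sketch, assuming FFmpeg is installed and the filenames are placeholders for your own:

```
ffmpeg -i input.mp4 -vf "subtitles=subs.srt" output.mp4
```

The `subtitles` video filter renders each cue onto the frames, producing a permanently subtitled copy of the video.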


How to Create Subtitles in Runway

Step 1: Upload a Clip

[Image: Screenshot of how to upload a clip]

The cool thing about Runway is that it’s a browser-based tool, allowing users to create subtitles with nothing more than a browser and a basic Internet connection.

Step 2: Generate the Subtitles

[Image: Screenshot of choosing a language for your subtitles]

Once the clip is inside Runway’s video editor, select it and navigate to the Subtitle tab. Select the flavor of English (or Spanish), and press Start.

Like Google, Runway uses machine-learning algorithms to generate the subtitles automatically. In my experience, the generation process was faster than real time.

Step 3: Edit and Refine

[Image: Screenshot of how to edit subtitles]

Once generated, the subtitles show up on the track above your video layer. They can be edited and moved on the timeline as with any other clip.

To customize the look of the subtitles, select the clip and navigate to the Subtitle tab. Users can change the font, weight, size, fill, and background colors.

Step 4: Export

[Image: Screenshot of how to export a video]

To burn the subtitles into the video, export the clip. To generate an SRT file, select the drop-down menu to the right of the Subtitle Properties.
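If you ever need to build an SRT sidecar file yourself, say from a script’s timing data, the format is simple enough to write by hand. A minimal Python sketch, with made-up cue data (the function names here are my own, not part of any tool):

```python
def to_timestamp(seconds):
    """Format seconds as an SRT timestamp, e.g. 3.5 -> '00:00:03,500'."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(cues):
    """Build SRT text from a list of (start, end, text) tuples."""
    blocks = []
    for i, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{i}\n{to_timestamp(start)} --> {to_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)

# Example cues, invented for illustration.
cues = [
    (0.0, 2.5, "Why transcribe by hand?"),
    (2.5, 5.0, "Let the algorithm do it."),
]
print(to_srt(cues))
```

Save the output with a `.srt` extension and most players and platforms will accept it as a toggleable subtitle track.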


Tongue-Twisting the Tech

My first test of Runway’s subtitle tool was nearly flawless, with only one word mistakenly duplicated. Unlike YouTube’s auto-captioning feature, Runway adds punctuation, giving the subtitles a clean look. The results were impressive.

Indeed, I expected the results to be near perfect, as the audio quality from my professional microphone is quite good. There were no real background noises or distortions in the recording. I also didn’t use any slang and enunciated clearly in my neutral American accent.

To put the machine-learning tech to the test, I decided to record several tongue-twisters, as well as sentences with semi-difficult vocabulary. While the punctuation looks nice in Runway, this was one area where the algorithms couldn’t follow along.

In one instance, question marks were randomly peppered throughout a sentence. The tongue-twisters tripped up Runway’s subtitle generator on multiple occasions.

[Image: Screenshot of how the captions were misinterpreted]

Nevertheless, the tech did surprisingly well, overall.

[Image: Screenshot showing the captions interpreted correctly]

I’m interested to see how advanced this tech will be in just five years. I imagine the machine-learning algorithms will be able to generate subtitles in real-time and provide instant translation in nearly any language.

That’s what I love about technology. It keeps getting better.

Cover image via Vector Point Studio.