
Introduction to live streams with closed captions


Clevercast supports closed captions for live streams, if this is included in your plan.

You can add (multilingual) closed captions to your live stream in several ways:

  • Automatic captions through speech-to-text conversion (AI), with optional real-time correction: the audio in the live stream is automatically converted into closed captions through Artificial Intelligence (AI). Optionally, a human editor can correct the captions before they are shown in the video player (and before they are auto-translated into other languages).
  • Automatic multilingual captions through machine translation (AI): closed captions resulting from speech-to-text conversion and manual transcription can be automatically translated into closed captions in multiple languages.
  • Transcription in real time: humans can use Clevercast to type the transcription and/or translation in real time using a browser, resulting in closed captions in the video player.

Live streams with AI-generated captions have a slightly higher latency. This is necessary to render the closed captions as accurately as possible. The additional delay gives the AI engine the full context at the time of conversion, so it can interpret the words correctly and place them in a sentence. Without this delay, conversion would often have to occur before the speaker has finished a sentence, leading to incorrect choices of words and phrases. The delay also makes it possible to manually correct the captions before they are auto-translated or shown in the stream.

When using manual transcription, with or without auto-translation, there is only the normal HTTP Live Streaming delay of about 18-30 seconds.

All types of closed captions can be combined with real-time audio translations through Translate@Home. Any number of languages is possible.

Note that an event with closed captions currently has a maximum duration of 24 consecutive hours. The closed captions will keep appearing for 24 hours after you've set the event to Preview or Started. If your event spans multiple days, you should set the event to Inactive or Ended during breaks.
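The 24-hour limit can be checked against a planned schedule with a little date arithmetic. The sketch below is only an illustration: the function names are hypothetical and it does not call any Clevercast API. It assumes the window starts at the moment the event is set to Preview or Started, as described above.

```python
from datetime import datetime, timedelta, timezone

# Limit stated above: captions keep appearing for 24 hours after the
# event is set to Preview or Started.
CAPTION_WINDOW = timedelta(hours=24)


def captions_expire_at(activated_at: datetime) -> datetime:
    """Return the moment the closed captions stop appearing (hypothetical helper)."""
    return activated_at + CAPTION_WINDOW


def session_fits(activated_at: datetime, session_end: datetime) -> bool:
    """True if a session ending at session_end still falls inside the caption window."""
    return session_end <= captions_expire_at(activated_at)


# Example: event set to Preview on 1 May at 09:00 UTC.
activated = datetime(2024, 5, 1, 9, 0, tzinfo=timezone.utc)
day_one_end = datetime(2024, 5, 1, 17, 0, tzinfo=timezone.utc)
day_two_end = datetime(2024, 5, 2, 17, 0, tzinfo=timezone.utc)

print(session_fits(activated, day_one_end))  # first day fits in the window
print(session_fits(activated, day_two_end))  # second day does not
```

If the second check fails, as it does for the two-day schedule in this example, that is the case where the event should be set to Inactive or Ended during the break.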

Accuracy of closed captions

For closed captions resulting from speech-to-text conversion (directly or translated), Clevercast has a much higher accuracy than other solutions. There are two main factors that have an impact on the accuracy:

  • The speaker's language: accuracy is higher for more commonly used languages (e.g. English, Spanish, French, German, Italian, Portuguese, Japanese, Chinese, Dutch).
  • Multiple languages spoken: if more than one language is spoken in the live stream, the accuracy of the captions will be significantly lower. In this case, we strongly recommend using real-time correction to make sure the AI-generated captions are accurate.

If closed captions are generated through manual transcription, the accuracy depends on the human transcriber. If the transcription is good, the source of the automatic translations will also be good, which has a favorable impact on the accuracy of the auto-translated captions.

Speech-to-text: correction interface and stream delay

Our interface for real-time correction allows you to edit the AI-generated captions just before they are shown in the video player, so possible errors can still be corrected. See the Speech-To-Text Correction manual on how to use this interface.

Chat with transcription and correction rooms

Clevercast allows event managers to chat with transcribers or correctors while the event status is Preview, Started or Paused. On the event page, click the ‘Manage Language Rooms’ button. When you get to the page, first press the ‘Connect’ button to connect to the rooms and enter your name. You can then chat with people in separate rooms or send messages to all rooms at once.

The player on this page has the same delay as the video player in the transcription room (= no delay) or in the correction room (= delay after speech-to-text conversion). It currently doesn’t allow you to see closed captions, though. In order to check the closed captions you must use the Preview player on the event page, where the live stream with closed captions has a delay of about one minute.