AI Vocabularies

Introduction

AI captioning

Clevercast supports live stream captioning, using AI for automatic speech-to-text conversion and text-to-text translation. This lets you add closed captions to your live stream automatically. Optionally, a human editor can correct the captions before they are shown in the video player (and before they are auto-translated into other languages).

AI captioning settings

AI captioning is done largely automatically (see our getting started guide). On the 'Event' > 'Caption Languages' page, you choose the speech-to-text and text-to-text languages to be used for captioning your live stream.

On the 'Account' > 'Settings' page you can find general settings to filter profanity and disfluency:

  • Filter profanity: if set, the speech-to-text engine is instructed to filter out profanity (e.g. 'Bullshit' will become 'B*********'). Note that this filter is not applied if you have a mixed floor.

AI vocabularies

To improve the AI speech-to-text conversion and/or text-to-text translation, Clevercast lets you create AI captioning vocabularies. A vocabulary lets you add terms (names, acronyms, industry jargon, technical phrases ...) that may appear in the live stream, so they will be correctly rendered as part of the closed captions. A vocabulary can be updated before and during the live stream.

Changes made during the live stream are still applied to the AI captioning engine. Because the live stream is shown with a limited delay, it may take some time before the effect of such changes becomes visible in the captions.

When using a real-time corrector (optional), some of the corrections may also lead to the vocabulary being updated (to avoid a corrector having to make the same correction twice). See the correction manual for more info.

Types of vocabulary data

A vocabulary consists of a number of terms with different types of (meta)data. Which metadata types are available depends on the AI engine being used (a different engine is used if you select mixed floor) and on whether the event uses text-to-text translations.

Note: a term in the vocabulary may contain up to 160 characters. If you need more characters, you should probably split it up into several terms (more effective for AI). If you have longer terms that shouldn't be split up, please contact us.

Clear Words from AI Captioning Output

This function provides a list of words that will be filtered out from the speech-to-text output. You can use this, for example, to remove filler or hesitation words (e.g. 'uh', 'um', 'umh'). The words are removed regardless of capitalization, but the letters must match exactly.
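The matching rule described above (capitalization is ignored, letters must match exactly) can be illustrated with a minimal Python sketch. This is not Clevercast's implementation; the function name `clear_words`, the whole-word matching and the handling of a trailing comma are assumptions made for the example.

```python
import re

def clear_words(caption: str, words_to_clear: list[str]) -> str:
    """Remove the listed words from a caption (hypothetical helper).

    Matching ignores capitalization but requires the exact letters,
    so 'um' removes 'Um' and 'UM' but not 'umh' or 'umbrella'.
    """
    if not words_to_clear:
        return caption
    alternatives = "|".join(re.escape(word) for word in words_to_clear)
    # \b restricts matches to whole words; an optional comma and the
    # whitespace after the word are removed together with it (assumption).
    pattern = re.compile(rf"\b(?:{alternatives})\b,?\s*", re.IGNORECASE)
    cleaned = pattern.sub("", caption)
    # Collapse any double spaces left behind by the removal.
    return re.sub(r"\s{2,}", " ", cleaned).strip()

print(clear_words("Uh, I think um we should start", ["uh", "um", "umh"]))
```

In this sketch, adding 'um' to the list does not remove 'umh'; each spelling has to be listed separately, as in the example list above.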

Speech-to-text conversion: boost and sounds-like

Except for events with a mixed floor (multiple languages being spoken in the live stream), the vocabulary allows you to boost certain terms for better auditory recognition when converting speech to text. Additionally, a list of sounds-like terms lets you provide alternative pronunciations of the term.

Text-to-text conversion: custom translations

For each of the terms in the vocabulary, you can add a custom translation for each automatically translated closed caption language. For example, you can use this to prevent certain words from getting translated (e.g. the name of a brand or product).

Autocorrect words during post-processing

This allows you to specify multiple terms to be replaced by the vocabulary term. E.g. if the vocabulary term is Clevercast, you could specify clever cast and klever kast as search terms to make sure the company name is spelt correctly.

Keep the following in mind:

  • Vocabulary terms can only be capitalised in a single way. E.g. once you've added FFmpeg to the vocabulary, it is no longer possible to add ffmpeg or FFMpeg.
  • Search terms are case-insensitive. E.g. the search term ff mpeg will also cause FF mpeg and FF MPEG to be replaced.

The capitalisation of a vocabulary term is always applied exactly, regardless of its position in a sentence. The only exception is when a sentence starts with a lowercase term.

  • If a term contains capital letter(s), these letters are always capitalised, wherever the term appears in a sentence.
  • If a term starts with a lowercase letter, it will only be capitalised at the start of a sentence.
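The replacement and capitalisation rules above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the actual post-processing code: the function name `autocorrect` is hypothetical, a sentence is assumed to start at the beginning of the caption or after '.', '!' or '?', and the vocabulary term itself is treated as a search term so that its capitalisation is normalised too.

```python
import re

def autocorrect(caption: str, term: str, search_terms: list[str]) -> str:
    """Replace search terms (case-insensitive) with the vocabulary term."""
    # Try longer candidates first so overlapping search terms behave predictably.
    candidates = sorted({term, *search_terms}, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(map(re.escape, candidates)) + r")\b",
        re.IGNORECASE,
    )

    def replacement(match: re.Match) -> str:
        before = caption[:match.start()].rstrip()
        starts_sentence = before == "" or before[-1] in ".!?"
        # Only a term that starts with a lowercase letter is changed,
        # and only at the start of a sentence.
        if starts_sentence and term[0].islower():
            return term[0].upper() + term[1:]
        return term

    return pattern.sub(replacement, caption)

print(autocorrect("We stream with clever cast and Klever Kast.",
                  "Clevercast", ["clever cast", "klever kast"]))
print(autocorrect("E commerce grows. We love e commerce",
                  "e-commerce", ["e commerce"]))
```

The second call shows the exception: the lowercase term e-commerce is written as E-commerce only where it opens a sentence.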

Managing event vocabularies

A vocabulary is available for each event in Clevercast with AI captioning. You can find a link to it on the 'Caption Languages' tab of an Event (or in the Live Management section of a webinar).

The vocabulary page itself is protected by a secure token, which allows you to let external users edit the vocabulary (just pass them the full URL with the secure token).

A vocabulary consists of a list of terms with metadata for speech-to-text boosting, custom translations and search-and-replace terms (called autocorrect). But not every vocabulary will contain all of these sections:

  • speech-to-text boosting is not available for events with a mixed floor
  • custom translations are only available if your event contains text-to-text translation languages
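As a rough illustration of the two conditions above, the following Python sketch returns the sections a vocabulary would contain. The function and section names are hypothetical, and it assumes that autocorrect is always available because the text lists no condition for it.

```python
def available_sections(mixed_floor: bool, translation_languages: list[str]) -> list[str]:
    """Return the vocabulary sections for an event (illustrative only)."""
    sections = ["autocorrect"]  # assumption: no condition is listed for autocorrect
    if not mixed_floor:
        # Boosting is not available for events with a mixed floor.
        sections.append("speech-to-text boosting")
    if translation_languages:
        # Custom translations require at least one text-to-text language.
        sections.append("custom translations")
    return sections

print(available_sections(mixed_floor=True, translation_languages=["fr", "de"]))
```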

The vocabulary overview

You can add new terms to a vocabulary at any time. If the event has already started, they will still be applied.

By clicking on the 'Edit' link, you can change the metadata for a certain term. If you need to change the term itself, you must delete the term and create a new one. The 'Clear Vocabulary' button lets you remove all terms from an existing vocabulary.

Keep in mind that a term can only exist once in a vocabulary. You won't be able to add a term that differs only in lower and upper case from an already existing term.
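The uniqueness rule, together with the 160-character limit mentioned earlier, could be checked as in the Python sketch below. The `Vocabulary` class and its error messages are hypothetical and only model the rules stated in this manual.

```python
MAX_TERM_LENGTH = 160  # limit stated in this manual

class Vocabulary:
    """Minimal model of a vocabulary's term list (illustrative only)."""

    def __init__(self) -> None:
        # Maps the case-folded term to the term as it was entered.
        self._terms: dict[str, str] = {}

    def add_term(self, term: str) -> None:
        if len(term) > MAX_TERM_LENGTH:
            raise ValueError(f"term exceeds {MAX_TERM_LENGTH} characters")
        key = term.casefold()
        if key in self._terms:
            # A term that differs only in upper and lower case is a duplicate.
            raise ValueError(f"term already exists as {self._terms[key]!r}")
        self._terms[key] = term

    def terms(self) -> list[str]:
        return list(self._terms.values())

vocabulary = Vocabulary()
vocabulary.add_term("FFmpeg")
try:
    vocabulary.add_term("ffmpeg")
except ValueError as error:
    print(error)
```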

Adding and removing text-to-text translation languages

We recommend that you first set the speech-to-text language and add all text-to-text languages for your event, before starting to edit its vocabulary.

  • If you add an extra text-to-text language to the Event settings, any existing terms with custom translations will also get a custom translation for the new language (= the term itself will be set as its custom translation for the new language).
  • If you delete a text-to-text language from the Event's settings, all custom translations for that language will automatically be deleted from the vocabulary.
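The effect of adding or deleting a language can be sketched in Python as follows. The data layout (a dictionary that maps each term to its custom translations per language code) and the function names are assumptions made for the example.

```python
def add_language(custom_translations: dict[str, dict[str, str]], language: str) -> None:
    """Model adding a text-to-text language to the event settings."""
    for term, translations in custom_translations.items():
        if translations:
            # Only terms that already have custom translations get one for
            # the new language: the term itself is used as its translation.
            translations.setdefault(language, term)

def remove_language(custom_translations: dict[str, dict[str, str]], language: str) -> None:
    """Model deleting a text-to-text language from the event settings."""
    for translations in custom_translations.values():
        translations.pop(language, None)

vocabulary = {
    "Clevercast": {"fr": "Clevercast"},  # has a custom translation
    "latency": {},                       # no custom translations
}
add_language(vocabulary, "de")
remove_language(vocabulary, "fr")
print(vocabulary)
```

After both calls, the custom translation for the deleted language is gone and cannot be restored by adding the language again, which is why exporting the vocabulary first is useful.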

If you have already created a vocabulary and need to change the event's languages (e.g. set a different speech-to-text language), you can avoid (meta)data loss by exporting the vocabulary. After you have changed the languages, you can then import the vocabulary again.

Exporting and importing vocabularies

Account vocabularies

Use the 'Account' > 'AI Vocabularies' menu to create and manage account vocabularies. Their only purpose is to store vocabulary terms and metadata, so you can reuse them in multiple events.

An account vocabulary has a name, a speech-to-text language and (optionally) text-to-text language(s). These can be changed at any time.

  • Changing the name or speech-to-text language does not affect the terms in the vocabulary.
  • Changing the text-to-text language(s) has the same impact on custom translations as with an event vocabulary (see above).

You can add and edit terms in the same way as for event vocabularies.

Vocabulary export

A Clevercast admin can export an event vocabulary to an account vocabulary. This can be done in two different ways.

  • By creating a new account vocabulary which is completely identical to the event vocabulary, including the speech-to-text and text-to-text languages. To do this, fill in the name of the new account vocabulary.
  • By merging the terms of the event vocabulary into an existing account vocabulary. To do this, select the account vocabulary (to which you want to add new terms).

If you want to merge the terms into an existing account vocabulary, the account vocabulary should preferably contain the same text-to-text translation languages. If this is the case, the export process is simple:

  • terms that don't exist in the account vocabulary will be added, including their metadata.
  • terms that already exist in the account vocabulary remain unchanged. The metadata of these terms is not updated.

If the account vocabulary contains text-to-text languages that are not in the event vocabulary, a warning will be displayed. If you decide to go ahead, only custom translations for corresponding languages are added (for a language that is missing, the term itself is set as custom translation).
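The merge behaviour described above can be summarised in a short Python sketch. It is an illustration, not the actual export code: the function name `merge_terms`, the metadata layout with a 'translations' key, and the case-insensitive check for existing terms are assumptions.

```python
import copy

def merge_terms(source: dict[str, dict], target: dict[str, dict],
                target_languages: list[str]) -> list[str]:
    """Merge source terms into target in place; return the added terms."""
    existing = {term.casefold() for term in target}
    added = []
    for term, metadata in source.items():
        if term.casefold() in existing:
            continue  # existing terms keep their own metadata
        entry = copy.deepcopy(metadata)
        source_translations = metadata.get("translations", {})
        # Keep translations for languages present in the target; for a
        # language the source lacks, the term itself is the translation.
        entry["translations"] = {
            language: source_translations.get(language, term)
            for language in target_languages
        }
        target[term] = entry
        existing.add(term.casefold())
        added.append(term)
    return added

event = {
    "Clevercast": {"translations": {"fr": "Clevercast", "nl": "Clevercast"}},
    "latency": {"translations": {"fr": "latence"}},
}
account = {"latency": {"translations": {"fr": "délai", "de": "Latenz"}}}
print(merge_terms(event, account, ["fr", "de"]))
```

In this example only Clevercast is added; the existing latency entry in the account vocabulary keeps its own translations.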

A warning is also shown if the speech-to-text language differs. However, this has no effect on the export process: all terms in the event vocabulary are exported, regardless of the speech-to-text language.

Vocabulary import

The terms of an account vocabulary can be imported into an event vocabulary. Ideally, the imported vocabulary should have the same speech-to-text and text-to-text languages as the event vocabulary. If not, the import dialog will display a warning.

The import process works in the same way as the export process:

  • terms that don't exist in the event vocabulary are added, including their metadata. This also happens if the speech-to-text languages differ.
  • terms that already exist in the event vocabulary remain unchanged. The metadata of these terms is not updated.

Like the export process, an import will only add custom translations for languages that are available in both vocabularies. If the event vocabulary contains text-to-text languages that are not in the account vocabulary, the term itself is set as custom translation for those languages.
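To see in advance what a language mismatch means for an import, a check along the following lines can help. This Python sketch is hypothetical: the function name and the message texts are invented for the example and do not reflect the wording of the actual import dialog.

```python
def language_warnings(source_stt: str, source_ttt: list[str],
                      target_stt: str, target_ttt: list[str]) -> list[str]:
    """List the language differences between two vocabularies."""
    warnings = []
    if source_stt != target_stt:
        # Terms are still imported when the speech-to-text languages differ.
        warnings.append(f"speech-to-text language differs: {source_stt} vs {target_stt}")
    missing = sorted(set(target_ttt) - set(source_ttt))
    if missing:
        warnings.append("term itself used as translation for: " + ", ".join(missing))
    unused = sorted(set(source_ttt) - set(target_ttt))
    if unused:
        warnings.append("custom translations not imported for: " + ", ".join(unused))
    return warnings

# Account vocabulary (en; fr, es) imported into an event vocabulary (nl; fr, de).
for warning in language_warnings("en", ["fr", "es"], "nl", ["fr", "de"]):
    print(warning)
```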

We strongly recommend that you always open the event vocabulary afterwards and go over the imported terms to check that everything is correct.