How to use AI Vocabularies
See our video tutorial for a quick overview.
Introduction
AI captioning
Clevercast supports the captioning of live streams, by using AI for automatic speech-to-text conversion and text-to-text translations. This way, you can automatically add closed captions to your live stream. Optionally, the captions can be corrected by a human editor before they are shown in the video player (and auto-translated into other languages).
AI captioning settings
AI captioning is done largely automatically (see our getting started guide). On the 'Event' > 'Caption Languages' page, you choose the speech-to-text and text-to-text languages to be used for captioning your live stream.
On the 'Account' > 'Settings' page you can find general settings to filter profanity and disfluency:
- Filter profanity: if set, the speech-to-text engine will be instructed to filter out profanity (e.g. 'Bullshit' will become 'B*********'). Note that this filter will not be applied if you set the 'Use language model for alternating languages (lower accuracy)' option while selecting your speech-to-text language.
AI vocabularies
To improve the AI speech-to-text conversion and/or text-to-text translation, Clevercast lets you create an AI Vocabulary for each event. A vocabulary lets you add terms (names, acronyms, industry jargon, technical phrases ...) that may appear in the live stream, so they will be correctly rendered as part of the closed captions. A vocabulary can be updated before and during the live stream.
Unless multiple languages are spoken in the live stream, you only need to create a vocabulary for the speech-to-text language. Otherwise, you can also create vocabularies for the languages that you will set as speech-to-text language during the live stream. See the multiple speech-to-text languages section below for details.
Changes made during the live stream will still be applied to the AI captioning engine. Note that since the live stream has a limited delay, it may take some time before you see changes made during the live stream.
When using a real-time corrector (optional), some of the corrections may also lead to the vocabulary being updated (to avoid a corrector having to make the same correction twice). See the correction manual for more info.
Types of vocabulary data
A vocabulary consists of a number of terms with different types of (meta)data. Which metadata types are available depends on whether the event uses text-to-text translations (without them, there are no custom translations) and on whether you select the 'Use language model for alternating languages' option (with it, there is no boost option).
Note: a term in the vocabulary may contain up to 160 characters. If you need more characters, you should probably split it up into several terms (more effective for AI). If you have longer terms that shouldn't be split up, please contact us.
Clear Words from AI Captioning Output
This function provides a list of words that will be filtered out from the speech-to-text output. You can use this, for example, to remove filler or hesitation words (e.g. 'uh', 'um', 'umh'). The words are removed regardless of capitalization, but the letters must match exactly.
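The matching rule described above (capitalization ignored, letters must match exactly) can be illustrated with a minimal Python sketch. This is not Clevercast's implementation: the function name `clear_words` and the choice to also drop a comma directly after a removed word are assumptions made for illustration.

```python
import re

def clear_words(caption: str, words_to_clear: list[str]) -> str:
    """Remove the listed words from a caption (hypothetical helper).

    Matching ignores capitalization but requires the exact letters,
    so 'um' does not match inside 'umbrella'.
    """
    if not words_to_clear:
        return caption
    alternatives = "|".join(re.escape(word) for word in words_to_clear)
    # \b keeps matches on whole words; a trailing comma and whitespace
    # are dropped as well (an assumption, to keep the output tidy).
    pattern = re.compile(rf"\b(?:{alternatives})\b,?\s*", re.IGNORECASE)
    return pattern.sub("", caption).strip()

print(clear_words("Uh, I think um we should start", ["uh", "um", "umh"]))
# prints "I think we should start"
```

Because whole words are matched, a caption such as "The umbrella" is left untouched when 'um' is on the list.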
Speech-to-text conversion: boost and sounds-like
The vocabulary allows you to boost certain terms for better auditory recognition when converting speech to text. Additionally, a list of sounds-like terms lets you provide alternative pronunciations of a term.
Note: if you select 'Use language model for alternating languages' for your speech-to-text language, these options are not available.
Autocorrect words after speech-to-text conversion
This allows you to specify multiple search terms to be replaced by the vocabulary term. E.g. if the vocabulary term is 'Clevercast', you could specify 'clever cast' and 'klever kast' as search terms to make sure the company name is spelt correctly.
Keep the following in mind:
- Vocabulary terms can only be capitalised in a single way. E.g. once you've added 'FFmpeg' to the vocabulary, it is no longer possible to add 'ffmpeg' or 'FFMpeg'.
- Search terms are case-insensitive. E.g. the search term 'ff mpeg' will also result in 'FF mpeg' and 'FF MPEG' being replaced.
The capitalisation of a vocabulary term is always applied exactly, regardless of its position in a sentence. The only exception is when a sentence starts with a lowercase term.
- If a term contains capital letter(s), they will always be capitalised no matter where in the sentence.
- If a term starts with a lowercase letter, it will only be capitalised at the start of a sentence.
Autocorrection takes place before translation: this way, any corrections will also be part of the source of the text-to-text translation.
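As an illustration of the autocorrect rules in this section, here is a minimal Python sketch. It is not Clevercast's implementation: the names `autocorrect` and `_render`, the dictionary layout (vocabulary term mapped to its search terms) and the way a sentence start is detected (start of the caption, or after '.', '!' or '?') are assumptions.

```python
import re

def _render(term: str, text: str, start: int) -> str:
    """Return the term as it should appear at position `start` in `text`."""
    before = text[:start].rstrip()
    at_sentence_start = before == "" or before[-1] in ".!?"
    # Only a term that starts with a lowercase letter is capitalised,
    # and only at the start of a sentence.
    if at_sentence_start and term[0].islower():
        return term[0].upper() + term[1:]
    return term

def autocorrect(caption: str, vocabulary: dict[str, list[str]]) -> str:
    """Replace case-insensitive search terms by their vocabulary term."""
    for term, search_terms in vocabulary.items():
        for search in search_terms:
            pattern = re.compile(rf"\b{re.escape(search)}\b", re.IGNORECASE)
            caption = pattern.sub(
                lambda m, t=term: _render(t, m.string, m.start()), caption
            )
    return caption

vocabulary = {
    "Clevercast": ["clever cast", "klever kast"],
    "FFmpeg": ["ff mpeg"],
}
print(autocorrect("we use klever kast and FF MPEG", vocabulary))
# prints "we use Clevercast and FFmpeg"
```

With a lowercase vocabulary term such as 'livestream' (search term 'live stream'), the same sketch turns "live stream quality matters. our live stream is ready" into "Livestream quality matters. our livestream is ready".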
Text-to-text conversion: custom translations
For each of the terms in the vocabulary, you can add a custom translation for each automatically translated closed caption language. For example, you can use this to prevent certain words from getting translated (e.g. the name of a brand or product).
Managing event vocabularies
A vocabulary is available for each event in Clevercast with AI captioning. You can find a link to it on the 'Caption Languages' tab of an Event (or in the Live Management section of a webinar).
The vocabulary page itself is protected by a secure token, which allows you to let external users edit the vocabulary (just pass them the full URL with the secure token).
A vocabulary consists of a list of terms with metadata for speech-to-text boosting, custom translations and search-and-replace terms (called autocorrect). But not every vocabulary will contain all of these sections:
- speech-to-text boosting is not available for events using the 'Use language model for alternating languages' option
- custom translations are only available if your event contains text-to-text translation languages
You can add new terms to a vocabulary at any time. If the event has already started, they will still be applied.
By clicking on the 'Edit' link, you can change the metadata for a certain term. If you need to change the term itself, you must delete the term and create a new one. The 'Clear Vocabulary' button lets you remove all terms from an existing vocabulary.
Keep in mind that a term can only exist once in a vocabulary. You won't be able to add a term that differs only in lower and upper case from an already existing term.
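The uniqueness rule above, together with the 160-character limit mentioned earlier in this guide, can be sketched as follows. The `Vocabulary` class and its methods are hypothetical and only illustrate the checks; how Clevercast reports a rejected term is not covered here.

```python
MAX_TERM_LENGTH = 160  # character limit per term, as stated in this guide

class Vocabulary:
    """Hypothetical in-memory vocabulary that enforces unique terms."""

    def __init__(self) -> None:
        # Maps the case-folded term to the term as it was first added.
        self._terms: dict[str, str] = {}

    def add_term(self, term: str) -> bool:
        """Add a term; return False if it is empty, too long or a duplicate."""
        if not term or len(term) > MAX_TERM_LENGTH:
            return False
        key = term.casefold()
        if key in self._terms:
            # Differs only in lower and upper case from an existing term.
            return False
        self._terms[key] = term
        return True

    def terms(self) -> list[str]:
        return list(self._terms.values())

vocabulary = Vocabulary()
print(vocabulary.add_term("FFmpeg"))  # prints True
print(vocabulary.add_term("ffmpeg"))  # prints False
```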
Events with multiple speech-to-text languages
In most cases, all speakers will speak the same language during a live stream. In that case, it is sufficient to create a vocabulary for the (single) speech-to-text language.
Sometimes a live stream has speakers in different languages. Clevercast then allows you to change the speech-to-text language during the live stream. To prepare for this, use the 'Change Language' dropdown button at the top of the Vocabulary page. It lets you edit the vocabulary for a text-to-text translation language that will be set as speech-to-text language at some point during the live stream. When that switch happens, Clevercast automatically uses this language's vocabulary.
For example, by selecting 'French', you will be able to edit the vocabulary terms that will be used when the speech-to-text language is changed to French during the event. You can add terms to the vocabulary that only apply for French speech-to-text conversion (= terms without custom translations) or terms that apply for speech-to-text conversion in all languages (= terms with custom translations). Terms that include text-to-text translations are automatically added to the vocabulary of every language.
Please note: editing the vocabulary for multiple languages is only necessary when you are planning to manually update the speech-to-text language during a live stream. It has no impact if the speech-to-text language is not updated during the event, even if you have selected the 'Use language model for alternating languages' option.
Adding and removing text-to-text translation languages
We recommend that you first set the speech-to-text language and add all text-to-text languages for your event, before starting to edit its vocabulary.
- If you add an extra text-to-text language to the Event settings, any existing terms with custom translations will also get a custom translation for the new language (= the term itself will be set as its custom translation for the new language).
- If you delete a text-to-text language from the Event's settings, all custom translations for that language will automatically be deleted from the vocabulary.
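The effect of adding or deleting a text-to-text language on custom translations can be sketched in Python. This is a simplified model, not Clevercast's data format: the vocabulary is assumed to be a dictionary mapping each term to its custom translations per language, and the language codes 'fr' and 'de' and the function names are illustrative.

```python
def add_language(vocabulary: dict[str, dict[str, str]], language: str) -> None:
    """Give terms that already have custom translations one for `language`."""
    for term, translations in vocabulary.items():
        if translations:
            # The term itself becomes the custom translation for the new language.
            translations.setdefault(language, term)

def remove_language(vocabulary: dict[str, dict[str, str]], language: str) -> None:
    """Delete all custom translations for `language`."""
    for translations in vocabulary.values():
        translations.pop(language, None)

vocabulary = {
    "Clevercast": {"fr": "Clevercast"},  # term with a custom translation
    "FFmpeg": {},                        # term without custom translations
}
add_language(vocabulary, "de")
print(vocabulary["Clevercast"])  # prints {'fr': 'Clevercast', 'de': 'Clevercast'}
remove_language(vocabulary, "fr")
print(vocabulary["Clevercast"])  # prints {'de': 'Clevercast'}
```

Terms without custom translations, such as 'FFmpeg' in this sketch, are left unchanged by both operations.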
If you have already created a vocabulary and need to change the event's languages (e.g. set a different speech-to-text language), you can avoid (meta)data loss by exporting the vocabulary. After you have changed the languages, you can then import the vocabulary again.
Exporting and importing vocabularies
Account vocabularies
Use the 'Account' > 'AI Vocabularies' menu to create and manage account vocabularies. Their only purpose is to store vocabulary terms and metadata, so you can reuse them in multiple events.
An account vocabulary has a name, a speech-to-text language and (optionally) text-to-text language(s). These can be changed at any time.
- Changing the name or speech-to-text language does not affect the terms in the vocabulary.
- Changing the text-to-text language has the same impact on custom translations as with an event vocabulary (see above).
You can add and edit terms in the same way as for event vocabularies.
Vocabulary export
A Clevercast admin can export an event vocabulary to an account vocabulary. This can be done in two different ways.
- By creating a new account vocabulary which is completely identical to the event vocabulary, including the speech-to-text and text-to-text languages. To do this, fill in the name of the new account vocabulary.
- By merging the terms of the event vocabulary into an existing account vocabulary. To do this, select the account vocabulary (to which you want to add new terms).
If you want to merge the terms into an existing account vocabulary, the account vocabulary should preferably contain the same text-to-text translation languages. If this is the case, the export process is simple:
- terms that don't exist in the account vocabulary will be added, including their metadata.
- terms that already exist in the account vocabulary remain unchanged. The metadata of these terms is not updated.
The speech-to-text language must be the same in the event and account vocabularies. If necessary, you can (temporarily) change the speech-to-text language in the account vocabulary, and change it back after the export has been completed.
If the account vocabulary contains text-to-text languages that are not in the event vocabulary, a warning will be displayed. If you decide to go ahead, only custom translations for corresponding languages are added (for a language that is missing, the term itself is set as custom translation).
Vocabulary import
The terms of an account vocabulary can be imported into an event vocabulary.
The import process works in the same way as the export process:
- terms that don't exist in the event vocabulary are added, including their metadata. This also happens if the speech-to-text languages differ.
- terms that already exist in the event vocabulary remain unchanged. The metadata of these terms is not updated.
The speech-to-text language must be the same in the event and account vocabularies. If necessary, you can (temporarily) change the speech-to-text language in the account vocabulary, and change it back after the import has been completed.
Ideally, the imported vocabulary should have the same text-to-text languages as the event vocabulary. If not, the import dialog will display a warning. Like the export process, an import will only add custom translations for languages that are available in both vocabularies. If the event vocabulary contains text-to-text languages that are not in the account vocabulary, the term itself is set as custom translation for those languages.
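The merge rules that import and export share can be summarised in a short Python sketch. It is a simplified model rather than Clevercast's implementation: vocabularies are assumed to be dictionaries mapping each term to its custom translations per language, existing terms are assumed to be detected case-insensitively (in line with the uniqueness rule for terms), and terms without custom translations are assumed to stay without them. The name `merge_terms` and the language codes are illustrative.

```python
def merge_terms(
    source: dict[str, dict[str, str]],
    target: dict[str, dict[str, str]],
    target_languages: list[str],
) -> list[str]:
    """Merge source terms into target in place; return the terms that were added."""
    existing = {term.casefold() for term in target}
    added = []
    for term, translations in source.items():
        if term.casefold() in existing:
            continue  # existing terms and their metadata remain unchanged
        if translations:
            # Keep translations for languages both vocabularies have; for a
            # language missing in the source, use the term itself.
            target[term] = {
                lang: translations.get(lang, term) for lang in target_languages
            }
        else:
            target[term] = {}
        existing.add(term.casefold())
        added.append(term)
    return added

account = {
    "Clevercast": {"fr": "Clevercast", "es": "Clevercast"},
    "FFmpeg": {"fr": "FFmpeg"},
}
event = {"ffmpeg": {"fr": "ffmpeg", "de": "ffmpeg"}}
print(merge_terms(account, event, ["fr", "de"]))  # prints ['Clevercast']
print(event["Clevercast"])  # prints {'fr': 'Clevercast', 'de': 'Clevercast'}
```

In this example the Spanish translation is dropped because the event has no Spanish captions, the German translation falls back to the term itself, and 'FFmpeg' is skipped because the event vocabulary already contains 'ffmpeg'.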
We strongly recommend always opening the event vocabulary afterwards and going over the imported terms to check that everything is correct.