How to use Vocabularies
This section is part of the AI Live Streaming Manual.
Introduction
To improve the AI generation of closed captions and speech translations, Clevercast lets you create a vocabulary for each event. A vocabulary lets you add terms (names, acronyms, industry jargon, technical phrases ...) that may appear in the live stream, so they will be correctly rendered and pronounced. A vocabulary can be updated before and during the live stream.
Changes made during the live stream will still be applied to the AI captioning engine. Note that since the live stream is served with a delay, you won't immediately see or hear the effect of vocabulary changes in the video player.
When the event has a real-time corrector (optional), some of the corrections may also lead to the vocabulary being updated (to avoid a corrector having to make the same correction twice). See the correction manual for more info.
Managing event vocabularies
The Vocabulary interface
A vocabulary is attached to each event in Clevercast with AI captions or speech. You can find a link to it on the 'Caption Languages' tab of the Event page (or in the Live Management section for a webinar).
A vocabulary consists of a number of terms with different types of (meta)data, to make sure the terms are correctly displayed, translated and pronounced.
Access to the vocabulary page is protected by a secure token, which allows you to let external users edit the vocabulary (just pass them the full URL with the secure token).
Terms can be added to a vocabulary at any time. If the event has already started, they will still be applied.
By clicking on the 'Edit' link, you can change the metadata for a certain term. If you need to change the term itself, you must delete the term and create a new one. The 'Clear Vocabulary' button lets you remove all terms from an existing vocabulary.
Keep in mind that a term can only exist once in a vocabulary. You won't be able to add a term that differs only in lower and upper case from an already existing term.
Types of vocabulary data
A vocabulary consists of a list of terms with metadata for helping a language model during speech-to-text conversion and translation, and auto-correcting wrong terms afterwards.
Note: a term in the vocabulary may contain up to 160 characters. If you need more characters, you should probably split it up into several terms (more effective for AI).
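The two constraints mentioned so far (a term can only exist once regardless of upper and lower case, and a term may contain up to 160 characters) can be illustrated with a minimal Python sketch. This only models the documented rules and is not Clevercast's implementation; the function name `can_add_term` and the list-based vocabulary are hypothetical.

```python
MAX_TERM_LENGTH = 160  # limit stated in the note above


def can_add_term(existing_terms, new_term):
    """Return (ok, reason) for a candidate term.

    Hypothetical helper: it models the documented rules only.
    """
    term = new_term.strip()
    if not term:
        return False, "empty term"
    if len(term) > MAX_TERM_LENGTH:
        return False, "term is longer than 160 characters"
    # A term that differs only in upper/lower case counts as a duplicate.
    existing_folded = {t.casefold() for t in existing_terms}
    if term.casefold() in existing_folded:
        return False, "term already exists (case-insensitive match)"
    return True, "ok"
```

With 'FFmpeg' already in the vocabulary, `can_add_term(["FFmpeg"], "ffmpeg")` is rejected as a duplicate, while `can_add_term(["FFmpeg"], "Clevercast")` is accepted.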
Term or phrase
A term can be a word or phrase. In general, it is best to add separate words, unless words are commonly used together (e.g. 'Raspberry Pi', 'REST API').
For unknown personal names, we recommend adding both the last name and the combination of first and last name as a separate term (e.g. both 'Mikkelsen' and 'Barney Mikkelsen'). If a first name is rare or has a specific pronunciation, it too can be added as a term.
TIP: if your event has a list of speakers, make sure to add them to the vocabulary (for example: 'Barney', 'Mikkelsen' and 'Barney Mikkelsen')
Adding (complete or partial) sentences is not useful, as there is little chance that the speaker will utter the exact same sentence. Even for a scripted event, the speech-to-text conversion may result in sentences with a different structure.
Note: AI models are aware of well-known geographical names, major brands, names of public figures, etc. So, it makes little sense to add a list of European countries to a vocabulary. However, it's best to add the name of a local municipality or its mayor (for example).
Speech-to-text conversion: boost and sounds-like
The vocabulary allows you to boost certain terms for a better auditory recognition when converting speech to text. For most terms that differ from everyday language it is advisable to 'boost' them. If you see that a term is wrongly converted (e.g. during testing or while correcting) you could pass the wrong term along as a 'sounds-like' term.
Additionally, a list of terms that sound alike lets you provide alternative pronunciations of the term, so the AI can recognize it. This is mostly useful when a speaker (with an accent) is pronouncing names (e.g. local brand names, personal names).
Note: for some lesser-used languages, these options may not be available.
Autocorrect words after speech-to-text conversion
This allows you to specify multiple terms to be replaced by the vocabulary term. E.g. if the vocabulary term is 'Clevercast', you could specify 'clever cast' and 'clevercat' as search terms to make sure the company name is spelt correctly, and the correct letter(s) are capitalized.
Keep the following in mind:
- Vocabulary terms can only be capitalised in a single way. E.g. once you've added 'FFmpeg' to the vocabulary, it is no longer possible to add 'ffmpeg' or 'FFMpeg'.
- Search terms are case-insensitive. E.g. the search term 'ff mpeg' will also result in 'FF mpeg' and 'FF MPEG' being replaced.
The capitalisation of a vocabulary term is always applied exactly, regardless of its position in a sentence. The only exception is when a sentence starts with a lowercase term.
- If a term contains capital letter(s), they will always be capitalised no matter where in the sentence.
- If a term starts with a lowercase letter, it will only be capitalised at the start of a sentence.
Autocorrection takes place before translation: this way, any corrections will also be part of the source of the text-to-text translation.
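The autocorrect rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not Clevercast's actual engine: the function name `autocorrect` is hypothetical, the sketch assumes that the vocabulary term itself is also matched (so its capitalisation gets fixed), and it uses a simple end-of-sentence test based on '.', '!' and '?'.

```python
import re


def autocorrect(text, term, search_terms):
    """Replace search terms (case-insensitive) with the vocabulary term."""
    # Assumption: the term itself is also a search candidate.
    # Longest candidates first, so a longer phrase wins over its prefix.
    candidates = sorted({term, *search_terms}, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(c) for c in candidates) + r")(?!\w)",
        re.IGNORECASE,
    )

    def replace(match):
        # Look at the original text before the match to detect a sentence start.
        before = text[: match.start()].rstrip()
        at_sentence_start = before == "" or before[-1] in ".!?"
        # A term that starts with a lowercase letter is only capitalised
        # at the start of a sentence; otherwise it is applied exactly.
        if term[0].islower() and at_sentence_start:
            return term[0].upper() + term[1:]
        return term

    return pattern.sub(replace, text)
```

For example, `autocorrect("we use clever cast and CLEVERCAT daily.", "Clevercast", ["clever cast", "clevercat"])` returns `"we use Clevercast and Clevercast daily."`. With a hypothetical lowercase term 'kubectl' and search term 'kube control', the input `"kube control is installed. Then run Kube Control."` becomes `"Kubectl is installed. Then run kubectl."`.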
Text-to-text conversion: custom translations
For each of the terms in the vocabulary, you can add a custom translation in each of the languages.
You can use this, for example, to prevent the language model from translating certain words, such as the name of a brand or product. If a brand is called 'Cherry Garden', the language model shouldn't try to translate it.
Additional features
Clear Words from AI Captioning Output
This function lets you provide a list of words that will be filtered out from the speech-to-text output. You can use this, for example, to remove filler or hesitation words (e.g. 'uh', 'um', 'umh'). The words are removed regardless of capitalization, but the letters must match exactly.
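The matching rule described above (capitalisation is ignored, but the letters must match exactly) could look like this in Python. It is a minimal sketch with a hypothetical function name `clear_words`; how Clevercast actually handles whitespace and punctuation around removed words is not documented here, so the whitespace clean-up is an assumption.

```python
import re


def clear_words(text, words):
    """Remove whole-word matches of `words`, ignoring case."""
    if not words:
        return text
    alternatives = "|".join(
        re.escape(w) for w in sorted(words, key=len, reverse=True)
    )
    # Lookarounds ensure only whole words match: 'uh' does not match 'uhh'.
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)
    cleaned = pattern.sub("", text)
    # Assumption: collapse the extra whitespace left behind by removed words.
    return re.sub(r"\s{2,}", " ", cleaned).strip()
```

With the words 'uh', 'um' and 'umh', the input `"Uh we um have UMH a plan"` becomes `"we have a plan"`, while `"The uhh sound"` is left unchanged when only 'uh' is listed.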
Import Terms from Text Analysis
This feature uses AI to analyze a text you provide and retrieve terms that are relevant for a vocabulary, like names, brands, abbreviations and technical terms. Just copy a relevant text to the input field and press Submit.
A list of the identified terms is displayed, with translations. You can then choose which of these terms should be added to the vocabulary.
When you then press Submit, Clevercast adds the checked terms to the vocabulary.
Vocabularies with multiple speech-to-text languages
In most cases, all speakers will speak the same language during a live stream. In that case, it is sufficient to create a vocabulary for the (single) speech-to-text language.
In some cases, a live stream has speakers in different languages. In that case, Clevercast allows you to change the speech-to-text language during the live stream. For such cases, the 'Change Language' dropdown button at the top of the vocabulary page can be used. It lets you set the vocabulary for a text-to-text translation language that will become the speech-to-text language at some point during the live stream. When this happens, this language's vocabulary will automatically be used by Clevercast.
To add terms for different speech-to-text languages, change the speech-to-text language of the vocabulary.
You will see that existing terms with translations are already included. You can now add terms for the selected speech-to-text conversion language, with or without translations.
For example, if a French speaker takes the floor during the live stream, adding terms for French in this way helps ensure correct captioning and pronunciation.
Adding and removing event languages
We recommend that you first select all AI caption and speech languages, before editing the vocabulary. If you change the AI languages afterwards, it will affect the vocabulary.
Adding an extra AI language
- The language is added to the vocabulary.
- If a term has translations, the term itself will be added as the translation for this new language.
Deleting an existing AI language
- The language is removed from the vocabulary (including all translations to this language).
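The effect of adding or deleting an AI language on the vocabulary can be modelled with a small Python sketch. The dictionary layout (a `languages` list plus a `terms` mapping from each term to its translations) and the function names are hypothetical; they only illustrate the behaviour described above.

```python
def add_language(vocabulary, language):
    """Add a language to a vocabulary (modified in place)."""
    if language in vocabulary["languages"]:
        return
    vocabulary["languages"].append(language)
    for term, translations in vocabulary["terms"].items():
        # Only terms that already have translations get an entry for the
        # new language; the term itself is used as the translation.
        if translations:
            translations[language] = term


def remove_language(vocabulary, language):
    """Remove a language and every translation into it."""
    if language in vocabulary["languages"]:
        vocabulary["languages"].remove(language)
    for translations in vocabulary["terms"].values():
        translations.pop(language, None)
```

Note that deleting a language discards its translations for good in this model: adding the same language again would only restore the term itself as translation.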
Changing the speech-to-text language
Clevercast currently only allows you to change the default speech-to-text language if you first remove all other AI caption and speech languages. Doing so would cause you to lose the entire contents of your vocabulary.
If you want to preserve the content of your vocabulary, do the following:
- export the event vocabulary to a (new) account vocabulary (see below)
- in the account vocabulary, change the speech-to-text language (and update terms for this new language, if necessary)
- reconfigure the event: delete all existing AI languages and create the new ones
- import the account vocabulary into the event vocabulary
Exporting and importing vocabularies
Account vocabularies
Use the 'Account' > 'AI Vocabularies' menu to create and manage account vocabularies. Their only purpose is to store vocabulary terms and metadata, so you can reuse them in multiple events.
An account vocabulary has a name, a speech-to-text language and (optionally) text-to-text language(s). These can be changed at any time.
- Changing the name or speech-to-text language does not affect the terms in the vocabulary.
- Changing the text-to-text language has the same impact on custom translations as with an event vocabulary (see above)
You can add and edit terms in the same way as for event vocabularies.
Vocabulary export
A Clevercast admin can export an event vocabulary to an account vocabulary. This can be done in two different ways.
- By creating a new account vocabulary which is completely identical to the event vocabulary, including the speech-to-text and text-to-text languages. To do this, fill in the name of the new account vocabulary.
- By merging the terms of the event vocabulary into an existing account vocabulary. To do this, select the account vocabulary (to which you want to add new terms).
If you want to merge the terms into an existing account vocabulary, the account vocabulary should preferably contain the same text-to-text translation languages. If this is the case, the export process is simple:
- terms that don't exist in the account vocabulary will be added, including their metadata.
- terms that already exist in the account vocabulary remain unchanged. The metadata of these terms is not updated.
The speech-to-text language must be the same in the event and account vocabularies. If necessary, you can (temporarily) change the speech-to-text language in the account vocabulary, and change it back after the export has been completed.
If the account vocabulary contains text-to-text languages that are not in the event vocabulary, a warning will be displayed. If you decide to go ahead, only custom translations for corresponding languages are added (for a language that is missing, the term itself is set as custom translation).
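The merge rules for an export into an existing account vocabulary can be summarised in a Python sketch. This is a simplified model, not Clevercast's code: the dictionary layout, the key names (`stt_language`, `languages`, `terms`, `translations`) and the function `merge_terms` are hypothetical, and the case-insensitive duplicate check is an assumption based on the uniqueness rule for terms.

```python
def merge_terms(source, target):
    """Merge terms from `source` into `target` (modified in place).

    Returns the list of terms that were added.
    """
    if source["stt_language"] != target["stt_language"]:
        raise ValueError("speech-to-text languages must match")

    existing = {t.casefold() for t in target["terms"]}
    added = []
    for term, meta in source["terms"].items():
        if term.casefold() in existing:
            continue  # existing terms and their metadata stay untouched
        translations = {}
        for lang in target["languages"]:
            if lang in source["languages"]:
                # Corresponding language: copy the custom translation, if any.
                if lang in meta["translations"]:
                    translations[lang] = meta["translations"][lang]
            else:
                # Language missing in the source: use the term itself.
                translations[lang] = term
        target["terms"][term] = {**meta, "translations": translations}
        existing.add(term.casefold())
        added.append(term)
    return added
```

In this model, exporting an event vocabulary with French translations into an account vocabulary that has French and German adds each new term with its French translation and with the term itself as the German translation.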
Vocabulary import
The terms of an account vocabulary can be imported into an event vocabulary.
The import process works in the same way as the export process:
- terms that don't exist in the event vocabulary are added, including their metadata. This also happens if the speech-to-text languages differ.
- terms that already exist in the event vocabulary remain unchanged. The metadata of these terms is not updated.
The speech-to-text language must be the same in the event and account vocabularies. If necessary, you can (temporarily) change the speech-to-text language in the account vocabulary, and change it back after the import has been completed.
Ideally, the imported vocabulary should have the same text-to-text languages as the event vocabulary. If not, the import dialog will display a warning. Like the export process, an import will only add custom translations for languages that are available in both vocabularies. If the event vocabulary contains text-to-text languages that are not in the account vocabulary, the term itself is set as custom translation for those languages.
We strongly recommend that you always open the event vocabulary afterwards and go over the imported terms to check that everything is correct.