SSML
Speech Synthesis Markup Language (SSML) is an XML-based W3C standard that provides fine-grained control over how synthesised speech sounds. SSML tags specify pauses, emphasis, speaking rate, pitch, the pronunciation of abbreviations and numbers, and the insertion of pre-recorded audio clips. For example, SSML can instruct a TTS engine to spell out an account number digit by digit rather than reading it as a whole number, or to pause before delivering important information. This level of control is essential for producing professional-quality, brand-appropriate voice interactions. NiCE Cognigy Voice Gateway supports full SSML, enabling precise voice design across all supported TTS engines.
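The account-number example can be sketched in a few lines of Python. This is a minimal illustration, not a Voice Gateway API: the helper name `account_number_prompt`, the 500 ms pause, and the prompt wording are assumptions made for the example. The `interpret-as="digits"` value is also an assumption; the values accepted by `say-as` vary by TTS engine, so check the documentation of the engine in use. Some engines additionally expect `version` and `xmlns` attributes on `<speak>`, which are omitted here for brevity.

```python
import xml.etree.ElementTree as ET


def account_number_prompt(account_number: str) -> str:
    """Build an SSML string that pauses, then reads the number digit by digit.

    Hypothetical helper for illustration only.
    """
    speak = ET.Element("speak")
    speak.text = "Your account number is "

    # Pause briefly before the important information (assumed 500 ms).
    ET.SubElement(speak, "break", {"time": "500ms"})

    # Ask the engine to read the value as individual digits.
    # "digits" is an assumed interpret-as value; support varies by engine.
    say_as = ET.SubElement(speak, "say-as", {"interpret-as": "digits"})
    say_as.text = account_number
    say_as.tail = ". "

    # Stress the closing reminder.
    emphasis = ET.SubElement(speak, "emphasis", {"level": "strong"})
    emphasis.text = "Please keep it confidential."

    # Serialise to a string; ElementTree escapes special characters in text.
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(account_number_prompt("12345678"))
```

Building the markup with an XML library rather than string concatenation keeps the output well-formed even when the spoken text contains characters such as `&` or `<`.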
For enterprise teams, SSML matters because real-world outcomes depend on how the capability is integrated, governed, and measured, not just on the underlying TTS technology.