Speech Synthesis (Text-to-Speech / TTS)
Speech synthesis, or text-to-speech (TTS), converts written text into spoken audio in a human-like voice. It is often paired with a speech-to-text (STT) system: STT converts speech to text, and TTS converts text back to speech. A TTS system uses a speech synthesizer to produce intelligible speech from written input, and it is used in applications ranging from screen readers for visually impaired users to voice bots and conversational IVR in enterprise contact centers.
For enterprise voice AI deployments, TTS is the output layer that determines how the bot sounds to customers. Voice quality, naturalness, and brand alignment all depend on the TTS engine and configuration selected.
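As a rough illustration of what "configuration" can mean in practice, many TTS engines accept input marked up in SSML (Speech Synthesis Markup Language, a W3C standard), where the voice, speaking rate, and pitch are set per utterance. The sketch below builds such a request body in Python. The VoiceConfig class, the build_ssml function, and the voice name "example-voice" are hypothetical names chosen for this example, and which SSML elements a given engine honors varies by vendor, so treat this as a sketch under those assumptions rather than any particular product's API.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape, quoteattr


@dataclass
class VoiceConfig:
    """Hypothetical bundle of settings a deployment might tune."""
    voice_name: str = "example-voice"  # placeholder, not a real vendor voice
    rate: str = "medium"               # SSML prosody rate, e.g. slow/medium/fast
    pitch: str = "medium"              # SSML prosody pitch, e.g. low/medium/high


def build_ssml(text: str, config: VoiceConfig) -> str:
    """Wrap plain text in an SSML document that applies the given settings."""
    return (
        '<speak version="1.1" '
        'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f"<voice name={quoteattr(config.voice_name)}>"
        f"<prosody rate={quoteattr(config.rate)} pitch={quoteattr(config.pitch)}>"
        f"{escape(text)}"  # escape &, <, > so customer text cannot break the markup
        "</prosody></voice></speak>"
    )


if __name__ == "__main__":
    # A slower delivery for a confirmation prompt in a contact-center bot.
    print(build_ssml("Your order has shipped.", VoiceConfig(rate="slow")))
```

The resulting string would then be sent to whichever TTS engine the deployment uses. Keeping the settings in one small object makes it easier to hold the bot's voice consistent across prompts.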