VoicePersona
A voice that matches the ear it's speaking into
A speech layer that gives every outbound or inbound call a chosen voice — any language you need, warm or formal, neutral or technical — with live A/B assignment that learns which voice each contact responds to over time.
- All languages AIM supports, several registers each
- Live A/B learning, per contact
- 0 voices shipped without domain-term audit
VoicePersona is the service that decides, turn by turn, which voice speaks on the call. It hosts a bank of pre-auditioned voices across every supported language and several registers per language — neutral, warmer, older, younger, with regional accents where they matter — and chooses which one to use for each call the stack places or receives.
The choice is not random. A new contact gets a reasonable default based on language and region. Over time the system A/B tests voices against the metric that matters for the product (pickup-to-booking, pickup-to-close, pickup-to-renewal or pickup-to-registration) and settles on the voice that each individual contact picks up for most often, listens to longest and converts through. Two years into a programme, different contacts may each be hearing a different voice.
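To make the per-contact learning concrete, here is a minimal sketch of one way it could work. It assumes an epsilon-greedy strategy, which is a stand-in chosen for brevity and not a description of the actual VoicePersona algorithm. The class names, voice ids and the single "converted" outcome are all hypothetical.

```python
import random
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class VoiceStats:
    calls: int = 0        # calls placed to this contact with this voice
    conversions: int = 0  # calls that reached the target outcome

    @property
    def rate(self) -> float:
        return self.conversions / self.calls if self.calls else 0.0


@dataclass
class ContactVoicePicker:
    """Hypothetical per-contact chooser (epsilon-greedy is an assumption)."""
    voices: list                 # tenant shortlist; first entry is the default
    epsilon: float = 0.1         # share of calls spent exploring other voices
    rng: random.Random = field(default_factory=random.Random)
    stats: dict = field(default_factory=lambda: defaultdict(VoiceStats))

    def choose(self, contact_id: str) -> str:
        tried = [v for v in self.voices
                 if self.stats[(contact_id, v)].calls > 0]
        if not tried:
            # New contact: fall back to the language/region default.
            return self.voices[0]
        if self.rng.random() < self.epsilon:
            # Explore: occasionally try any voice on the shortlist.
            return self.rng.choice(self.voices)
        # Exploit: the voice with the best conversion rate for this contact.
        return max(tried, key=lambda v: self.stats[(contact_id, v)].rate)

    def record(self, contact_id: str, voice: str, converted: bool) -> None:
        s = self.stats[(contact_id, voice)]
        s.calls += 1
        s.conversions += int(converted)


# epsilon=0.0 disables exploration so the example is deterministic.
picker = ContactVoicePicker(voices=["en-GB-warm", "en-GB-formal"], epsilon=0.0)
first = picker.choose("contact-1")             # no history yet: default voice
picker.record("contact-1", "en-GB-warm", converted=False)
picker.record("contact-1", "en-GB-formal", converted=True)
best = picker.choose("contact-1")              # now prefers the formal voice
```

Statistics are keyed by contact and voice, so each contact converges independently. A production system would presumably weigh pickup and listen time as well as conversion; this sketch tracks conversion only.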
Every voice is reviewed for cadence, pronunciation of domain terms and handling of pauses. This is not a hobbyist TTS — a B2B sales call cannot mangle a product name, a collections call cannot sound robotic, a healthcare call cannot botch clinical vocabulary. The platform maintains a vetted shortlist per tenant and retires voices that stop performing.
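The "no voice ships without a domain-term audit" rule can be pictured as a simple gate. The sketch below is illustrative only: the `VoiceAudit` class, the glossary contents and the case-insensitive matching are assumptions, not the platform's actual review tooling.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceAudit:
    """Hypothetical record of which domain terms a reviewer has approved."""
    voice_id: str
    approved: set = field(default_factory=set)

    def approve(self, term: str) -> None:
        # Store terms case-insensitively so "Copay" and "copay" match.
        self.approved.add(term.casefold())

    def pending(self, glossary) -> set:
        # Terms in the tenant glossary that still need a pronunciation review.
        return {t for t in glossary if t.casefold() not in self.approved}

    def can_ship(self, glossary) -> bool:
        # A voice ships only when nothing is left to audit.
        return not self.pending(glossary)


glossary = {"Metformin", "copay", "prior authorization"}  # example terms
audit = VoiceAudit("en-US-formal")
audit.approve("metformin")
audit.approve("Copay")

pending = audit.pending(glossary)            # terms flagged for audit
shippable_before = audit.can_ship(glossary)
audit.approve("prior authorization")
shippable_after = audit.can_ship(glossary)
```

Adding a new term to the glossary automatically puts every voice back into a pending state for that term, which mirrors the idea of flagging new vocabulary for review.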
Everything VoicePersona handles for you
- Multilingual bank: Every language AIM supports, each in several registers (formal, warm, technical, concise).
- Per-contact A/B: The system converges each contact on the voice that maximises that contact's own pickup-to-conversion rate.
- Domain-term hygiene: Pronunciation of industry vocabulary is reviewed per language, and new terms are flagged for audit.
- Cadence preservation: Natural pauses, soft hand-offs and disfluencies are kept intact, with no uncanny, stacked-syllable delivery.
- Quick persona swap: An operator can retire a voice or A/B a new one without touching the rest of the stack.
The voice layer of every call
VoicePersona is a specialised surface — it renders speech into VoiceDialer, takes instructions from AgentBuilder, and is tuned per tenant by the platform admin.
- VoiceDialer plays the rendered audio into the live phone call.
- AgentBuilder drives what the voice says and when; persona renders, agent composes.
- SuperAdmin manages the persona bank, retires voices and controls which personas are active per tenant.
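The division of labour above can be sketched in a few lines. Everything here is a hypothetical stand-in: `PersonaBank` plays the part SuperAdmin controls, `render` stands in for VoicePersona (returning tagged bytes instead of real audio), and `Dialer` stands in for VoiceDialer. None of these names come from the real services.

```python
from dataclasses import dataclass, field


@dataclass
class PersonaBank:
    """Stand-in for the per-tenant persona list that SuperAdmin manages."""
    active: dict = field(default_factory=dict)  # tenant -> set of voice ids

    def activate(self, tenant: str, voice_id: str) -> None:
        self.active.setdefault(tenant, set()).add(voice_id)

    def retire(self, tenant: str, voice_id: str) -> None:
        self.active.get(tenant, set()).discard(voice_id)

    def is_active(self, tenant: str, voice_id: str) -> bool:
        return voice_id in self.active.get(tenant, set())


def render(bank: PersonaBank, tenant: str, voice_id: str, text: str) -> bytes:
    """Stand-in for VoicePersona: turn the agent's text into 'audio'."""
    if not bank.is_active(tenant, voice_id):
        raise LookupError(f"{voice_id} is not active for {tenant}")
    # Placeholder payload; a real renderer would return synthesised speech.
    return f"[{voice_id}] {text}".encode("utf-8")


class Dialer:
    """Stand-in for VoiceDialer: records what would be played on the call."""
    def __init__(self) -> None:
        self.played = []

    def play(self, audio: bytes) -> None:
        self.played.append(audio)


bank = PersonaBank()
bank.activate("acme", "en-GB-warm")
dialer = Dialer()

# The agent composes the line, the persona renders it, the dialer plays it.
dialer.play(render(bank, "acme", "en-GB-warm", "Hello, this is Acme."))

# Retiring a voice touches only the bank; agent and dialer are unchanged.
bank.retire("acme", "en-GB-warm")
try:
    render(bank, "acme", "en-GB-warm", "Hello again.")
    retired_blocked = False
except LookupError:
    retired_blocked = True
```

Because the agent only supplies text and the dialer only receives audio, swapping or retiring a persona is a change to the bank alone, which is the property the quick persona swap relies on.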
Wire VoicePersona into your product today
Book a consultation with our founders and we'll walk you through the whole microservice stack — not just this one — live on your domain.