Pro-Voice (Add-on)

Studio‑grade realism trained only on your recordings—no shared data, no guesswork.

What it is

Pro Voice builds a dedicated voice model trained exclusively on your recordings. Because it learns solely from your speech, with considerably more fine-tuning and significantly longer audio samples, it produces a richer, more authentic, highly realistic sound.

Highlights

Exclusively yours: Model trained only on your audio.
Richer detail: Provide ≥30 minutes of audio (ideally ~60 minutes) to capture the nuances of your voice.
Studio polish: Advanced noise reduction & high‑fidelity synthesis for broadcast‑quality calls.
Consistent over time: Long calls and streams keep the same clear tone start‑to‑finish.

Not quite right even after setup? Email [email protected] and we’ll partner with you to dial it in.

Why upgrade

Exact match to your natural voice & accent
Higher clarity in noisy listening environments
Consistency across long sessions and varying prompts

Requirements

Audio duration: Minimum 30 minutes; 60 minutes recommended.
Quality: One clean session in a quiet space (same mic, same room, steady tone).
Format: WAV/MP3 preferred.

How to upgrade

Go to Studio → Voice and click the settings gear icon (top‑right).
In the Pro Voice banner, click Upgrade Now.
Click Continue to Payment and complete checkout ($150 / month).
After payment, email [email protected] with your training audio:
- Record or upload at least 30 minutes (60 minutes ideal).
- Follow the recording guidelines below.
- Include your Delphi handle and any pronunciation notes.

We’ll handle training and notify you when your dedicated model is live.

Recording guidelines (capture once, capture well)

Room: Silent, low‑echo space; turn off HVAC, devices, notifications.
Mic: Prefer XLR mic + interface (e.g., AT‑2020 or Rode NT1 with Focusrite). USB is OK; avoid Bluetooth and Zoom/call captures.
Distance: ~2 fists from the mic with a pop filter.
Levels: Aim for −23 dB to −18 dB RMS, with peaks below −3 dB.
Performance: Steady pace, single speaker, one language.
Editing: Trim long silences & filler words if you want a polished tone.
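
To check the level targets above before sending audio, a short script can estimate RMS and peak levels. This is a minimal sketch, not an official Delphi tool: it assumes the dB figures are measured relative to digital full scale (dBFS), it handles only 16-bit PCM WAV files, and the function names measure_levels and levels_in_range are hypothetical. Audio editors may compute RMS slightly differently, so treat the result as a rough guide.

```python
import math
import sys
import wave
from array import array

FULL_SCALE = 32768.0  # magnitude of the largest 16-bit sample


def _to_db(ratio):
    """Convert a linear amplitude ratio to decibels."""
    return 20 * math.log10(ratio) if ratio > 0 else float("-inf")


def measure_levels(path):
    """Return (rms_db, peak_db) for a 16-bit PCM WAV file.

    Assumption: levels are relative to full scale (dBFS).
    All channels are measured together.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        samples = array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    if not samples:
        raise ValueError("file contains no audio")
    peak = max(abs(s) for s in samples) / FULL_SCALE
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / FULL_SCALE
    return _to_db(rms), _to_db(peak)


def levels_in_range(rms_db, peak_db):
    """Check the targets from the guidelines: RMS -23 to -18 dB, peaks below -3 dB."""
    return -23.0 <= rms_db <= -18.0 and peak_db < -3.0
```

For example, `levels_in_range(*measure_levels("session.wav"))` returns True when a recording (here a hypothetical file named session.wav) sits inside both targets. MP3 files would need to be converted to WAV first, since Python's standard library does not decode MP3.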

What to read (suggested 30‑min script plan)

Intro & bio (3–5 min): Who you are, domains, typical questions.
Explainers (10–12 min): Teach 3–4 concepts in your natural style.
Q&A (10–12 min): Answer common questions out loud.
Pronunciation list (2–3 min): Proper nouns, names, brands; include variants.
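
The segment lengths in the plan above can be totaled to see how they compare with the 30-minute minimum. This is a simple arithmetic sketch; the dictionary and function names are illustrative only.

```python
# Segment lengths in minutes (low, high), taken from the script plan above.
SCRIPT_PLAN = {
    "Intro & bio": (3, 5),
    "Explainers": (10, 12),
    "Q&A": (10, 12),
    "Pronunciation list": (2, 3),
}


def plan_total(plan):
    """Return the (shortest, longest) total running time in minutes."""
    low = sum(short for short, _ in plan.values())
    high = sum(long for _, long in plan.values())
    return low, high


low, high = plan_total(SCRIPT_PLAN)
print(f"Planned recording time: {low}-{high} minutes")
```

The plan totals 25 to 32 minutes, so aim for the upper end of each segment to clear the 30-minute minimum.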

After training

Test in Voice Playground, then place a live call for a true check.
Fine‑tune Stability / Similarity / Speed.
Use Custom Pronunciations for tricky names.
Re‑record and retrain only if your gear/space changes significantly.

FAQs

How long does training take? We’ll email when it’s ready (timing depends on queue and audio length/quality).
Can I send multiple clips? Yes—but record them with the same mic/room for consistency. A single session is best.
What about accents? Pro Voice captures your natural accent more faithfully than defaults.
Can I downgrade later? Yes—disable the add‑on on the Add‑ons page; your default voice settings remain.
Is my audio shared? No—Pro Voice models are trained only on your submitted recordings.

Pre‑submit checklist

≥30 minutes of clean audio (ideally ~60 minutes)
One mic/room; steady tone; single speaker; one language
Levels in range (−23 to −18 dB RMS; peaks < −3 dB)
File(s) WAV/MP3 named with your Delphi handle
Email sent to [email protected] with any pronunciation notes
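
The duration and file-naming items in this checklist can be verified with a short script before emailing. This is a minimal sketch under stated assumptions: it reads WAV files only (Python's standard library does not decode MP3), it treats a file as correctly named if the handle appears anywhere in the filename, and presubmit_report is a hypothetical helper, not part of any Delphi tooling.

```python
import wave
from pathlib import Path

MIN_MINUTES = 30    # minimum from the checklist
IDEAL_MINUTES = 60  # recommended amount from the checklist


def wav_minutes(path):
    """Return the duration of a WAV file in minutes."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate() / 60


def presubmit_report(paths, handle):
    """Summarize total duration and flag files whose names lack the handle."""
    paths = [Path(p) for p in paths]
    total = sum(wav_minutes(p) for p in paths)
    return {
        "total_minutes": total,
        "meets_minimum": total >= MIN_MINUTES,
        "meets_ideal": total >= IDEAL_MINUTES,
        "misnamed": [p.name for p in paths if handle.lower() not in p.name.lower()],
    }
```

Calling `presubmit_report(["myhandle_take1.wav"], "myhandle")` (both names are placeholders) returns a dictionary showing whether the audio reaches the 30-minute and 60-minute marks and which files still need renaming.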


Last updated 5 days ago
