
We have open-sourced Fish Audio S2, a next-generation expressive text-to-speech (TTS) system that allows you to direct voices using natural language instructions. Add cues like [whisper] or [nervous laugh], generate multi-speaker dialogues in a single pass, and create ultra-realistic voices in over 80 languages.