DMOSpeech 2: Zero-Shot Text-to-Speech
Generate natural speech in any voice from just a short reference audio clip!
NOTE: This entire space was generated by Claude for demo purposes. It may contain glitches or bugs, so it may not demonstrate the model's real performance.
This space will be retired once a better, cleaner space is available.
Show detailed generation steps
Quick Tips:
- Auto-transcription: Leave the reference text empty to have the reference audio transcribed automatically
- Student Only: Fastest (4 steps), good quality
- Teacher-Guided: Best balance (8 steps), recommended
- High Diversity: More natural prosody (16 steps)
- Custom Mode: Fine-tune all parameters
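Expected RTF (Real-Time Factor, i.e. generation time divided by the duration of the generated audio; lower is faster):
- Student Only: ~0.05 (20x faster than real-time)
- Teacher-Guided: ~0.10 (10x faster than real-time)
- High Diversity: ~0.20 (5x faster than real-time)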
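
To make the RTF figures above concrete, here is a minimal Python sketch of the arithmetic, assuming the usual definition of RTF as generation time divided by audio duration. The function names and the `EXPECTED_RTF` table are illustrative only and are not part of DMOSpeech 2. The values are the approximate ones listed above, and actual speed depends on hardware.

```python
# Approximate RTF values copied from the list above (hypothetical table name).
EXPECTED_RTF = {
    "Student Only": 0.05,
    "Teacher-Guided": 0.10,
    "High Diversity": 0.20,
}


def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent generating / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def speedup(rtf: float) -> float:
    """How many times faster than real-time a given RTF is (1 / RTF)."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


def estimated_generation_seconds(audio_seconds: float, mode: str) -> float:
    """Rough generation time for a clip of the given length in a given mode."""
    return audio_seconds * EXPECTED_RTF[mode]


if __name__ == "__main__":
    for mode, rtf in EXPECTED_RTF.items():
        seconds = estimated_generation_seconds(10.0, mode)
        print(f"{mode}: RTF {rtf:.2f}, {speedup(rtf):.0f}x real-time, "
              f"about {seconds:.1f} s for 10 s of speech")
```

Under these assumed figures, a 10-second utterance would take roughly 0.5 s in Student Only mode, 1 s in Teacher-Guided mode, and 2 s in High Diversity mode.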