by mimba
The *speecht5-ngiemboon* model is a fine-tuned version of *microsoft/speecht5_tts* for speech synthesis. It converts text into speech and aims for a clear, expressive voice with intonation and fluency close to human speech. Its primary use cases include generating automated audio content, assisting visually impaired users, and providing voices for virtual assistants. It can be used with the standard tooling for SpeechT5 models.
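As a rough illustration of the text-to-speech use described above, here is a minimal sketch using the SpeechT5 classes from the `transformers` library. Several things here are assumptions not stated in the card: the repository id `mimba/speecht5-ngiemboon`, that the repository ships its own processor files (if not, the processor from `microsoft/speecht5_tts` may work instead), the choice of the `microsoft/speecht5_hifigan` vocoder, and the helper name `synthesize`. SpeechT5 also needs a speaker embedding (a tensor of shape `(1, 512)`), which the caller has to supply.

```python
def synthesize(text, speaker_embeddings, model_id="mimba/speecht5-ngiemboon"):
    """Return a 1-D waveform tensor for `text`.

    `model_id` is an assumed repository id, not confirmed by the model card.
    `speaker_embeddings` is a torch tensor of shape (1, 512) chosen by the caller.
    """
    # Imported lazily so that defining this helper does not require the libraries.
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    # Assumption: the fine-tuned repository includes tokenizer/processor files.
    processor = SpeechT5Processor.from_pretrained(model_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
    # Assumption: the stock HiFi-GAN vocoder is used to turn spectrograms into audio.
    vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

    inputs = processor(text=text, return_tensors="pt")
    return model.generate_speech(
        inputs["input_ids"], speaker_embeddings, vocoder=vocoder
    )
```

The returned waveform can be written to disk with any audio library, for example `soundfile.write("out.wav", waveform.numpy(), samplerate=16000)`, since SpeechT5 produces 16 kHz audio.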
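This model is a fine-tuned version of microsoft/speecht5_tts on an unknown dataset. The summary of evaluation results is missing here; the last logged row of the training results reports a validation loss of 0.5604 at step 10000.
The model description, intended uses and limitations, and training and evaluation data are not yet documented (more information needed).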
The hyperparameters used during training are not listed here. The following results were logged during training:
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.5579 | 0.7252 | 500 | 0.7537 |
| 1.3604 | 1.4496 | 1000 | 0.6883 |
| 1.2814 | 2.1740 | 1500 | 0.6343 |
| 1.2308 | 2.8992 | 2000 | 0.6164 |
| 1.2133 | 3.6236 | 2500 | 0.6057 |
| 1.1829 | 4.3481 | 3000 | 0.5962 |
| 1.1787 | 5.0725 | 3500 | 0.5967 |
| 1.1692 | 5.7977 | 4000 | 0.5852 |
| 1.1472 | 6.5221 | 4500 | 0.5811 |
| 1.1374 | 7.2466 | 5000 | 0.5768 |
| 1.1643 | 7.9717 | 5500 | 0.5711 |
| 1.1385 | 8.6962 | 6000 | 0.5713 |
| 1.1334 | 9.4206 | 6500 | 0.5670 |
| 1.1564 | 10.1450 | 7000 | 0.5684 |
| 1.1158 | 10.8702 | 7500 | 0.5622 |
| 1.1158 | 11.5946 | 8000 | 0.5628 |
| 1.1149 | 12.3191 | 8500 | 0.5611 |
| 1.1088 | 13.0435 | 9000 | 0.5597 |
| 1.1191 | 13.7687 | 9500 | 0.5615 |
| 1.1097 | 14.4931 | 10000 | 0.5604 |
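
To read the table above more easily, the following sketch copies its rows into a string and extracts the checkpoint with the lowest validation loss, together with an estimate of the number of optimizer steps per epoch. The variable names are illustrative, and the steps-per-epoch figure is only inferred from the logged `Epoch` and `Step` columns, not taken from a documented batch size or dataset size.

```python
# Rows copied from the training results table:
# training loss, epoch, step, validation loss.
LOG = """
1.5579 0.7252 500 0.7537
1.3604 1.4496 1000 0.6883
1.2814 2.1740 1500 0.6343
1.2308 2.8992 2000 0.6164
1.2133 3.6236 2500 0.6057
1.1829 4.3481 3000 0.5962
1.1787 5.0725 3500 0.5967
1.1692 5.7977 4000 0.5852
1.1472 6.5221 4500 0.5811
1.1374 7.2466 5000 0.5768
1.1643 7.9717 5500 0.5711
1.1385 8.6962 6000 0.5713
1.1334 9.4206 6500 0.5670
1.1564 10.1450 7000 0.5684
1.1158 10.8702 7500 0.5622
1.1158 11.5946 8000 0.5628
1.1149 12.3191 8500 0.5611
1.1088 13.0435 9000 0.5597
1.1191 13.7687 9500 0.5615
1.1097 14.4931 10000 0.5604
"""

rows = [tuple(float(x) for x in line.split()) for line in LOG.strip().splitlines()]

# Row with the lowest validation loss (column index 3).
best = min(rows, key=lambda r: r[3])
best_step, best_val = int(best[2]), best[3]

# Last logged row, and a steps-per-epoch estimate derived from it.
final = rows[-1]
final_step, final_val = int(final[2]), final[3]
steps_per_epoch = final[2] / final[1]

print(f"best validation loss {best_val:.4f} at step {best_step}")
print(f"final validation loss {final_val:.4f} at step {final_step}")
print(f"about {steps_per_epoch:.0f} steps per epoch")
```

In this log the lowest validation loss (0.5597) occurs at step 9000, slightly before the final step, and the validation loss changes by less than 0.002 between steps 7500 and 10000, which suggests training had largely plateaued by then.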