by DeepDiveDev
This model is a fine-tuned version of SpeechT5 adapted for Bengali text-to-speech synthesis. It converts Bengali text into speech using the speech-generation capabilities of the base model. Intended use cases include generating audio for educational applications, voice assistants, and reading services for visually impaired users. Its distinguishing feature is its specialization in Bengali, a language that is underrepresented among TTS models.
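For readers who want to try the model, the inference path described above can be sketched with the standard SpeechT5 classes from the `transformers` library. This is a minimal sketch under several assumptions that the card does not confirm: the function name `synthesize` and the parameter `model_id` are hypothetical (the repository ID is not given here), the repository is assumed to ship a compatible processor, the generic `microsoft/speecht5_hifigan` vocoder is assumed to be suitable, and the caller is assumed to supply a 512-dimensional speaker x-vector. Whether the model expects Bengali script or a romanized form as input is also not stated.

```python
SAMPLE_RATE = 16_000  # SpeechT5 with its HiFi-GAN vocoder produces 16 kHz audio
XVECTOR_DIM = 512     # size of the speaker embedding SpeechT5 expects


def synthesize(text, model_id, speaker_embedding, out_path="speech.wav"):
    """Generate speech for `text` and write it to `out_path` as a WAV file.

    `model_id` is a placeholder for this model's repository ID (assumption:
    the repository contains both the model weights and a processor).
    `speaker_embedding` is any 512-value sequence or tensor (an x-vector).
    """
    # Heavy dependencies are imported lazily so the module itself loads
    # even where they are not installed.
    import soundfile as sf
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    # SpeechT5 expects the speaker embedding with shape (1, 512).
    embedding = torch.as_tensor(speaker_embedding, dtype=torch.float32).reshape(1, -1)
    if embedding.shape[1] != XVECTOR_DIM:
        raise ValueError(f"expected {XVECTOR_DIM} values, got {embedding.shape[1]}")

    processor = SpeechT5Processor.from_pretrained(model_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
    # Assumption: the generic SpeechT5 vocoder is used to turn spectrograms into audio.
    vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

    inputs = processor(text=text, return_tensors="pt")
    with torch.no_grad():
        waveform = model.generate_speech(inputs["input_ids"], embedding, vocoder=vocoder)

    sf.write(out_path, waveform.numpy(), samplerate=SAMPLE_RATE)
    return out_path
```

A call would look like `synthesize("...", "<repo-id-of-this-model>", xvector)`, where `xvector` is a speaker embedding of the kind used in the Hugging Face SpeechT5 examples. The choice of speaker embedding strongly affects the resulting voice, so it is left to the caller rather than hard-coded.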
This model is a fine-tuned version of microsoft/speecht5_tts on an unknown dataset. On the evaluation set, the last logged validation loss is 0.6190 (step 600; see the table below).
More information needed
The hyperparameters used during training are not listed. Training and validation loss were logged every 100 steps:
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 6.1441 | 1.9422 | 100 | 0.7127 |
| 5.5876 | 3.8988 | 200 | 0.6550 |
| 5.2451 | 5.8554 | 300 | 0.6514 |
| 5.1514 | 7.8120 | 400 | 0.6227 |
| 4.9727 | 9.7687 | 500 | 0.6220 |
| 4.9797 | 11.7253 | 600 | 0.6190 |
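To make the trend in the table easier to read, the short script below computes the overall relative drop in validation loss, the gain per 100-step interval, and the approximate number of optimizer steps per epoch. It is only an illustrative sketch: the values are copied from the table, while the names `LOG` and `relative_drop` are introduced here and are not part of the original training code.

```python
# (step, epoch, training_loss, validation_loss), copied from the table above
LOG = [
    (100, 1.9422, 6.1441, 0.7127),
    (200, 3.8988, 5.5876, 0.6550),
    (300, 5.8554, 5.2451, 0.6514),
    (400, 7.8120, 5.1514, 0.6227),
    (500, 9.7687, 4.9727, 0.6220),
    (600, 11.7253, 4.9797, 0.6190),
]


def relative_drop(first, last):
    """Fraction by which `last` is lower than `first`."""
    return (first - last) / first


val_losses = [row[3] for row in LOG]

# Overall improvement between the first and last logged evaluation.
total_drop = relative_drop(val_losses[0], val_losses[-1])

# Absolute improvement in validation loss over each 100-step interval.
step_gains = [prev - cur for prev, cur in zip(val_losses, val_losses[1:])]

# Steps per epoch, estimated from the first and last logged rows.
steps_per_epoch = (LOG[-1][0] - LOG[0][0]) / (LOG[-1][1] - LOG[0][1])

print(f"relative drop in validation loss: {total_drop:.1%}")
for (step, *_), gain in zip(LOG[1:], step_gains):
    print(f"gain up to step {step}: {gain:.4f}")
print(f"approx. steps per epoch: {steps_per_epoch:.1f}")
```

Read this way, the validation loss falls by roughly 13% between step 100 and step 600, with the largest single gain in the first interval and very little change after step 400, which suggests that training had largely plateaued by the end of the log.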