urdu speecht5 finetuned

by ahmedjaved812

Open source · 1k downloads · 0 likes

About

This model is a fine-tuned version of SpeechT5 adapted for Urdu speech synthesis: it converts Urdu text into speech. It is aimed at applications that need an Urdu voice, such as voice assistants, audiobooks, and accessibility tools for Urdu speakers. The model card does not name the fine-tuning dataset and reports only a validation loss, so any advantage in naturalness or prosody over generic models is not quantified.
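As a rough illustration of the text-to-speech use described above, the following is a minimal sketch built on the standard SpeechT5 classes in Transformers. Several things here are assumptions rather than facts from the model card: the repository id `ahmedjaved812/urdu-speecht5-finetuned` (inferred from the author and model name), that the repository ships its own processor files, that the stock `microsoft/speecht5_hifigan` vocoder is appropriate, and that the caller supplies a suitable 512-dimensional speaker embedding. The helper name `synthesize_urdu` is hypothetical.

```python
# Assumed identifiers: none of these are confirmed by the model card.
MODEL_ID = "ahmedjaved812/urdu-speecht5-finetuned"  # inferred repository id
VOCODER_ID = "microsoft/speecht5_hifigan"           # stock SpeechT5 vocoder
SAMPLE_RATE = 16_000                                # SpeechT5 produces 16 kHz audio
XVECTOR_DIM = 512                                   # size of a SpeechT5 speaker embedding


def synthesize_urdu(text, speaker_embedding, model_id=MODEL_ID):
    """Return a 1-D waveform tensor (16 kHz) for `text`.

    `speaker_embedding` is expected to be a torch tensor of shape (1, 512),
    for example an x-vector extracted from a reference recording.
    """
    if tuple(speaker_embedding.shape) != (1, XVECTOR_DIM):
        raise ValueError(
            f"speaker_embedding must have shape (1, {XVECTOR_DIM}), "
            f"got {tuple(speaker_embedding.shape)}"
        )

    # Heavy dependencies are imported lazily so that the shape check above
    # can run even where torch/transformers are not installed.
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    processor = SpeechT5Processor.from_pretrained(model_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
    vocoder = SpeechT5HifiGan.from_pretrained(VOCODER_ID)

    # Tokenize the text, then generate a spectrogram and vocode it to audio.
    inputs = processor(text=text, return_tensors="pt")
    with torch.no_grad():
        return model.generate_speech(
            inputs["input_ids"], speaker_embedding, vocoder=vocoder
        )
```

The returned tensor could then be written to disk with, for instance, `soundfile.write("out.wav", waveform.numpy(), samplerate=SAMPLE_RATE)`. Output quality depends heavily on the speaker embedding, which the card does not document.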

Documentation

urdu-speecht5-finetuned

This model is a fine-tuned version of microsoft/speecht5_tts on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8700

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 6
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 48
  • total_eval_batch_size: 4
  • optimizer: OptimizerNames.ADAMW_TORCH_FUSED (fused AdamW) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 70
  • mixed_precision_training: Native AMP
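The total batch sizes and the learning-rate schedule follow from the other values in this list, and the arithmetic can be sketched in a few lines. The per-device sizes, device count, accumulation steps, learning rate, and warmup steps are taken from the list; `TOTAL_STEPS` is an assumption, estimated as 70 epochs times roughly 329 optimizer steps per epoch (the results table logs step 500 at epoch 1.5198). The function `linear_lr` is a hypothetical helper that follows the usual linear-warmup, linear-decay formula, not code from the training run.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 1e-5
TRAIN_BATCH_SIZE = 6   # per device
EVAL_BATCH_SIZE = 2    # per device
NUM_DEVICES = 2
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 500

# Assumption: not stated in the card; estimated as 70 epochs x ~329 steps/epoch.
TOTAL_STEPS = 23_030

# Samples consumed per optimizer step: per-device batch x devices x accumulation.
total_train_batch_size = TRAIN_BATCH_SIZE * NUM_DEVICES * GRAD_ACCUM_STEPS

# Evaluation does not accumulate gradients, so only the device count multiplies.
total_eval_batch_size = EVAL_BATCH_SIZE * NUM_DEVICES


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at `step`: linear ramp from 0, then linear decay to 0."""
    if step < warmup:
        return base_lr * (step / max(1, warmup))
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(total_train_batch_size)  # 48
print(total_eval_batch_size)   # 4
print(linear_lr(250))          # halfway through warmup: 5e-06
print(linear_lr(500))          # peak: 1e-05
```

Under these assumptions the learning rate peaks at step 500, which is also the first row of the results table, and then decays for the remainder of training.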

Training results

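| Training Loss | Epoch   | Step  | Validation Loss |
|---------------|---------|-------|-----------------|
| 4.6416        | 1.5198  | 500   | 1.0342          |
| 4.2408        | 3.0395  | 1000  | 0.9665          |
| 4.0776        | 4.5593  | 1500  | 0.9377          |
| 4.0257        | 6.0790  | 2000  | 0.9284          |
| 3.9388        | 7.5988  | 2500  | 0.9060          |
| 3.8638        | 9.1185  | 3000  | 0.9002          |
| 3.8240        | 10.6383 | 3500  | 0.8884          |
| 3.7701        | 12.1581 | 4000  | 0.8894          |
| 3.7587        | 13.6778 | 4500  | 0.8772          |
| 3.7120        | 15.1976 | 5000  | 0.8787          |
| 3.6871        | 16.7173 | 5500  | 0.8724          |
| 3.6936        | 18.2371 | 6000  | 0.8732          |
| 3.6681        | 19.7568 | 6500  | 0.8782          |
| 3.6397        | 21.2766 | 7000  | 0.8798          |
| 3.6289        | 22.7964 | 7500  | 0.8654          |
| 3.6120        | 24.3161 | 8000  | 0.8669          |
| 3.6059        | 25.8359 | 8500  | 0.8608          |
| 3.5933        | 27.3556 | 9000  | 0.8610          |
| 3.5507        | 28.8754 | 9500  | 0.8674          |
| 3.5522        | 30.3951 | 10000 | 0.8633          |
| 3.5674        | 31.9149 | 10500 | 0.8654          |
| 3.5469        | 33.4347 | 11000 | 0.8605          |
| 3.5538        | 34.9544 | 11500 | 0.8577          |
| 3.5262        | 36.4742 | 12000 | 0.8677          |
| 3.5307        | 37.9939 | 12500 | 0.8621          |
| 3.5248        | 39.5137 | 13000 | 0.8601          |
| 3.5209        | 41.0334 | 13500 | 0.8564          |
| 3.5113        | 42.5532 | 14000 | 0.8597          |
| 3.5083        | 44.0729 | 14500 | 0.8650          |
| 3.5342        | 45.5927 | 15000 | 0.8595          |
| 3.4962        | 47.1125 | 15500 | 0.8660          |
| 3.4923        | 48.6322 | 16000 | 0.8640          |
| 3.4882        | 50.1520 | 16500 | 0.8669          |
| 3.4894        | 51.6717 | 17000 | 0.8677          |
| 3.4748        | 53.1915 | 17500 | 0.8645          |
| 3.4710        | 54.7112 | 18000 | 0.8662          |
| 3.4755        | 56.2310 | 18500 | 0.8673          |
| 3.4795        | 57.7508 | 19000 | 0.8628          |
| 3.4528        | 59.2705 | 19500 | 0.8697          |
| 3.4802        | 60.7903 | 20000 | 0.8746          |
| 3.4582        | 62.3100 | 20500 | 0.8695          |
| 3.4559        | 63.8298 | 21000 | 0.8697          |
| 3.4333        | 65.3495 | 21500 | 0.8690          |
| 3.4699        | 66.8693 | 22000 | 0.8696          |
| 3.4595        | 68.3891 | 22500 | 0.8700          |
| 3.4625        | 69.9088 | 23000 | 0.8700          |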
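A few figures can be read off the training log with simple arithmetic, sketched below. The validation losses are copied from the results above, one value per 500-step evaluation. The steps-per-epoch figure and the per-epoch sample count are estimates derived from the logged epoch fractions and the total train batch size of 48; the card itself does not state the dataset size.

```python
# Validation loss at steps 500, 1000, ..., 23000 (copied from the results).
VAL_LOSS = [
    1.0342, 0.9665, 0.9377, 0.9284, 0.9060, 0.9002, 0.8884, 0.8894,
    0.8772, 0.8787, 0.8724, 0.8732, 0.8782, 0.8798, 0.8654, 0.8669,
    0.8608, 0.8610, 0.8674, 0.8633, 0.8654, 0.8605, 0.8577, 0.8677,
    0.8621, 0.8601, 0.8564, 0.8597, 0.8650, 0.8595, 0.8660, 0.8640,
    0.8669, 0.8677, 0.8645, 0.8662, 0.8673, 0.8628, 0.8697, 0.8746,
    0.8695, 0.8697, 0.8690, 0.8696, 0.8700, 0.8700,
]
EVAL_EVERY = 500             # steps between evaluations
TOTAL_TRAIN_BATCH_SIZE = 48  # from the hyperparameter list

# Pair each loss with its step number and pick the lowest validation loss.
steps = [EVAL_EVERY * (i + 1) for i in range(len(VAL_LOSS))]
best_loss, best_step = min(zip(VAL_LOSS, steps))
final_loss = VAL_LOSS[-1]

# Step 500 is logged at epoch 1.5198, which gives the steps in one epoch.
steps_per_epoch = round(500 / 1.5198)

# Rough upper estimate of training samples seen per epoch.
approx_samples_per_epoch = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

print(best_step, best_loss)                # 13500 0.8564
print(round(final_loss - best_loss, 4))    # 0.0136
print(steps_per_epoch)                     # 329
print(approx_samples_per_epoch)            # 15792
```

The lowest validation loss in the log, 0.8564 at step 13500 (epoch 41.0334), is slightly better than the final value of 0.8700 reported as the evaluation result, while the training loss keeps falling. Whether an earlier checkpoint was retained is not stated in the card.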

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.8.3
  • Tokenizers 0.22.2

Capabilities & Tags

  • transformers
  • safetensors
  • speecht5
  • text-to-audio
  • generated_from_trainer
  • endpoints_compatible

Specifications

  • Category: Audio
  • Access: API & Local
  • License: Open Source
  • Pricing: Open Source