AI Explorer

Find and compare the best artificial intelligence tools for your projects.




speecht5 finetuned

by mustafoyev202

Open source · 264 downloads · 1 like

0.4 (1 review) · Audio · API & Local
About

This model is a refined version of *SpeechT5*, specialized in text-to-speech synthesis. It converts written sentences into natural, intelligible speech, with improved vocal quality over the base version. Its primary use cases include creating voiceovers, assisting visually impaired users, and generating audio content for multimedia applications. It was fine-tuned on an unspecified dataset, with training optimized to minimize evaluation loss while preserving natural fluency and expressiveness. Its robust architecture and solid performance make it a versatile option for projects that need efficient text-to-speech conversion.
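For readers who want to try the model, a minimal inference sketch with the Hugging Face `transformers` SpeechT5 API is shown below. The repo id `mustafoyev202/speecht5_finetuned` is an assumption inferred from the author and model name on this page (verify it before use), and the zero speaker embedding is a placeholder:

```python
def synthesize(text, repo_id="mustafoyev202/speecht5_finetuned", out_path="speech.wav"):
    """Sketch: synthesize `text` to a 16 kHz WAV file.

    Assumes the model lives at `repo_id` (inferred from this page, not
    confirmed). Downloads weights on first call.
    """
    import torch
    import soundfile as sf
    from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

    processor = SpeechT5Processor.from_pretrained(repo_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(repo_id)
    vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

    inputs = processor(text=text, return_tensors="pt")
    # SpeechT5 conditions generation on a 512-dim x-vector speaker embedding;
    # zeros are a crude placeholder -- real use should load embeddings from a
    # speaker-verification dataset for a natural-sounding voice.
    speaker_embeddings = torch.zeros((1, 512))
    speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)

    sf.write(out_path, speech.numpy(), samplerate=16000)  # SpeechT5 outputs 16 kHz audio
    return out_path
```

Calling `synthesize("Hello world")` would write `speech.wav` next to the script; for batch generation you would reuse the loaded models rather than re-instantiating them per call.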

Documentation

speecht5_finetuned

This model is a fine-tuned version of microsoft/speecht5_tts on an unknown dataset. It achieves the following results on the evaluation set:

  • eval_loss: 0.3355
  • eval_runtime: 5.4808
  • eval_samples_per_second: 40.87
  • eval_steps_per_second: 10.218
  • epoch: 559.5238
  • step: 35250
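The runtime figures above are internally consistent; multiplying the reported rates by the runtime recovers the evaluation-set size and the eval batch size (a back-of-the-envelope check, derived here rather than stated in the card):

```python
# Consistency check on the reported evaluation metrics.
eval_runtime = 5.4808          # seconds
samples_per_second = 40.87
steps_per_second = 10.218

eval_samples = samples_per_second * eval_runtime   # ~224 samples in the eval set
eval_steps = steps_per_second * eval_runtime       # ~56 eval batches
implied_batch_size = eval_samples / eval_steps     # ~4, matching eval_batch_size below

print(round(eval_samples), round(eval_steps), round(implied_batch_size))  # 224 56 4
```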

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 300
  • training_steps: 100000
  • mixed_precision_training: Native AMP
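As a sanity check on the configuration above: the effective batch size follows from train_batch_size × gradient_accumulation_steps, and combining the reported epoch and step counts gives a rough estimate of the training-set size (an inference, not stated in the card):

```python
# Derived quantities from the training hyperparameters.
train_batch_size = 8
gradient_accumulation_steps = 4
effective_batch = train_batch_size * gradient_accumulation_steps  # 32, matching total_train_batch_size

step, epoch = 35250, 559.5238
steps_per_epoch = step / epoch                                  # ~63 optimizer steps per epoch
approx_dataset_size = round(steps_per_epoch) * effective_batch  # ~2016 training samples

print(effective_batch, round(steps_per_epoch), approx_dataset_size)  # 32 63 2016
```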

Framework versions

  • Transformers 4.52.3
  • Pytorch 2.7.0+cu118
  • Datasets 3.6.0
  • Tokenizers 0.21.1
Capabilities & Tags
transformers · tensorboard · safetensors · speecht5 · text-to-audio · generated_from_trainer · endpoints_compatible
Specifications

  • Category: Audio
  • Access: API & Local
  • License: Open Source
  • Pricing: Open Source
  • Rating: 0.4
