speecht5 finetuned

by schalor

Open source · 486 downloads · 0 likes

About

This model is a fine-tuned version of microsoft/speecht5_tts for speech synthesis: it converts text into spoken audio. Its primary use cases include creating voiceovers, assisting visually impaired users, and generating automated audio content. The listing describes improved sound quality and more natural intonation relative to the base model, attributed to training on specialized data, while retaining the base model's robustness.
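
As an illustration of the text-to-speech use described above, here is a minimal sketch using the SpeechT5 classes from the transformers library. Several things in it are assumptions rather than facts from this page: the repository id `schalor/speecht5_finetuned` is inferred from the author and model name, the processor and vocoder are loaded from the base `microsoft/speecht5_tts` and `microsoft/speecht5_hifigan` checkpoints, and the all-zero speaker embedding is only a placeholder (a real 512-dimensional x-vector should give far better audio). The `synthesize` helper and its parameters are hypothetical names introduced for this example.

```python
# Minimal text-to-speech sketch. Heavy imports are deferred into the function
# so that merely defining it does not require torch/transformers/soundfile.

SAMPLE_RATE = 16_000                               # SpeechT5 produces 16 kHz audio
BASE_MODEL_ID = "microsoft/speecht5_tts"           # base checkpoint named on this page
VOCODER_ID = "microsoft/speecht5_hifigan"          # assumption: standard SpeechT5 vocoder
FINETUNED_MODEL_ID = "schalor/speecht5_finetuned"  # assumption: inferred repository id


def synthesize(text, out_path="speech.wav", model_id=FINETUNED_MODEL_ID):
    """Synthesize `text` to a WAV file and return the output path."""
    import soundfile as sf
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    # Assumption: the fine-tuned repo may not ship processor files,
    # so the processor is taken from the base model.
    processor = SpeechT5Processor.from_pretrained(BASE_MODEL_ID)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
    vocoder = SpeechT5HifiGan.from_pretrained(VOCODER_ID)

    inputs = processor(text=text, return_tensors="pt")

    # Placeholder speaker embedding (assumption): replace with a real
    # 512-dimensional x-vector for the target voice.
    speaker_embeddings = torch.zeros((1, 512))

    with torch.no_grad():
        speech = model.generate_speech(
            inputs["input_ids"], speaker_embeddings, vocoder=vocoder
        )

    sf.write(out_path, speech.numpy(), samplerate=SAMPLE_RATE)
    return out_path
```

Calling `synthesize("Hello from SpeechT5.")` would download the checkpoints on first use and write `speech.wav` to the current directory.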

Documentation

speecht5_finetuned

This model is a fine-tuned version of microsoft/speecht5_tts. The training dataset is not specified (the model card lists it as None). It achieves the following results on the evaluation set:

  • Loss: 0.5239

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 25
  • mixed_precision_training: Native AMP
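
To make the relationship between these values concrete, the sketch below reproduces the effective batch size and the shape of a linear schedule with warmup in plain Python. It is an illustration, not the training code: the single-device assumption (`num_devices=1`) is inferred from 4 × 8 = 32, the total of 2200 optimizer steps is taken from the last row of the training results, and the function names are hypothetical.

```python
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 8
WARMUP_STEPS = 100
TOTAL_STEPS = 2200  # final step listed in the training results


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def linear_lr(step, peak=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at `step`: linear ramp up to `peak`, then linear decay to 0."""
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for s in (0, 50, 100, 1150, 2200):
        print(s, linear_lr(s))
```

Under these assumptions the learning rate peaks at 2e-05 at step 100 and reaches zero at step 2200.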

Training results

Training Loss | Epoch   | Step | Validation Loss
No log        | 1.1364  | 100  | 0.7136
0.826         | 2.2727  | 200  | 0.5898
0.826         | 3.4091  | 300  | 0.5733
0.6254        | 4.5455  | 400  | 0.5604
0.6254        | 5.6818  | 500  | 0.5542
0.603         | 6.8182  | 600  | 0.5490
0.603         | 7.9545  | 700  | 0.5450
0.5924        | 9.0909  | 800  | 0.5432
0.5924        | 10.2273 | 900  | 0.5403
0.5841        | 11.3636 | 1000 | 0.5378
0.5841        | 12.5    | 1100 | 0.5336
0.578         | 13.6364 | 1200 | 0.5357
0.578         | 14.7727 | 1300 | 0.5321
0.5724        | 15.9091 | 1400 | 0.5293
0.5724        | 17.0455 | 1500 | 0.5287
0.5704        | 18.1818 | 1600 | 0.5272
0.5704        | 19.3182 | 1700 | 0.5281
0.5653        | 20.4545 | 1800 | 0.5239
0.5653        | 21.5909 | 1900 | 0.5276
0.5623        | 22.7273 | 2000 | 0.5260
0.5623        | 23.8636 | 2100 | 0.5233
0.5628        | 25.0    | 2200 | 0.5239
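
The evaluation log above can be read programmatically. The sketch below copies the (epoch, step, validation loss) values from the results and derives the lowest-loss evaluation point and the number of optimizer steps per epoch. The helper names are hypothetical, and the dataset-size figure is only a rough estimate that assumes every optimizer step consumed a full batch of 32 examples.

```python
# (epoch, step, validation_loss) rows copied from the training results.
EVAL_LOG = [
    (1.1364, 100, 0.7136), (2.2727, 200, 0.5898), (3.4091, 300, 0.5733),
    (4.5455, 400, 0.5604), (5.6818, 500, 0.5542), (6.8182, 600, 0.5490),
    (7.9545, 700, 0.5450), (9.0909, 800, 0.5432), (10.2273, 900, 0.5403),
    (11.3636, 1000, 0.5378), (12.5, 1100, 0.5336), (13.6364, 1200, 0.5357),
    (14.7727, 1300, 0.5321), (15.9091, 1400, 0.5293), (17.0455, 1500, 0.5287),
    (18.1818, 1600, 0.5272), (19.3182, 1700, 0.5281), (20.4545, 1800, 0.5239),
    (21.5909, 1900, 0.5276), (22.7273, 2000, 0.5260), (23.8636, 2100, 0.5233),
    (25.0, 2200, 0.5239),
]
TOTAL_TRAIN_BATCH_SIZE = 32


def best_checkpoint(log):
    """Return (step, validation_loss) for the row with the lowest loss."""
    _, step, loss = min(log, key=lambda row: row[2])
    return step, loss


def steps_per_epoch(log):
    """Estimate optimizer steps per epoch from the final row."""
    epoch, step, _ = log[-1]
    return round(step / epoch)


best_step, best_loss = best_checkpoint(EVAL_LOG)
per_epoch = steps_per_epoch(EVAL_LOG)
# Rough estimate only: assumes each step used a full batch of 32 examples.
approx_examples = per_epoch * TOTAL_TRAIN_BATCH_SIZE
```

On these numbers the lowest validation loss is 0.5233 at step 2100, slightly below the 0.5239 reported for the final step, and the run works out to 88 optimizer steps per epoch.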

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.7.1+cu118
  • Datasets 3.6.0
  • Tokenizers 0.21.1
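
When trying to reproduce the run, it can help to compare a local environment against the versions listed above. The following is a small sketch using only the standard library; the `compare_versions` helper is a hypothetical name, and `torch` is used as the installed distribution name for PyTorch.

```python
from importlib import metadata

# Versions listed in the model card, keyed by distribution name.
PINNED = {
    "transformers": "4.51.3",
    "torch": "2.7.1+cu118",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}


def compare_versions(pinned=PINNED):
    """Map each package to (expected, installed or None, versions match)."""
    report = {}
    for package, expected in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (expected, installed, installed == expected)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{name}: expected {expected}, found {installed} ({status})")
```

A mismatch does not necessarily prevent the model from loading; the check only flags differences from the listed environment.
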

Capabilities & Tags

  • transformers
  • tensorboard
  • safetensors
  • speecht5
  • text-to-audio
  • generated_from_trainer
  • endpoints_compatible

Specifications

  • Category: Audio
  • Access: API & Local
  • License: Open Source
  • Pricing: Open Source