dgo-tts-training-data-speecht5-a

This model is a fine-tuned version of microsoft/speecht5_tts; the training dataset is not specified (recorded as None). It achieves the following results on the evaluation set:

Loss: 0.0521

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 3407
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 4000
training_steps: 40000
mixed_precision_training: Native AMP
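
The relationship between these values can be sketched in a few lines of Python. This is a minimal illustration rather than the training script: it assumes a single device (which is what 8 x 4 = 32 implies) and assumes the cosine schedule is the usual linear warmup followed by a half-cosine decay to zero. The names `effective_batch_size` and `lr_at` are hypothetical helpers introduced here.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
WARMUP_STEPS = 4000
TRAINING_STEPS = 40000
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4
NUM_DEVICES = 1  # assumption: not stated in the card, but consistent with 8 * 4 = 32


def effective_batch_size() -> int:
    """Examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay.

    Assumed shape only; the exact scheduler used in training is not
    described in the card beyond "cosine" with 4000 warmup steps.
    """
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TRAINING_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 2000, 4000, 22000, 40000):
        print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at 0.0001 at step 4000, passes through half that value midway through the decay (step 22000), and reaches zero at step 40000.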

Training results

Training Loss	Epoch	Step	Validation Loss
0.0864	6.4959	1000	0.0616
0.0723	12.9918	2000	0.0563
0.0655	19.4829	3000	0.0578
0.066	25.9788	4000	0.0527
0.0606	32.4698	5000	0.0539
0.0578	38.9657	6000	0.0519
0.0566	45.4568	7000	0.0531
0.0579	51.9527	8000	0.0534
0.0519	58.4437	9000	0.0521
0.0514	64.9396	10000	0.0544
0.0497	71.4307	11000	0.0578
0.0484	77.9266	12000	0.0524
0.0474	84.4176	13000	0.0526
0.0457	90.9135	14000	0.0517
0.0461	97.4046	15000	0.0523
0.0456	103.9005	16000	0.0530
0.0436	110.3915	17000	0.0517
0.042	116.8874	18000	0.0515
0.0411	123.3785	19000	0.0520
0.043	129.8744	20000	0.0514
0.0384	136.3654	21000	0.0529
0.0383	142.8613	22000	0.0516
0.0383	149.3524	23000	0.0518
0.0395	155.8483	24000	0.0520
0.038	162.3393	25000	0.0522
0.0383	168.8352	26000	0.0520
0.0363	175.3263	27000	0.0520
0.0378	181.8222	28000	0.0529
0.0373	188.3132	29000	0.0517
0.0364	194.8091	30000	0.0515
0.0362	201.3002	31000	0.0522
0.0365	207.7961	32000	0.0520
0.0339	214.2871	33000	0.0520
0.035	220.7830	34000	0.0514
0.0358	227.2741	35000	0.0522
0.0333	233.7700	36000	0.0525
0.0348	240.2610	37000	0.0524
0.0372	246.7569	38000	0.0519
0.0349	253.2480	39000	0.0521
0.0372	259.7439	40000	0.0521
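
To read the table at a glance, the sketch below scans the validation-loss column for its minimum. The values are transcribed from the table above (one evaluation every 1000 steps); the variable names are illustrative. The card does not say which checkpoint was kept, so this only shows where the lowest validation loss occurs, not which weights were published.

```python
# Validation loss at steps 1000, 2000, ..., 40000, copied from the table above.
VALIDATION_LOSS = [
    0.0616, 0.0563, 0.0578, 0.0527, 0.0539, 0.0519, 0.0531, 0.0534, 0.0521, 0.0544,
    0.0578, 0.0524, 0.0526, 0.0517, 0.0523, 0.0530, 0.0517, 0.0515, 0.0520, 0.0514,
    0.0529, 0.0516, 0.0518, 0.0520, 0.0522, 0.0520, 0.0520, 0.0529, 0.0517, 0.0515,
    0.0522, 0.0520, 0.0520, 0.0514, 0.0522, 0.0525, 0.0524, 0.0519, 0.0521, 0.0521,
]
EVAL_INTERVAL = 1000  # steps between evaluations

# Pair each loss with its step number.
by_step = {(i + 1) * EVAL_INTERVAL: loss for i, loss in enumerate(VALIDATION_LOSS)}

best_loss = min(by_step.values())
best_steps = [step for step, loss in by_step.items() if loss == best_loss]
final_step = max(by_step)
final_loss = by_step[final_step]

if __name__ == "__main__":
    print("lowest validation loss:", best_loss, "at steps", best_steps)
    print("final validation loss:", final_loss, "at step", final_step)
```

The lowest validation loss, 0.0514, occurs at steps 20000 and 34000, while the reported evaluation loss of 0.0521 matches the final step. Validation loss is essentially flat after about step 14000 even though training loss keeps falling.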

Framework versions

Transformers 4.57.1
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1
