senga-nt-asr-inferred-force-aligned-speecht5-MAT-ACT

This model is a fine-tuned version of microsoft/speecht5_tts. The training dataset was not recorded when this card was generated. It achieves the following results on the evaluation set:

Loss: 0.1760

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 3407
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 200
num_epochs: 600.0
mixed_precision_training: Native AMP
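
The derived values in the list above can be checked with a short calculation. The sketch below is not the training script. It assumes a single device (so that 8 x 4 gives the reported total of 32), assumes 33 optimizer steps per epoch (inferred from the results table, where step 1000 corresponds to epoch 30.3030), and reimplements the usual linear-warmup-then-cosine-decay rule in plain Python rather than calling the library scheduler. All function and constant names are illustrative.

```python
import math

# Values copied from the hyperparameter list.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 200
NUM_EPOCHS = 600

# Assumptions, not stated in the card.
NUM_DEVICES = 1        # assumed: one device, consistent with a total batch of 32
STEPS_PER_EPOCH = 33   # assumed: inferred from 1000 steps ~ 30.3030 epochs

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def effective_batch_size(per_device, accum_steps, devices=1):
    """Number of examples contributing to one optimizer update."""
    return per_device * accum_steps * devices


def cosine_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup from 0 to base_lr, then half-cosine decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    print("planned optimizer steps:", TOTAL_STEPS)
    for step in (0, 100, 200, 10000, 19000, TOTAL_STEPS):
        lr = cosine_lr(step, LEARNING_RATE, WARMUP_STEPS, TOTAL_STEPS)
        print(f"step {step:>5}: lr = {lr:.3e}")
```

Under these assumptions the learning rate peaks at 0.0001 at step 200 and decays toward zero by the last planned step.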

Training results

Training Loss	Epoch	Step	Validation Loss
0.1869	30.3030	1000	0.1695
0.1612	60.6061	2000	0.1583
0.1399	90.9091	3000	0.1664
0.1301	121.2121	4000	0.1640
0.1208	151.5152	5000	0.1699
0.1161	181.8182	6000	0.1746
0.1080	212.1212	7000	0.1673
0.0945	242.4242	8000	0.1804
0.1044	272.7273	9000	0.1787
0.0929	303.0303	10000	0.1756
0.0845	333.3333	11000	0.1701
0.0894	363.6364	12000	0.1739
0.0813	393.9394	13000	0.1667
0.0818	424.2424	14000	0.1740
0.0769	454.5455	15000	0.1719
0.0788	484.8485	16000	0.1780
0.0759	515.1515	17000	0.1745
0.0933	545.4545	18000	0.1754
0.0764	575.7576	19000	0.1760
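
One way to read the table is to locate the logged step with the lowest validation loss and to compare training and validation loss over time. The sketch below copies the rows above into a list and only restates what the table already reports; the helper names are illustrative and are not part of any library.

```python
# Rows copied from the table: (training loss, epoch, step, validation loss).
ROWS = [
    (0.1869, 30.3030, 1000, 0.1695),
    (0.1612, 60.6061, 2000, 0.1583),
    (0.1399, 90.9091, 3000, 0.1664),
    (0.1301, 121.2121, 4000, 0.1640),
    (0.1208, 151.5152, 5000, 0.1699),
    (0.1161, 181.8182, 6000, 0.1746),
    (0.108, 212.1212, 7000, 0.1673),
    (0.0945, 242.4242, 8000, 0.1804),
    (0.1044, 272.7273, 9000, 0.1787),
    (0.0929, 303.0303, 10000, 0.1756),
    (0.0845, 333.3333, 11000, 0.1701),
    (0.0894, 363.6364, 12000, 0.1739),
    (0.0813, 393.9394, 13000, 0.1667),
    (0.0818, 424.2424, 14000, 0.1740),
    (0.0769, 454.5455, 15000, 0.1719),
    (0.0788, 484.8485, 16000, 0.1780),
    (0.0759, 515.1515, 17000, 0.1745),
    (0.0933, 545.4545, 18000, 0.1754),
    (0.0764, 575.7576, 19000, 0.1760),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def generalization_gap(row):
    """Validation loss minus training loss for one logged step."""
    return row[3] - row[0]


def steps_per_epoch(row):
    """Optimizer steps per epoch implied by one row (step / epoch)."""
    return row[2] / row[1]


if __name__ == "__main__":
    best = best_checkpoint(ROWS)
    print(f"lowest validation loss: {best[3]:.4f} at step {best[2]}")
    print(f"implied steps per epoch: {steps_per_epoch(ROWS[0]):.2f}")
    for row in ROWS:
        print(f"step {row[2]:>5}: gap = {generalization_gap(row):+.4f}")
```

On these numbers the lowest validation loss is logged at step 2000, while the training loss keeps falling afterwards, which may indicate overfitting in later epochs. The evaluation loss reported at the top of the card (0.1760) matches the last logged row.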

Framework versions

Transformers 4.57.1
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1
