t5-summarization-one-shot-base-random

This model is a fine-tuned version of google/flan-t5-base on an unknown dataset. It achieves the following results on the evaluation set (the values match the final checkpoint, epoch 20, in the training results):

Loss: 3.0562
Rouge: {'rouge1': 44.8059, 'rouge2': 22.6975, 'rougeL': 22.1144, 'rougeLsum': 22.1144}
Bert Score: 0.8826
Bleurt 20: -0.6909
Gen Len: 14.41
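A minimal sketch of loading the checkpoint for inference with the Transformers library, assuming the repository ID veronica-girolimetti/t5-summarization-one-shot-base-random. The helper name `summarize`, the `max_new_tokens` default, and the absence of any prompt prefix are assumptions, since the card does not say how inputs were formatted during training.

```python
# Repository ID of this model on the Hugging Face Hub.
MODEL_ID = "veronica-girolimetti/t5-summarization-one-shot-base-random"


def summarize(text, max_new_tokens=32):
    """Generate a summary for `text`.

    Hypothetical helper: the input format used in training is not documented,
    so the raw text is passed to the model without any prompt prefix.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long inputs to the tokenizer's maximum length.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

The default of 32 new tokens is only a guess that leaves headroom over the reported average generation length of about 14 tokens. Calling `summarize` reloads the model each time, so real use would load it once and reuse it.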

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 2
eval_batch_size: 2
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
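
The hyperparameters above, together with the step counts logged in the training results, determine the learning-rate schedule. The sketch below works through that arithmetic. It assumes no warmup steps, a single device, and no gradient accumulation, none of which the card states. `linear_lr` is a hypothetical helper that mirrors a standard linear decay to zero.

```python
# Values reported in this card.
LEARNING_RATE = 1e-4
NUM_EPOCHS = 20
TRAIN_BATCH_SIZE = 2
STEPS_PER_EPOCH = 601  # step count logged at epoch 1.0

# 20 * 601 = 12020, which matches the last logged step.
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Upper bound on the training-set size, assuming one device and no
# gradient accumulation (the final batch of an epoch may be partial).
max_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

for step in (0, 601, 6010, 12020):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate falls to half its initial value by the end of epoch 10, and the training set holds at most 1202 examples.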

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge	Bert Score	Bleurt 20	Gen Len
2.1793	1.0	601	1.9158	{'rouge1': 43.6025, 'rouge2': 20.0332, 'rougeL': 20.768, 'rougeLsum': 20.768}	0.8742	-0.8256	15.065
1.7262	2.0	1202	1.8547	{'rouge1': 44.219, 'rouge2': 21.2046, 'rougeL': 21.6413, 'rougeLsum': 21.6413}	0.8822	-0.7138	14.25
1.5456	3.0	1803	1.8743	{'rouge1': 43.0548, 'rouge2': 20.0052, 'rougeL': 21.7232, 'rougeLsum': 21.7232}	0.8799	-0.7351	14.215
1.3864	4.0	2404	1.8856	{'rouge1': 44.0749, 'rouge2': 22.1873, 'rougeL': 22.3386, 'rougeLsum': 22.3386}	0.8824	-0.721	14.265
1.1726	5.0	3005	1.9377	{'rouge1': 42.5277, 'rouge2': 21.1762, 'rougeL': 22.0771, 'rougeLsum': 22.0771}	0.8861	-0.6948	13.84
1.0126	6.0	3606	2.0305	{'rouge1': 43.3254, 'rouge2': 21.2322, 'rougeL': 21.9138, 'rougeLsum': 21.9138}	0.8812	-0.6905	14.17
0.9277	7.0	4207	2.0788	{'rouge1': 44.4591, 'rouge2': 21.485, 'rougeL': 22.1901, 'rougeLsum': 22.1901}	0.8842	-0.6869	14.15
0.8581	8.0	4808	2.1667	{'rouge1': 43.3585, 'rouge2': 22.0956, 'rougeL': 22.6725, 'rougeLsum': 22.6725}	0.8853	-0.697	13.99
0.7611	9.0	5409	2.2544	{'rouge1': 45.7618, 'rouge2': 22.8349, 'rougeL': 22.4909, 'rougeLsum': 22.4909}	0.8826	-0.6682	14.54
0.7624	10.0	6010	2.3085	{'rouge1': 44.6569, 'rouge2': 21.8496, 'rougeL': 22.2368, 'rougeLsum': 22.2368}	0.8818	-0.684	14.42
0.5815	11.0	6611	2.4558	{'rouge1': 44.248, 'rouge2': 22.0111, 'rougeL': 22.4011, 'rougeLsum': 22.4011}	0.884	-0.6895	14.095
0.5842	12.0	7212	2.5537	{'rouge1': 44.4124, 'rouge2': 22.0939, 'rougeL': 22.2455, 'rougeLsum': 22.2455}	0.8846	-0.6832	14.355
0.5936	13.0	7813	2.5306	{'rouge1': 44.3422, 'rouge2': 22.7948, 'rougeL': 22.3682, 'rougeLsum': 22.3682}	0.8838	-0.7135	14.255
0.4445	14.0	8414	2.7685	{'rouge1': 45.4309, 'rouge2': 23.2292, 'rougeL': 23.2752, 'rougeLsum': 23.2752}	0.8826	-0.6563	14.77
0.3908	15.0	9015	2.8443	{'rouge1': 44.6809, 'rouge2': 22.1492, 'rougeL': 21.9333, 'rougeLsum': 21.9333}	0.8828	-0.6801	14.43
0.4475	16.0	9616	2.8570	{'rouge1': 45.6488, 'rouge2': 22.8303, 'rougeL': 22.3293, 'rougeLsum': 22.3293}	0.8846	-0.6545	14.6
0.3963	17.0	10217	2.8927	{'rouge1': 45.3239, 'rouge2': 22.4719, 'rougeL': 22.3093, 'rougeLsum': 22.3093}	0.8838	-0.6512	14.4
0.4013	18.0	10818	3.0375	{'rouge1': 44.663, 'rouge2': 22.4292, 'rougeL': 21.7939, 'rougeLsum': 21.7939}	0.8845	-0.6964	14.47
0.3355	19.0	11419	3.0206	{'rouge1': 45.1714, 'rouge2': 23.0105, 'rougeL': 22.1146, 'rougeLsum': 22.1146}	0.8829	-0.6828	14.435
0.385	20.0	12020	3.0562	{'rouge1': 44.8059, 'rouge2': 22.6975, 'rougeL': 22.1144, 'rougeLsum': 22.1144}	0.8826	-0.6909	14.41
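
In the table above, validation loss reaches its minimum early and then rises while training loss keeps falling. That pattern is consistent with overfitting, although the ROUGE, BERTScore, and BLEURT columns stay roughly flat. The sketch below copies the epoch, validation loss, and rougeL values from the table and finds the best epoch under each criterion. The variable names are illustrative, and the card does not say which criterion, if any, was used to choose the published checkpoint.

```python
# (epoch, validation_loss, rougeL) copied from the training results table.
RESULTS = [
    (1, 1.9158, 20.768), (2, 1.8547, 21.6413), (3, 1.8743, 21.7232),
    (4, 1.8856, 22.3386), (5, 1.9377, 22.0771), (6, 2.0305, 21.9138),
    (7, 2.0788, 22.1901), (8, 2.1667, 22.6725), (9, 2.2544, 22.4909),
    (10, 2.3085, 22.2368), (11, 2.4558, 22.4011), (12, 2.5537, 22.2455),
    (13, 2.5306, 22.3682), (14, 2.7685, 23.2752), (15, 2.8443, 21.9333),
    (16, 2.8570, 22.3293), (17, 2.8927, 22.3093), (18, 3.0375, 21.7939),
    (19, 3.0206, 22.1146), (20, 3.0562, 22.1144),
]

# Epoch with the lowest validation loss.
best_by_loss = min(RESULTS, key=lambda row: row[1])

# Epoch with the highest rougeL score.
best_by_rouge = max(RESULTS, key=lambda row: row[2])

print("lowest validation loss: epoch", best_by_loss[0], "loss", best_by_loss[1])
print("highest rougeL: epoch", best_by_rouge[0], "rougeL", best_by_rouge[2])
```

The two criteria disagree: validation loss favors epoch 2, while rougeL favors epoch 14. The metrics reported at the top of the card correspond to epoch 20.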

Framework versions

Transformers 4.35.2
Pytorch 2.1.0+cu121
Datasets 2.16.1
Tokenizers 0.15.0

Model repository: veronica-girolimetti/t5-summarization-one-shot-base-random
