t5-summarization-one-shot-base-random
This model is a fine-tuned version of google/flan-t5-base on an unknown dataset. After the final training epoch (epoch 20), it achieves the following results on the evaluation set:
- Loss: 3.0562
- Rouge: {'rouge1': 44.8059, 'rouge2': 22.6975, 'rougeL': 22.1144, 'rougeLsum': 22.1144}
- Bert Score: 0.8826
- Bleurt 20: -0.6909
- Gen Len: 14.41
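To try the checkpoint, a minimal sketch using the Transformers `pipeline` API might look like the following. The card does not document the expected input format (the "one-shot" in the model name suggests a prompt with one example, but that is not confirmed), so the plain-text input and the `max_new_tokens=32` cap (chosen loosely from the reported Gen Len of about 14) are assumptions, and `summarize` is a hypothetical helper name.

```python
# Repository id as listed on the model page.
MODEL_ID = "veronica-girolimetti/t5-summarization-one-shot-base-random"


def summarize(text, max_new_tokens=32):
    """Generate a summary with the fine-tuned checkpoint.

    Assumptions (not documented in the card): plain text is a valid input,
    and 32 new tokens is a sufficient cap given the reported Gen Len (~14).
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    # Downloads the weights on first use.
    summarizer = pipeline("summarization", model=MODEL_ID)
    result = summarizer(text, max_new_tokens=max_new_tokens)
    return result[0]["summary_text"]


if __name__ == "__main__":
    print(summarize("Replace this string with the text you want to summarize."))
```

Building the pipeline inside the function keeps the sketch short; for repeated calls it would be cheaper to build it once and reuse it.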
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
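As a rough illustration of what the linear scheduler implies, the sketch below computes the learning rate at a given optimizer step in plain Python. The total of 12020 steps is taken from the last row of the training results table; the absence of warmup is an assumption (no warmup setting is listed), and `linear_lr` is a hypothetical helper, not part of any library.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 2
NUM_EPOCHS = 20

# Final step reported in the training results table.
TOTAL_STEPS = 12020

# Assumption: no warmup, since none is listed in the card.
WARMUP_STEPS = 0


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 12020 steps over 20 epochs gives 601 steps per epoch, matching the table.
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# Assuming a single device and no gradient accumulation, this is an upper
# bound on the training set size (the last batch of an epoch may be partial).
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    for epoch in (0, 5, 10, 15, 20):
        step = epoch * steps_per_epoch
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.2e}")
```

Under these assumptions the learning rate has fallen to half its initial value by the end of epoch 10 and reaches zero at step 12020.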
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge | Bert Score | Bleurt 20 | Gen Len |
---|---|---|---|---|---|---|---|
2.1793 | 1.0 | 601 | 1.9158 | {'rouge1': 43.6025, 'rouge2': 20.0332, 'rougeL': 20.768, 'rougeLsum': 20.768} | 0.8742 | -0.8256 | 15.065 |
1.7262 | 2.0 | 1202 | 1.8547 | {'rouge1': 44.219, 'rouge2': 21.2046, 'rougeL': 21.6413, 'rougeLsum': 21.6413} | 0.8822 | -0.7138 | 14.25 |
1.5456 | 3.0 | 1803 | 1.8743 | {'rouge1': 43.0548, 'rouge2': 20.0052, 'rougeL': 21.7232, 'rougeLsum': 21.7232} | 0.8799 | -0.7351 | 14.215 |
1.3864 | 4.0 | 2404 | 1.8856 | {'rouge1': 44.0749, 'rouge2': 22.1873, 'rougeL': 22.3386, 'rougeLsum': 22.3386} | 0.8824 | -0.721 | 14.265 |
1.1726 | 5.0 | 3005 | 1.9377 | {'rouge1': 42.5277, 'rouge2': 21.1762, 'rougeL': 22.0771, 'rougeLsum': 22.0771} | 0.8861 | -0.6948 | 13.84 |
1.0126 | 6.0 | 3606 | 2.0305 | {'rouge1': 43.3254, 'rouge2': 21.2322, 'rougeL': 21.9138, 'rougeLsum': 21.9138} | 0.8812 | -0.6905 | 14.17 |
0.9277 | 7.0 | 4207 | 2.0788 | {'rouge1': 44.4591, 'rouge2': 21.485, 'rougeL': 22.1901, 'rougeLsum': 22.1901} | 0.8842 | -0.6869 | 14.15 |
0.8581 | 8.0 | 4808 | 2.1667 | {'rouge1': 43.3585, 'rouge2': 22.0956, 'rougeL': 22.6725, 'rougeLsum': 22.6725} | 0.8853 | -0.697 | 13.99 |
0.7611 | 9.0 | 5409 | 2.2544 | {'rouge1': 45.7618, 'rouge2': 22.8349, 'rougeL': 22.4909, 'rougeLsum': 22.4909} | 0.8826 | -0.6682 | 14.54 |
0.7624 | 10.0 | 6010 | 2.3085 | {'rouge1': 44.6569, 'rouge2': 21.8496, 'rougeL': 22.2368, 'rougeLsum': 22.2368} | 0.8818 | -0.684 | 14.42 |
0.5815 | 11.0 | 6611 | 2.4558 | {'rouge1': 44.248, 'rouge2': 22.0111, 'rougeL': 22.4011, 'rougeLsum': 22.4011} | 0.884 | -0.6895 | 14.095 |
0.5842 | 12.0 | 7212 | 2.5537 | {'rouge1': 44.4124, 'rouge2': 22.0939, 'rougeL': 22.2455, 'rougeLsum': 22.2455} | 0.8846 | -0.6832 | 14.355 |
0.5936 | 13.0 | 7813 | 2.5306 | {'rouge1': 44.3422, 'rouge2': 22.7948, 'rougeL': 22.3682, 'rougeLsum': 22.3682} | 0.8838 | -0.7135 | 14.255 |
0.4445 | 14.0 | 8414 | 2.7685 | {'rouge1': 45.4309, 'rouge2': 23.2292, 'rougeL': 23.2752, 'rougeLsum': 23.2752} | 0.8826 | -0.6563 | 14.77 |
0.3908 | 15.0 | 9015 | 2.8443 | {'rouge1': 44.6809, 'rouge2': 22.1492, 'rougeL': 21.9333, 'rougeLsum': 21.9333} | 0.8828 | -0.6801 | 14.43 |
0.4475 | 16.0 | 9616 | 2.8570 | {'rouge1': 45.6488, 'rouge2': 22.8303, 'rougeL': 22.3293, 'rougeLsum': 22.3293} | 0.8846 | -0.6545 | 14.6 |
0.3963 | 17.0 | 10217 | 2.8927 | {'rouge1': 45.3239, 'rouge2': 22.4719, 'rougeL': 22.3093, 'rougeLsum': 22.3093} | 0.8838 | -0.6512 | 14.4 |
0.4013 | 18.0 | 10818 | 3.0375 | {'rouge1': 44.663, 'rouge2': 22.4292, 'rougeL': 21.7939, 'rougeLsum': 21.7939} | 0.8845 | -0.6964 | 14.47 |
0.3355 | 19.0 | 11419 | 3.0206 | {'rouge1': 45.1714, 'rouge2': 23.0105, 'rougeL': 22.1146, 'rougeLsum': 22.1146} | 0.8829 | -0.6828 | 14.435 |
0.385 | 20.0 | 12020 | 3.0562 | {'rouge1': 44.8059, 'rouge2': 22.6975, 'rougeL': 22.1144, 'rougeLsum': 22.1144} | 0.8826 | -0.6909 | 14.41 |
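Because the headline metrics come from the final epoch, it can be useful to check which epoch scores best under each criterion. The following sketch copies the epoch, validation loss, rouge1, and rougeL columns from the table above and picks the best row for each; the tuple layout and variable names are illustrative only.

```python
# (epoch, validation_loss, rouge1, rougeL) copied from the results table.
RESULTS = [
    (1, 1.9158, 43.6025, 20.768),
    (2, 1.8547, 44.219, 21.6413),
    (3, 1.8743, 43.0548, 21.7232),
    (4, 1.8856, 44.0749, 22.3386),
    (5, 1.9377, 42.5277, 22.0771),
    (6, 2.0305, 43.3254, 21.9138),
    (7, 2.0788, 44.4591, 22.1901),
    (8, 2.1667, 43.3585, 22.6725),
    (9, 2.2544, 45.7618, 22.4909),
    (10, 2.3085, 44.6569, 22.2368),
    (11, 2.4558, 44.248, 22.4011),
    (12, 2.5537, 44.4124, 22.2455),
    (13, 2.5306, 44.3422, 22.3682),
    (14, 2.7685, 45.4309, 23.2752),
    (15, 2.8443, 44.6809, 21.9333),
    (16, 2.8570, 45.6488, 22.3293),
    (17, 2.8927, 45.3239, 22.3093),
    (18, 3.0375, 44.663, 21.7939),
    (19, 3.0206, 45.1714, 22.1146),
    (20, 3.0562, 44.8059, 22.1144),
]

# Lower is better for loss; higher is better for ROUGE.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])
best_by_rougeL = max(RESULTS, key=lambda row: row[3])

if __name__ == "__main__":
    print("lowest validation loss: epoch", best_by_loss[0])
    print("highest rouge1:         epoch", best_by_rouge1[0])
    print("highest rougeL:         epoch", best_by_rougeL[0])
```

Validation loss is lowest at epoch 2 and rises afterwards while training loss keeps falling, whereas the ROUGE scores peak later (rouge1 at epoch 9, rougeL at epoch 14), so the choice of checkpoint depends on which metric matters for the intended use.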
Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0
Model tree for veronica-girolimetti/t5-summarization-one-shot-base-random
Base model
google/flan-t5-base