t5-vi-instruct-hf
This model is a fine-tuned version of google/flan-t5-small on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.6943
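As a rough aid for reading this number, the sketch below converts the reported loss to perplexity. It assumes the loss is a mean per-token cross-entropy measured in nats, which is the usual convention for sequence-to-sequence training in Transformers but is not stated in this card. The variable names are illustrative.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 0.6943

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the perplexity is close to 2, meaning the model is on average about as uncertain as a uniform choice between two tokens at each position.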
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 3
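To make the scheduler settings above concrete, here is a minimal sketch of a linear schedule with warmup, written in plain Python. It assumes the learning rate rises linearly from 0 to 5e-05 over the 500 warmup steps and then decays linearly to 0 at the last step, which is how the linear schedule in Transformers is commonly described. The total of 35100 steps is taken from the final row of the training results table; the function name `linear_schedule_lr` is hypothetical.

```python
def linear_schedule_lr(step, base_lr=5e-05, warmup_steps=500, total_steps=35100):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        # Warmup phase: scale the base rate by the fraction of warmup completed.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale by the fraction of post-warmup steps still remaining.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the rate at a few points in training (values are illustrative checkpoints).
for step in (0, 250, 500, 17800, 35100):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the rate peaks at step 500 and is back to half of the peak at step 17800, midway through the decay phase.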
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
2.3267 | 0.0085 | 100 | 1.9629 |
2.0179 | 0.0171 | 200 | 1.7591 |
1.831 | 0.0256 | 300 | 1.6275 |
1.7046 | 0.0342 | 400 | 1.5287 |
1.7428 | 0.0427 | 500 | 1.4256 |
1.5482 | 0.0513 | 600 | 1.3708 |
1.6759 | 0.0598 | 700 | 1.3091 |
1.3827 | 0.0684 | 800 | 1.2703 |
1.3831 | 0.0769 | 900 | 1.2452 |
1.3291 | 0.0855 | 1000 | 1.2163 |
1.4892 | 0.0940 | 1100 | 1.1667 |
1.252 | 0.1026 | 1200 | 1.1614 |
1.2592 | 0.1111 | 1300 | 1.1390 |
1.24 | 0.1197 | 1400 | 1.1300 |
1.2874 | 0.1282 | 1500 | 1.1025 |
1.1944 | 0.1368 | 1600 | 1.1121 |
1.1848 | 0.1453 | 1700 | 1.0787 |
1.1624 | 0.1538 | 1800 | 1.0771 |
1.1606 | 0.1624 | 1900 | 1.0543 |
1.1327 | 0.1709 | 2000 | 1.0570 |
1.1515 | 0.1795 | 2100 | 1.0289 |
1.1878 | 0.1880 | 2200 | 1.0197 |
1.1357 | 0.1966 | 2300 | 1.0111 |
1.1731 | 0.2051 | 2400 | 1.0004 |
1.2001 | 0.2137 | 2500 | 0.9900 |
1.1556 | 0.2222 | 2600 | 0.9905 |
1.0853 | 0.2308 | 2700 | 0.9713 |
1.0645 | 0.2393 | 2800 | 0.9911 |
1.067 | 0.2479 | 2900 | 0.9769 |
1.0773 | 0.2564 | 3000 | 0.9609 |
1.0611 | 0.2650 | 3100 | 0.9627 |
1.0406 | 0.2735 | 3200 | 0.9590 |
1.0593 | 0.2821 | 3300 | 0.9496 |
1.0488 | 0.2906 | 3400 | 0.9362 |
1.0286 | 0.2991 | 3500 | 0.9402 |
1.0386 | 0.3077 | 3600 | 0.9375 |
1.0311 | 0.3162 | 3700 | 0.9409 |
1.0082 | 0.3248 | 3800 | 0.9225 |
1.0254 | 0.3333 | 3900 | 0.9102 |
1.0138 | 0.3419 | 4000 | 0.9152 |
1.027 | 0.3504 | 4100 | 0.9093 |
1.0049 | 0.3590 | 4200 | 0.9113 |
1.0084 | 0.3675 | 4300 | 0.8997 |
0.9844 | 0.3761 | 4400 | 0.8973 |
1.0041 | 0.3846 | 4500 | 0.8924 |
1.0071 | 0.3932 | 4600 | 0.8991 |
0.9672 | 0.4017 | 4700 | 0.8899 |
0.9882 | 0.4103 | 4800 | 0.8971 |
0.9994 | 0.4188 | 4900 | 0.8927 |
0.9825 | 0.4274 | 5000 | 0.8883 |
0.9643 | 0.4359 | 5100 | 0.8765 |
0.987 | 0.4444 | 5200 | 0.8773 |
0.977 | 0.4530 | 5300 | 0.8685 |
0.9527 | 0.4615 | 5400 | 0.8775 |
0.9825 | 0.4701 | 5500 | 0.8643 |
0.97 | 0.4786 | 5600 | 0.8657 |
0.9454 | 0.4872 | 5700 | 0.8697 |
0.946 | 0.4957 | 5800 | 0.8585 |
0.9617 | 0.5043 | 5900 | 0.8570 |
0.9469 | 0.5128 | 6000 | 0.8540 |
0.9455 | 0.5214 | 6100 | 0.8465 |
0.9516 | 0.5299 | 6200 | 0.8444 |
0.9383 | 0.5385 | 6300 | 0.8491 |
0.9211 | 0.5470 | 6400 | 0.8485 |
0.9347 | 0.5556 | 6500 | 0.8425 |
0.938 | 0.5641 | 6600 | 0.8475 |
0.9274 | 0.5726 | 6700 | 0.8405 |
0.9257 | 0.5812 | 6800 | 0.8435 |
0.9161 | 0.5897 | 6900 | 0.8314 |
0.9268 | 0.5983 | 7000 | 0.8335 |
0.9321 | 0.6068 | 7100 | 0.8310 |
0.9168 | 0.6154 | 7200 | 0.8340 |
0.9141 | 0.6239 | 7300 | 0.8271 |
0.9077 | 0.6325 | 7400 | 0.8236 |
0.9189 | 0.6410 | 7500 | 0.8210 |
0.8914 | 0.6496 | 7600 | 0.8214 |
0.9012 | 0.6581 | 7700 | 0.8241 |
0.8942 | 0.6667 | 7800 | 0.8180 |
0.8981 | 0.6752 | 7900 | 0.8211 |
0.8997 | 0.6838 | 8000 | 0.8163 |
0.909 | 0.6923 | 8100 | 0.8156 |
0.8908 | 0.7009 | 8200 | 0.8193 |
0.8938 | 0.7094 | 8300 | 0.8198 |
0.912 | 0.7179 | 8400 | 0.8087 |
0.8627 | 0.7265 | 8500 | 0.8064 |
0.8917 | 0.7350 | 8600 | 0.8048 |
0.8924 | 0.7436 | 8700 | 0.8077 |
0.9046 | 0.7521 | 8800 | 0.8110 |
0.8906 | 0.7607 | 8900 | 0.8074 |
0.8838 | 0.7692 | 9000 | 0.7980 |
0.8928 | 0.7778 | 9100 | 0.8033 |
0.8911 | 0.7863 | 9200 | 0.8047 |
0.9032 | 0.7949 | 9300 | 0.8000 |
0.8749 | 0.8034 | 9400 | 0.8011 |
0.8782 | 0.8120 | 9500 | 0.7997 |
0.8711 | 0.8205 | 9600 | 0.7930 |
0.8996 | 0.8291 | 9700 | 0.7918 |
0.8957 | 0.8376 | 9800 | 0.7973 |
0.893 | 0.8462 | 9900 | 0.7901 |
0.861 | 0.8547 | 10000 | 0.7907 |
0.8679 | 0.8632 | 10100 | 0.7889 |
0.8671 | 0.8718 | 10200 | 0.7882 |
0.8719 | 0.8803 | 10300 | 0.7876 |
0.8798 | 0.8889 | 10400 | 0.7817 |
0.8753 | 0.8974 | 10500 | 0.7832 |
0.8583 | 0.9060 | 10600 | 0.7863 |
0.8586 | 0.9145 | 10700 | 0.7856 |
0.8585 | 0.9231 | 10800 | 0.7836 |
0.8626 | 0.9316 | 10900 | 0.7839 |
0.8549 | 0.9402 | 11000 | 0.7844 |
0.8791 | 0.9487 | 11100 | 0.7825 |
0.86 | 0.9573 | 11200 | 0.7777 |
0.85 | 0.9658 | 11300 | 0.7744 |
0.8609 | 0.9744 | 11400 | 0.7832 |
0.8424 | 0.9829 | 11500 | 0.7813 |
0.8628 | 0.9915 | 11600 | 0.7712 |
0.8572 | 1.0 | 11700 | 0.7776 |
0.8584 | 1.0085 | 11800 | 0.7719 |
0.828 | 1.0171 | 11900 | 0.7729 |
0.8605 | 1.0256 | 12000 | 0.7716 |
0.8294 | 1.0342 | 12100 | 0.7733 |
0.8371 | 1.0427 | 12200 | 0.7654 |
0.8464 | 1.0513 | 12300 | 0.7690 |
0.8388 | 1.0598 | 12400 | 0.7672 |
0.8545 | 1.0684 | 12500 | 0.7696 |
0.8369 | 1.0769 | 12600 | 0.7625 |
0.853 | 1.0855 | 12700 | 0.7625 |
0.839 | 1.0940 | 12800 | 0.7638 |
0.8383 | 1.1026 | 12900 | 0.7604 |
0.8434 | 1.1111 | 13000 | 0.7689 |
0.8449 | 1.1197 | 13100 | 0.7633 |
0.8581 | 1.1282 | 13200 | 0.7584 |
0.8313 | 1.1368 | 13300 | 0.7590 |
0.8475 | 1.1453 | 13400 | 0.7600 |
0.8231 | 1.1538 | 13500 | 0.7591 |
0.8294 | 1.1624 | 13600 | 0.7565 |
0.8068 | 1.1709 | 13700 | 0.7579 |
0.8467 | 1.1795 | 13800 | 0.7592 |
0.8329 | 1.1880 | 13900 | 0.7581 |
0.8255 | 1.1966 | 14000 | 0.7539 |
0.8353 | 1.2051 | 14100 | 0.7614 |
0.8233 | 1.2137 | 14200 | 0.7548 |
0.8551 | 1.2222 | 14300 | 0.7493 |
0.8267 | 1.2308 | 14400 | 0.7533 |
0.8366 | 1.2393 | 14500 | 0.7512 |
0.8242 | 1.2479 | 14600 | 0.7453 |
0.8414 | 1.2564 | 14700 | 0.7496 |
0.8375 | 1.2650 | 14800 | 0.7490 |
0.8544 | 1.2735 | 14900 | 0.7522 |
0.8444 | 1.2821 | 15000 | 0.7476 |
0.82 | 1.2906 | 15100 | 0.7489 |
0.8174 | 1.2991 | 15200 | 0.7499 |
0.8187 | 1.3077 | 15300 | 0.7476 |
0.8277 | 1.3162 | 15400 | 0.7426 |
0.8227 | 1.3248 | 15500 | 0.7454 |
0.8222 | 1.3333 | 15600 | 0.7457 |
0.8147 | 1.3419 | 15700 | 0.7402 |
0.8165 | 1.3504 | 15800 | 0.7399 |
0.8203 | 1.3590 | 15900 | 0.7423 |
0.8223 | 1.3675 | 16000 | 0.7436 |
0.8227 | 1.3761 | 16100 | 0.7437 |
0.8149 | 1.3846 | 16200 | 0.7405 |
0.8035 | 1.3932 | 16300 | 0.7397 |
0.8217 | 1.4017 | 16400 | 0.7378 |
0.8057 | 1.4103 | 16500 | 0.7418 |
0.8091 | 1.4188 | 16600 | 0.7479 |
0.811 | 1.4274 | 16700 | 0.7361 |
0.8214 | 1.4359 | 16800 | 0.7368 |
0.8231 | 1.4444 | 16900 | 0.7370 |
0.8024 | 1.4530 | 17000 | 0.7351 |
0.8205 | 1.4615 | 17100 | 0.7391 |
0.8006 | 1.4701 | 17200 | 0.7382 |
0.8047 | 1.4786 | 17300 | 0.7344 |
0.7927 | 1.4872 | 17400 | 0.7337 |
0.8115 | 1.4957 | 17500 | 0.7330 |
0.8108 | 1.5043 | 17600 | 0.7351 |
0.7982 | 1.5128 | 17700 | 0.7364 |
0.821 | 1.5214 | 17800 | 0.7326 |
0.7849 | 1.5299 | 17900 | 0.7277 |
0.8058 | 1.5385 | 18000 | 0.7284 |
0.8154 | 1.5470 | 18100 | 0.7303 |
0.8199 | 1.5556 | 18200 | 0.7342 |
0.8146 | 1.5641 | 18300 | 0.7274 |
0.8004 | 1.5726 | 18400 | 0.7294 |
0.8061 | 1.5812 | 18500 | 0.7308 |
0.8081 | 1.5897 | 18600 | 0.7254 |
0.7907 | 1.5983 | 18700 | 0.7293 |
0.8093 | 1.6068 | 18800 | 0.7304 |
0.7959 | 1.6154 | 18900 | 0.7257 |
0.7905 | 1.6239 | 19000 | 0.7249 |
0.8009 | 1.6325 | 19100 | 0.7228 |
0.7964 | 1.6410 | 19200 | 0.7268 |
0.7851 | 1.6496 | 19300 | 0.7275 |
0.7882 | 1.6581 | 19400 | 0.7231 |
0.8029 | 1.6667 | 19500 | 0.7241 |
0.7821 | 1.6752 | 19600 | 0.7206 |
0.7779 | 1.6838 | 19700 | 0.7244 |
0.8038 | 1.6923 | 19800 | 0.7229 |
0.7965 | 1.7009 | 19900 | 0.7234 |
0.8075 | 1.7094 | 20000 | 0.7244 |
0.7901 | 1.7179 | 20100 | 0.7223 |
0.7817 | 1.7265 | 20200 | 0.7212 |
0.8065 | 1.7350 | 20300 | 0.7240 |
0.791 | 1.7436 | 20400 | 0.7190 |
0.8007 | 1.7521 | 20500 | 0.7173 |
0.7936 | 1.7607 | 20600 | 0.7195 |
0.7983 | 1.7692 | 20700 | 0.7192 |
0.802 | 1.7778 | 20800 | 0.7177 |
0.8083 | 1.7863 | 20900 | 0.7195 |
0.7753 | 1.7949 | 21000 | 0.7179 |
0.8003 | 1.8034 | 21100 | 0.7188 |
0.7912 | 1.8120 | 21200 | 0.7194 |
0.7641 | 1.8205 | 21300 | 0.7190 |
0.7941 | 1.8291 | 21400 | 0.7192 |
0.8073 | 1.8376 | 21500 | 0.7174 |
0.789 | 1.8462 | 21600 | 0.7177 |
0.7816 | 1.8547 | 21700 | 0.7156 |
0.784 | 1.8632 | 21800 | 0.7182 |
0.7758 | 1.8718 | 21900 | 0.7152 |
0.7668 | 1.8803 | 22000 | 0.7179 |
0.7873 | 1.8889 | 22100 | 0.7155 |
0.7835 | 1.8974 | 22200 | 0.7151 |
0.7857 | 1.9060 | 22300 | 0.7131 |
0.7847 | 1.9145 | 22400 | 0.7128 |
0.7866 | 1.9231 | 22500 | 0.7118 |
0.8015 | 1.9316 | 22600 | 0.7104 |
0.7802 | 1.9402 | 22700 | 0.7139 |
0.788 | 1.9487 | 22800 | 0.7113 |
0.7813 | 1.9573 | 22900 | 0.7134 |
0.768 | 1.9658 | 23000 | 0.7117 |
0.7963 | 1.9744 | 23100 | 0.7113 |
0.7598 | 1.9829 | 23200 | 0.7117 |
0.7892 | 1.9915 | 23300 | 0.7104 |
0.769 | 2.0 | 23400 | 0.7138 |
0.7943 | 2.0085 | 23500 | 0.7099 |
0.7883 | 2.0171 | 23600 | 0.7092 |
0.77 | 2.0256 | 23700 | 0.7122 |
0.7731 | 2.0342 | 23800 | 0.7102 |
0.7842 | 2.0427 | 23900 | 0.7072 |
0.7737 | 2.0513 | 24000 | 0.7097 |
0.7776 | 2.0598 | 24100 | 0.7121 |
0.763 | 2.0684 | 24200 | 0.7107 |
0.7835 | 2.0769 | 24300 | 0.7074 |
0.7834 | 2.0855 | 24400 | 0.7081 |
0.7823 | 2.0940 | 24500 | 0.7064 |
0.7809 | 2.1026 | 24600 | 0.7089 |
0.7764 | 2.1111 | 24700 | 0.7073 |
0.7756 | 2.1197 | 24800 | 0.7094 |
0.7605 | 2.1282 | 24900 | 0.7064 |
0.7649 | 2.1368 | 25000 | 0.7051 |
0.7724 | 2.1453 | 25100 | 0.7058 |
0.7506 | 2.1538 | 25200 | 0.7060 |
0.7682 | 2.1624 | 25300 | 0.7061 |
0.7797 | 2.1709 | 25400 | 0.7086 |
0.7712 | 2.1795 | 25500 | 0.7051 |
0.7538 | 2.1880 | 25600 | 0.7073 |
0.7748 | 2.1966 | 25700 | 0.7043 |
0.7687 | 2.2051 | 25800 | 0.7061 |
0.7813 | 2.2137 | 25900 | 0.7049 |
0.7553 | 2.2222 | 26000 | 0.7043 |
0.7655 | 2.2308 | 26100 | 0.7059 |
0.7795 | 2.2393 | 26200 | 0.7048 |
0.7788 | 2.2479 | 26300 | 0.7069 |
0.7999 | 2.2564 | 26400 | 0.7034 |
0.7841 | 2.2650 | 26500 | 0.7034 |
0.7777 | 2.2735 | 26600 | 0.7007 |
0.7755 | 2.2821 | 26700 | 0.7021 |
0.7761 | 2.2906 | 26800 | 0.7048 |
0.7844 | 2.2991 | 26900 | 0.7043 |
0.7724 | 2.3077 | 27000 | 0.7041 |
0.7784 | 2.3162 | 27100 | 0.7045 |
0.769 | 2.3248 | 27200 | 0.7013 |
0.7606 | 2.3333 | 27300 | 0.7028 |
0.7801 | 2.3419 | 27400 | 0.7015 |
0.7731 | 2.3504 | 27500 | 0.7023 |
0.7534 | 2.3590 | 27600 | 0.7009 |
0.7628 | 2.3675 | 27700 | 0.7006 |
0.7614 | 2.3761 | 27800 | 0.7002 |
0.7508 | 2.3846 | 27900 | 0.7017 |
0.7667 | 2.3932 | 28000 | 0.7021 |
0.7782 | 2.4017 | 28100 | 0.7016 |
0.7637 | 2.4103 | 28200 | 0.7003 |
0.7508 | 2.4188 | 28300 | 0.6993 |
0.7764 | 2.4274 | 28400 | 0.6995 |
0.7614 | 2.4359 | 28500 | 0.7035 |
0.7724 | 2.4444 | 28600 | 0.6979 |
0.7631 | 2.4530 | 28700 | 0.6992 |
0.778 | 2.4615 | 28800 | 0.6996 |
0.7612 | 2.4701 | 28900 | 0.6993 |
0.7767 | 2.4786 | 29000 | 0.6965 |
0.7543 | 2.4872 | 29100 | 0.6993 |
0.7781 | 2.4957 | 29200 | 0.6974 |
0.7747 | 2.5043 | 29300 | 0.6978 |
0.7715 | 2.5128 | 29400 | 0.6973 |
0.7719 | 2.5214 | 29500 | 0.6996 |
0.7493 | 2.5299 | 29600 | 0.6984 |
0.7718 | 2.5385 | 29700 | 0.7002 |
0.7701 | 2.5470 | 29800 | 0.6993 |
0.7747 | 2.5556 | 29900 | 0.6965 |
0.778 | 2.5641 | 30000 | 0.7004 |
0.7506 | 2.5726 | 30100 | 0.6994 |
0.7429 | 2.5812 | 30200 | 0.6987 |
0.7605 | 2.5897 | 30300 | 0.6978 |
0.7632 | 2.5983 | 30400 | 0.7002 |
0.7532 | 2.6068 | 30500 | 0.6996 |
0.7725 | 2.6154 | 30600 | 0.6970 |
0.7528 | 2.6239 | 30700 | 0.6969 |
0.7692 | 2.6325 | 30800 | 0.6967 |
0.7806 | 2.6410 | 30900 | 0.6961 |
0.7779 | 2.6496 | 31000 | 0.6964 |
0.7606 | 2.6581 | 31100 | 0.6972 |
0.7438 | 2.6667 | 31200 | 0.6958 |
0.7687 | 2.6752 | 31300 | 0.6952 |
0.7429 | 2.6838 | 31400 | 0.6964 |
0.7737 | 2.6923 | 31500 | 0.6968 |
0.7559 | 2.7009 | 31600 | 0.6953 |
0.764 | 2.7094 | 31700 | 0.6955 |
0.7669 | 2.7179 | 31800 | 0.6959 |
0.7793 | 2.7265 | 31900 | 0.6949 |
0.7676 | 2.7350 | 32000 | 0.6958 |
0.7611 | 2.7436 | 32100 | 0.6951 |
0.7945 | 2.7521 | 32200 | 0.6935 |
0.7597 | 2.7607 | 32300 | 0.6942 |
0.7653 | 2.7692 | 32400 | 0.6951 |
0.7427 | 2.7778 | 32500 | 0.6952 |
0.7886 | 2.7863 | 32600 | 0.6937 |
0.7571 | 2.7949 | 32700 | 0.6954 |
0.7546 | 2.8034 | 32800 | 0.6961 |
0.7696 | 2.8120 | 32900 | 0.6944 |
0.7579 | 2.8205 | 33000 | 0.6943 |
0.7723 | 2.8291 | 33100 | 0.6943 |
0.7667 | 2.8376 | 33200 | 0.6945 |
0.7665 | 2.8462 | 33300 | 0.6941 |
0.7537 | 2.8547 | 33400 | 0.6944 |
0.7446 | 2.8632 | 33500 | 0.6952 |
0.7619 | 2.8718 | 33600 | 0.6942 |
0.7628 | 2.8803 | 33700 | 0.6946 |
0.7692 | 2.8889 | 33800 | 0.6946 |
0.7517 | 2.8974 | 33900 | 0.6946 |
0.747 | 2.9060 | 34000 | 0.6947 |
0.7682 | 2.9145 | 34100 | 0.6945 |
0.7541 | 2.9231 | 34200 | 0.6942 |
0.7539 | 2.9316 | 34300 | 0.6947 |
0.7733 | 2.9402 | 34400 | 0.6945 |
0.77 | 2.9487 | 34500 | 0.6942 |
0.7589 | 2.9573 | 34600 | 0.6943 |
0.7552 | 2.9658 | 34700 | 0.6944 |
0.7587 | 2.9744 | 34800 | 0.6943 |
0.7687 | 2.9829 | 34900 | 0.6943 |
0.7518 | 2.9915 | 35000 | 0.6944 |
0.7695 | 3.0 | 35100 | 0.6943 |
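The Epoch and Step columns can be cross-checked with a little arithmetic. Epoch 1.0 is reached at step 11700, so the sketch below derives the approximate training set size and reproduces the Epoch value of the first logged row. It assumes a single device and no gradient accumulation, neither of which is stated in this card, so the example count should be read as an estimate. The variable names are illustrative.

```python
# Values read from the hyperparameter list and the results table.
steps_per_epoch = 11700   # step at which Epoch reaches 1.0
train_batch_size = 4
num_epochs = 3

# Assumption: one device and gradient_accumulation_steps = 1,
# so each optimizer step consumes exactly one batch.
examples_per_epoch = steps_per_epoch * train_batch_size
total_steps = steps_per_epoch * num_epochs

# Fraction of an epoch completed at the first logged step.
epoch_at_step_100 = round(100 / steps_per_epoch, 4)

print(examples_per_epoch, total_steps, epoch_at_step_100)
```

The derived total of 35100 steps matches the last row of the table, and the epoch fraction at step 100 matches the first row (0.0085).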
Framework versions
- Transformers 4.52.3
- Pytorch 2.7.0+cu128
- Datasets 3.6.0
- Tokenizers 0.21.1
Model tree for kienhoang123/t5-vi-instruct-hf
Base model
google/flan-t5-small