Se124M100KInfPrompt_NT

This model is a PEFT adapter fine-tuned from gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3899
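The reported loss is easier to interpret as a perplexity. The sketch below assumes the value is a mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card; the function name `perplexity` is illustrative.

```python
import math

# Evaluation loss as reported above.
EVAL_LOSS = 0.3899


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity.

    Assumption: the reported loss is a natural-log cross-entropy averaged
    over tokens. If it was computed differently, this number is not meaningful.
    """
    return math.exp(mean_cross_entropy)


print(f"perplexity ~ {perplexity(EVAL_LOSS):.2f}")
```

Under that assumption the evaluation perplexity comes out at roughly 1.48.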

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: OptimizerNames.ADAMW_TORCH_FUSED (fused PyTorch AdamW) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 3
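To make the hyperparameters above concrete, here is a minimal pure-Python sketch of how the total batch size and a cosine schedule with linear warmup follow from them. Several things are assumptions rather than facts from this card: a single training device, a total of 7329 optimizer steps (estimated as 3 epochs at about 2443 steps per epoch, from the step and epoch columns of the training log), warmup steps computed as the ceiling of ratio times total steps, and a half-cosine decay to zero. The function names are illustrative.

```python
import math

# Values taken from the hyperparameter list.
BASE_LR = 2e-05
WARMUP_RATIO = 0.03
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2

# Assumption: estimated from the training log (about 2443 steps per epoch, 3 epochs).
TOTAL_STEPS = 7329


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Samples consumed per optimizer step (assumes num_devices devices)."""
    return per_device * accum_steps * num_devices


def cosine_lr(step: int,
              total_steps: int = TOTAL_STEPS,
              base_lr: float = BASE_LR,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at a given optimizer step.

    Linear warmup from 0 to base_lr, then a half-cosine decay to 0.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))  # matches total_train_batch_size
for s in (0, 110, 220, 3665, 7329):
    print(s, f"{cosine_lr(s):.2e}")
```

With these assumptions the warmup lasts about 220 steps, after which the learning rate decays smoothly toward zero over the remaining steps.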

Training results

Training Loss Epoch Step Validation Loss
2.9983 0.0082 20 2.6302
2.9256 0.0164 40 2.6331
2.9534 0.0246 60 2.6305
2.9277 0.0327 80 2.6052
2.8694 0.0409 100 2.5836
2.879 0.0491 120 2.5278
2.7972 0.0573 140 2.4722
2.7112 0.0655 160 2.4048
2.5739 0.0737 180 2.3244
2.4522 0.0819 200 2.2167
2.3121 0.0901 220 2.0842
2.1652 0.0982 240 1.9278
2.0135 0.1064 260 1.7658
1.8352 0.1146 280 1.5877
1.6331 0.1228 300 1.3988
1.4721 0.1310 320 1.2257
1.3347 0.1392 340 1.0901
1.202 0.1474 360 0.9639
1.125 0.1555 380 0.8691
1.002 0.1637 400 0.8003
0.9698 0.1719 420 0.7525
0.8963 0.1801 440 0.7148
0.8571 0.1883 460 0.6803
0.7983 0.1965 480 0.6542
0.7838 0.2047 500 0.6332
0.7689 0.2129 520 0.6118
0.7256 0.2210 540 0.5931
0.7146 0.2292 560 0.5799
0.686 0.2374 580 0.5673
0.6729 0.2456 600 0.5565
0.6628 0.2538 620 0.5445
0.6525 0.2620 640 0.5406
0.6298 0.2702 660 0.5328
0.6345 0.2783 680 0.5237
0.6171 0.2865 700 0.5169
0.6052 0.2947 720 0.5113
0.5862 0.3029 740 0.5066
0.5767 0.3111 760 0.5021
0.5777 0.3193 780 0.4966
0.5689 0.3275 800 0.4939
0.5677 0.3357 820 0.4894
0.5567 0.3438 840 0.4878
0.5547 0.3520 860 0.4817
0.5516 0.3602 880 0.4808
0.5577 0.3684 900 0.4787
0.5461 0.3766 920 0.4740
0.5449 0.3848 940 0.4712
0.5301 0.3930 960 0.4711
0.5313 0.4011 980 0.4682
0.5278 0.4093 1000 0.4676
0.518 0.4175 1020 0.4643
0.531 0.4257 1040 0.4621
0.5302 0.4339 1060 0.4624
0.5238 0.4421 1080 0.4581
0.5179 0.4503 1100 0.4572
0.5167 0.4585 1120 0.4577
0.5181 0.4666 1140 0.4534
0.5207 0.4748 1160 0.4536
0.5037 0.4830 1180 0.4533
0.5117 0.4912 1200 0.4517
0.5066 0.4994 1220 0.4500
0.5023 0.5076 1240 0.4487
0.4903 0.5158 1260 0.4470
0.4916 0.5239 1280 0.4462
0.4908 0.5321 1300 0.4460
0.4956 0.5403 1320 0.4443
0.5059 0.5485 1340 0.4438
0.4908 0.5567 1360 0.4427
0.4978 0.5649 1380 0.4416
0.4861 0.5731 1400 0.4410
0.4865 0.5813 1420 0.4404
0.4916 0.5894 1440 0.4381
0.4832 0.5976 1460 0.4352
0.4811 0.6058 1480 0.4381
0.4779 0.6140 1500 0.4364
0.4792 0.6222 1520 0.4381
0.4755 0.6304 1540 0.4346
0.4797 0.6386 1560 0.4358
0.4769 0.6467 1580 0.4321
0.4682 0.6549 1600 0.4323
0.4797 0.6631 1620 0.4338
0.4754 0.6713 1640 0.4332
0.4687 0.6795 1660 0.4325
0.4629 0.6877 1680 0.4330
0.478 0.6959 1700 0.4312
0.4693 0.7041 1720 0.4291
0.4746 0.7122 1740 0.4305
0.4626 0.7204 1760 0.4300
0.4641 0.7286 1780 0.4317
0.4606 0.7368 1800 0.4287
0.4678 0.7450 1820 0.4278
0.4736 0.7532 1840 0.4267
0.4739 0.7614 1860 0.4270
0.4627 0.7695 1880 0.4269
0.4596 0.7777 1900 0.4247
0.4617 0.7859 1920 0.4245
0.4663 0.7941 1940 0.4238
0.4569 0.8023 1960 0.4243
0.4683 0.8105 1980 0.4229
0.4664 0.8187 2000 0.4231
0.4711 0.8269 2020 0.4203
0.4712 0.8350 2040 0.4201
0.4579 0.8432 2060 0.4186
0.4688 0.8514 2080 0.4221
0.4566 0.8596 2100 0.4222
0.4573 0.8678 2120 0.4179
0.4606 0.8760 2140 0.4183
0.456 0.8842 2160 0.4189
0.4684 0.8923 2180 0.4180
0.4522 0.9005 2200 0.4183
0.4591 0.9087 2220 0.4171
0.457 0.9169 2240 0.4194
0.4714 0.9251 2260 0.4160
0.4637 0.9333 2280 0.4173
0.4454 0.9415 2300 0.4190
0.4579 0.9497 2320 0.4133
0.4567 0.9578 2340 0.4153
0.4479 0.9660 2360 0.4152
0.4523 0.9742 2380 0.4138
0.4559 0.9824 2400 0.4147
0.4493 0.9906 2420 0.4131
0.4568 0.9988 2440 0.4145
0.4494 1.0070 2460 0.4120
0.4549 1.0151 2480 0.4120
0.4491 1.0233 2500 0.4130
0.454 1.0315 2520 0.4143
0.4474 1.0397 2540 0.4134
0.4541 1.0479 2560 0.4134
0.4458 1.0561 2580 0.4117
0.4469 1.0643 2600 0.4108
0.4502 1.0725 2620 0.4120
0.4447 1.0806 2640 0.4102
0.445 1.0888 2660 0.4107
0.4496 1.0970 2680 0.4080
0.445 1.1052 2700 0.4097
0.4549 1.1134 2720 0.4071
0.4476 1.1216 2740 0.4095
0.4427 1.1298 2760 0.4111
0.4412 1.1379 2780 0.4091
0.441 1.1461 2800 0.4111
0.4465 1.1543 2820 0.4080
0.4427 1.1625 2840 0.4076
0.4417 1.1707 2860 0.4080
0.4409 1.1789 2880 0.4080
0.4573 1.1871 2900 0.4078
0.443 1.1953 2920 0.4067
0.4412 1.2034 2940 0.4079
0.4384 1.2116 2960 0.4079
0.4426 1.2198 2980 0.4083
0.4407 1.2280 3000 0.4056
0.4487 1.2362 3020 0.4059
0.4421 1.2444 3040 0.4064
0.4412 1.2526 3060 0.4057
0.4354 1.2607 3080 0.4073
0.4454 1.2689 3100 0.4056
0.4376 1.2771 3120 0.4064
0.4469 1.2853 3140 0.4043
0.4437 1.2935 3160 0.4038
0.4412 1.3017 3180 0.4031
0.4354 1.3099 3200 0.4053
0.4413 1.3181 3220 0.4050
0.4344 1.3262 3240 0.4048
0.4471 1.3344 3260 0.4022
0.4347 1.3426 3280 0.4049
0.4367 1.3508 3300 0.4019
0.4391 1.3590 3320 0.4033
0.4424 1.3672 3340 0.4019
0.4391 1.3754 3360 0.4009
0.4377 1.3835 3380 0.4014
0.4413 1.3917 3400 0.4015
0.4382 1.3999 3420 0.4006
0.4298 1.4081 3440 0.4015
0.4503 1.4163 3460 0.4019
0.4413 1.4245 3480 0.4015
0.4343 1.4327 3500 0.3996
0.4373 1.4409 3520 0.4002
0.4338 1.4490 3540 0.4016
0.4292 1.4572 3560 0.4000
0.4444 1.4654 3580 0.4004
0.4342 1.4736 3600 0.3996
0.4339 1.4818 3620 0.4004
0.4291 1.4900 3640 0.4006
0.435 1.4982 3660 0.3993
0.445 1.5063 3680 0.3999
0.4389 1.5145 3700 0.4009
0.4316 1.5227 3720 0.3988
0.4363 1.5309 3740 0.3994
0.4384 1.5391 3760 0.3995
0.4355 1.5473 3780 0.4006
0.436 1.5555 3800 0.3983
0.4384 1.5637 3820 0.3981
0.4394 1.5718 3840 0.3985
0.4392 1.5800 3860 0.3978
0.4456 1.5882 3880 0.3991
0.4359 1.5964 3900 0.3984
0.4328 1.6046 3920 0.4004
0.4272 1.6128 3940 0.3992
0.4352 1.6210 3960 0.3993
0.4262 1.6291 3980 0.3994
0.4406 1.6373 4000 0.3979
0.4291 1.6455 4020 0.3991
0.4262 1.6537 4040 0.3975
0.4337 1.6619 4060 0.3978
0.4404 1.6701 4080 0.3964
0.4408 1.6783 4100 0.3983
0.4378 1.6865 4120 0.3977
0.4322 1.6946 4140 0.3973
0.4343 1.7028 4160 0.3970
0.43 1.7110 4180 0.3961
0.4343 1.7192 4200 0.3958
0.4308 1.7274 4220 0.3965
0.4355 1.7356 4240 0.3952
0.4371 1.7438 4260 0.3966
0.4342 1.7519 4280 0.3956
0.4364 1.7601 4300 0.3962
0.434 1.7683 4320 0.3953
0.4335 1.7765 4340 0.3965
0.4317 1.7847 4360 0.3953
0.4298 1.7929 4380 0.3954
0.4307 1.8011 4400 0.3942
0.4345 1.8093 4420 0.3952
0.433 1.8174 4440 0.3943
0.4261 1.8256 4460 0.3955
0.4338 1.8338 4480 0.3950
0.4263 1.8420 4500 0.3944
0.4263 1.8502 4520 0.3939
0.436 1.8584 4540 0.3943
0.432 1.8666 4560 0.3946
0.4302 1.8747 4580 0.3942
0.4333 1.8829 4600 0.3936
0.4316 1.8911 4620 0.3936
0.4294 1.8993 4640 0.3938
0.4265 1.9075 4660 0.3936
0.4294 1.9157 4680 0.3943
0.4319 1.9239 4700 0.3942
0.4391 1.9321 4720 0.3933
0.4243 1.9402 4740 0.3944
0.4325 1.9484 4760 0.3930
0.4343 1.9566 4780 0.3924
0.4287 1.9648 4800 0.3938
0.4322 1.9730 4820 0.3933
0.4283 1.9812 4840 0.3926
0.4309 1.9894 4860 0.3935
0.4238 1.9975 4880 0.3922
0.4217 2.0057 4900 0.3925
0.425 2.0139 4920 0.3926
0.4389 2.0221 4940 0.3925
0.4346 2.0303 4960 0.3920
0.4254 2.0385 4980 0.3931
0.4223 2.0467 5000 0.3919
0.4268 2.0549 5020 0.3930
0.4228 2.0630 5040 0.3929
0.4325 2.0712 5060 0.3928
0.4255 2.0794 5080 0.3928
0.4305 2.0876 5100 0.3922
0.4333 2.0958 5120 0.3919
0.4332 2.1040 5140 0.3927
0.4261 2.1122 5160 0.3929
0.429 2.1203 5180 0.3916
0.4274 2.1285 5200 0.3921
0.4277 2.1367 5220 0.3928
0.4356 2.1449 5240 0.3913
0.4268 2.1531 5260 0.3921
0.4297 2.1613 5280 0.3921
0.4272 2.1695 5300 0.3915
0.4337 2.1777 5320 0.3922
0.4312 2.1858 5340 0.3911
0.426 2.1940 5360 0.3917
0.4305 2.2022 5380 0.3925
0.4373 2.2104 5400 0.3919
0.4319 2.2186 5420 0.3914
0.43 2.2268 5440 0.3921
0.4307 2.2350 5460 0.3910
0.4352 2.2431 5480 0.3912
0.4323 2.2513 5500 0.3907
0.4255 2.2595 5520 0.3905
0.4286 2.2677 5540 0.3913
0.4271 2.2759 5560 0.3916
0.4319 2.2841 5580 0.3915
0.4175 2.2923 5600 0.3911
0.424 2.3005 5620 0.3914
0.4365 2.3086 5640 0.3907
0.4322 2.3168 5660 0.3906
0.4227 2.3250 5680 0.3910
0.4308 2.3332 5700 0.3909
0.4268 2.3414 5720 0.3910
0.4352 2.3496 5740 0.3911
0.4274 2.3578 5760 0.3898
0.4255 2.3659 5780 0.3901
0.4277 2.3741 5800 0.3903
0.4209 2.3823 5820 0.3905
0.4221 2.3905 5840 0.3911
0.4247 2.3987 5860 0.3911
0.4263 2.4069 5880 0.3910
0.4284 2.4151 5900 0.3912
0.4251 2.4233 5920 0.3910
0.4275 2.4314 5940 0.3908
0.4271 2.4396 5960 0.3904
0.4333 2.4478 5980 0.3904
0.4237 2.4560 6000 0.3903
0.4351 2.4642 6020 0.3903
0.4313 2.4724 6040 0.3902
0.4243 2.4806 6060 0.3910
0.4289 2.4887 6080 0.3907
0.4299 2.4969 6100 0.3909
0.428 2.5051 6120 0.3903
0.4202 2.5133 6140 0.3902
0.4291 2.5215 6160 0.3899
0.4344 2.5297 6180 0.3899
0.4256 2.5379 6200 0.3902
0.4227 2.5460 6220 0.3904
0.43 2.5542 6240 0.3907
0.4252 2.5624 6260 0.3900
0.4224 2.5706 6280 0.3909
0.4207 2.5788 6300 0.3909
0.4265 2.5870 6320 0.3906
0.4341 2.5952 6340 0.3907
0.4228 2.6034 6360 0.3903
0.4196 2.6115 6380 0.3904
0.4216 2.6197 6400 0.3897
0.4339 2.6279 6420 0.3904
0.4255 2.6361 6440 0.3903
0.4261 2.6443 6460 0.3905
0.43 2.6525 6480 0.3906
0.4265 2.6607 6500 0.3907
0.4279 2.6688 6520 0.3904
0.4298 2.6770 6540 0.3901
0.4312 2.6852 6560 0.3901
0.4199 2.6934 6580 0.3898
0.4288 2.7016 6600 0.3902
0.4325 2.7098 6620 0.3905
0.4246 2.7180 6640 0.3903
0.4281 2.7262 6660 0.3899
0.4296 2.7343 6680 0.3903
0.4247 2.7425 6700 0.3898
0.4252 2.7507 6720 0.3905
0.4255 2.7589 6740 0.3904
0.4282 2.7671 6760 0.3902
0.4225 2.7753 6780 0.3900
0.4251 2.7835 6800 0.3900
0.4201 2.7916 6820 0.3903
0.4252 2.7998 6840 0.3905
0.427 2.8080 6860 0.3907
0.428 2.8162 6880 0.3907
0.437 2.8244 6900 0.3900
0.4257 2.8326 6920 0.3901
0.4239 2.8408 6940 0.3905
0.4276 2.8490 6960 0.3902
0.4274 2.8571 6980 0.3897
0.4327 2.8653 7000 0.3902
0.4313 2.8735 7020 0.3896
0.4277 2.8817 7040 0.3904
0.4289 2.8899 7060 0.3904
0.4321 2.8981 7080 0.3900
0.4232 2.9063 7100 0.3902
0.4274 2.9144 7120 0.3901
0.4339 2.9226 7140 0.3901
0.4226 2.9308 7160 0.3904
0.4184 2.9390 7180 0.3902
0.4242 2.9472 7200 0.3901
0.4259 2.9554 7220 0.3902
0.4297 2.9636 7240 0.3897
0.4268 2.9718 7260 0.3900
0.4281 2.9799 7280 0.3900
0.4234 2.9881 7300 0.3901
0.4196 2.9963 7320 0.3900
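The log above is whitespace-separated with four numeric columns per row, so it can be parsed with a few lines of Python. The sketch below is illustrative only: it embeds a six-row excerpt copied from the table, and the names `LogRow` and `parse_rows` are hypothetical helpers, not part of any library.

```python
from typing import List, NamedTuple


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


# A few rows copied verbatim from the table above (not the full log).
EXCERPT = """\
Training Loss Epoch Step Validation Loss
2.9983 0.0082 20 2.6302
1.002 0.1637 400 0.8003
0.4568 0.9988 2440 0.4145
0.4238 1.9975 4880 0.3922
0.4313 2.8735 7020 0.3896
0.4196 2.9963 7320 0.3900
"""


def parse_rows(text: str) -> List[LogRow]:
    """Parse 'train_loss epoch step val_loss' rows, skipping anything else."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # header or blank line
        train_loss, epoch, step, val_loss = parts
        rows.append(LogRow(float(train_loss), float(epoch), int(step), float(val_loss)))
    return rows


rows = parse_rows(EXCERPT)
best = min(rows, key=lambda r: r.val_loss)
print(f"lowest validation loss in excerpt: {best.val_loss} at step {best.step}")
```

Pasting the full table in place of `EXCERPT` gives the same kind of summary for the whole run, and the parsed rows can be passed to any plotting tool to inspect the loss curves.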

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu118
  • Datasets 3.5.0
  • Tokenizers 0.21.1
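
Because the card lists PEFT among the framework versions and the model tree marks this repository as an adapter, a plausible way to use it is to load the gpt2 base model and attach the adapter on top. The following is a sketch under those assumptions, not a documented usage recipe: it assumes the adapter was trained against the stock gpt2 checkpoint with the unmodified gpt2 tokenizer, and the helper name `load_adapter` is hypothetical. The libraries are imported inside the function so that defining it does not require them to be installed.

```python
def load_adapter(adapter_id: str = "augustocsc/Se124M100KInfPrompt_NT",
                 base_id: str = "gpt2"):
    """Load the base model and attach the PEFT adapter (downloads weights).

    Assumptions: the adapter targets the stock `gpt2` checkpoint and uses
    the base tokenizer without added tokens.
    """
    # Lazy imports: only needed when the function is actually called.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode (disables dropout)
    return model, tokenizer
```

After `model, tokenizer = load_adapter()`, text can be generated with `model.generate(**tokenizer(prompt, return_tensors="pt"), max_new_tokens=32)`. The prompt format the adapter expects is not documented in this card, so any prompt used here is a guess.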

Model tree for augustocsc/Se124M100KInfPrompt_NT

  • Adapter: this model