smangrul committed · Commit 0c3cfa2 · 1 parent: 601c5e3

End of training

Files changed (1): README.md (+22 -26)

README.md CHANGED
@@ -6,7 +6,6 @@ tags:
 model-index:
 - name: peft-lora-starcoder15B-v2-personal-copilot-A100-40GB-colab
   results: []
-library_name: peft
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -16,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [bigcode/starcoder](https://huggingface.co/bigcode/starcoder) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3700
+- Loss: 0.3096
 
 ## Model description
 
@@ -32,18 +31,6 @@ More information needed
 
 ## Training procedure
 
-
-The following `bitsandbytes` quantization config was used during training:
-- quant_method: bitsandbytes
-- load_in_8bit: False
-- load_in_4bit: True
-- llm_int8_threshold: 6.0
-- llm_int8_skip_modules: None
-- llm_int8_enable_fp32_cpu_offload: False
-- llm_int8_has_fp16_weight: False
-- bnb_4bit_quant_type: nf4
-- bnb_4bit_use_double_quant: True
-- bnb_4bit_compute_dtype: bfloat16
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
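For reference, the `bitsandbytes` quantization config removed from the card in the hunk above maps one-to-one onto a `transformers` `BitsAndBytesConfig`. The sketch below is a minimal reconstruction from those bullets; arguments not listed keep their library defaults (e.g. `load_in_8bit=False`, `llm_int8_threshold=6.0`), and the base-model id comes from the card:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization with nested (double) quantization and bfloat16
# compute, matching the config bullets that used to appear on this card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "bigcode/starcoder",
    quantization_config=bnb_config,
    device_map="auto",
)
```

Loading the 15B base model in 4-bit NF4 with bfloat16 compute is what makes QLoRA-style LoRA fine-tuning plausible on the single A100 40GB referenced in the model name.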
@@ -56,27 +43,36 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 30
-- training_steps: 1000
+- training_steps: 2000
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.6438 | 0.1 | 100 | 0.5596 |
-| 0.5992 | 0.2 | 200 | 0.4886 |
-| 0.6327 | 0.3 | 300 | 0.4319 |
-| 0.5269 | 0.4 | 400 | 0.4077 |
-| 0.4533 | 0.5 | 500 | 0.3990 |
-| 0.4755 | 0.6 | 600 | 0.3852 |
-| 0.4445 | 0.7 | 700 | 0.3735 |
-| 0.5179 | 0.8 | 800 | 0.3721 |
-| 0.4185 | 0.9 | 900 | 0.3706 |
-| 0.4807 | 1.0 | 1000 | 0.3700 |
+| 0.6439 | 0.05 | 100 | 0.5595 |
+| 0.6009 | 0.1 | 200 | 0.4901 |
+| 0.6335 | 0.15 | 300 | 0.4320 |
+| 0.5266 | 0.2 | 400 | 0.4082 |
+| 0.4543 | 0.25 | 500 | 0.4012 |
+| 0.4808 | 0.3 | 600 | 0.3911 |
+| 0.461 | 0.35 | 700 | 0.4364 |
+| 0.5246 | 0.4 | 800 | 0.3720 |
+| 0.408 | 0.45 | 900 | 0.3655 |
+| 0.469 | 0.5 | 1000 | 0.3504 |
+| 0.4257 | 0.55 | 1100 | 0.3396 |
+| 0.4229 | 0.6 | 1200 | 0.3195 |
+| 0.3267 | 0.65 | 1300 | 0.3147 |
+| 0.4682 | 0.7 | 1400 | 0.3110 |
+| 0.3244 | 0.75 | 1500 | 0.3091 |
+| 0.6782 | 0.8 | 1600 | 0.3085 |
+| 0.3123 | 0.85 | 1700 | 0.3084 |
+| 0.3545 | 0.9 | 1800 | 0.3094 |
+| 0.2818 | 0.95 | 1900 | 0.3095 |
+| 0.397 | 1.0 | 2000 | 0.3096 |
 
 
 ### Framework versions
 
-- PEFT 0.5.0.dev0
 - Transformers 4.32.0.dev0
 - Pytorch 2.0.1+cu118
 - Datasets 2.14.4
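The optimizer and schedule bullets in the final hunk can be reproduced with stock `torch` and `transformers` helpers. A minimal sketch follows; the learning rate is not visible in this diff, so `lr` is a hypothetical placeholder, while the betas, epsilon, warmup steps, and total steps are taken from the card:

```python
import torch
from transformers import get_cosine_schedule_with_warmup

model = torch.nn.Linear(8, 8)  # stand-in module; any nn.Module works here
lr = 2e-4  # placeholder: the actual learning rate is elided from the diff

# Adam with betas=(0.9, 0.999) and epsilon=1e-08, as listed on the card
optimizer = torch.optim.Adam(model.parameters(), lr=lr, betas=(0.9, 0.999), eps=1e-8)

# Cosine decay with 30 warmup steps over the 2000 training steps of this run
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=30,
    num_training_steps=2000,
)
```

Note that the results table pairs step 2000 with epoch 1.0, so this commit's run covers one full epoch in 2000 steps rather than the previous 1000.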
 
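Since the model id names a PEFT LoRA adapter rather than a fully fine-tuned checkpoint, inference requires attaching the adapter to the `bigcode/starcoder` base. A minimal sketch, assuming the adapter lives at a repo id inferred from the committer and model name (adjust it to the actual Hub path):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")
base = AutoModelForCausalLM.from_pretrained("bigcode/starcoder", device_map="auto")

# Hypothetical adapter repo id; replace with the actual repository.
model = PeftModel.from_pretrained(
    base,
    "smangrul/peft-lora-starcoder15B-v2-personal-copilot-A100-40GB-colab",
)
```

After loading, `model.merge_and_unload()` can fold the LoRA weights into the base model for adapter-free deployment.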