hf (pretrained=../,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: 5, batch_size: 16
|                 Tasks                 |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|---------------------------------------|------:|------|-----:|------|---|-----:|---|-----:|
|mmlu                                   |      2|none  |      |acc   ||0.6634|±  |0.0038|
| - humanities                          |      2|none  |      |acc   ||0.6077|±  |0.0068|
|  - formal_logic                       |      1|none  |     5|acc   ||0.5238|±  |0.0447|
|  - high_school_european_history       |      1|none  |     5|acc   ||0.7455|±  |0.0340|
|  - high_school_us_history             |      1|none  |     5|acc   ||0.8480|±  |0.0252|
|  - high_school_world_history          |      1|none  |     5|acc   ||0.8312|±  |0.0244|
|  - international_law                  |      1|none  |     5|acc   ||0.7769|±  |0.0380|
|  - jurisprudence                      |      1|none  |     5|acc   ||0.7870|±  |0.0396|
|  - logical_fallacies                  |      1|none  |     5|acc   ||0.7546|±  |0.0338|
|  - moral_disputes                     |      1|none  |     5|acc   ||0.7457|±  |0.0234|
|  - moral_scenarios                    |      1|none  |     5|acc   ||0.4458|±  |0.0166|
|  - philosophy                         |      1|none  |     5|acc   ||0.7460|±  |0.0247|
|  - prehistory                         |      1|none  |     5|acc   ||0.7407|±  |0.0244|
|  - professional_law                   |      1|none  |     5|acc   ||0.4752|±  |0.0128|
|  - world_religions                    |      1|none  |     5|acc   ||0.8187|±  |0.0295|
| - other                               |      2|none  |      |acc   ||0.7441|±  |0.0076|
|  - business_ethics                    |      1|none  |     5|acc   ||0.6800|±  |0.0469|
|  - clinical_knowledge                 |      1|none  |     5|acc   ||0.7509|±  |0.0266|
|  - college_medicine                   |      1|none  |     5|acc   ||0.6647|±  |0.0360|
|  - global_facts                       |      1|none  |     5|acc   ||0.4700|±  |0.0502|
|  - human_aging                        |      1|none  |     5|acc   ||0.7265|±  |0.0299|
|  - management                         |      1|none  |     5|acc   ||0.8155|±  |0.0384|
|  - marketing                          |      1|none  |     5|acc   ||0.8974|±  |0.0199|
|  - medical_genetics                   |      1|none  |     5|acc   ||0.8300|±  |0.0378|
|  - miscellaneous                      |      1|none  |     5|acc   ||0.8519|±  |0.0127|
|  - nutrition                          |      1|none  |     5|acc   ||0.7582|±  |0.0245|
|  - professional_accounting            |      1|none  |     5|acc   ||0.5426|±  |0.0297|
|  - professional_medicine              |      1|none  |     5|acc   ||0.7426|±  |0.0266|
|  - virology                           |      1|none  |     5|acc   ||0.5422|±  |0.0388|
| - social sciences                     |      2|none  |      |acc   ||0.7595|±  |0.0075|
|  - econometrics                       |      1|none  |     5|acc   ||0.5263|±  |0.0470|
|  - high_school_geography              |      1|none  |     5|acc   ||0.8283|±  |0.0269|
|  - high_school_government_and_politics|      1|none  |     5|acc   ||0.9171|±  |0.0199|
|  - high_school_macroeconomics         |      1|none  |     5|acc   ||0.6615|±  |0.0240|
|  - high_school_microeconomics         |      1|none  |     5|acc   ||0.7395|±  |0.0285|
|  - high_school_psychology             |      1|none  |     5|acc   ||0.8587|±  |0.0149|
|  - human_sexuality                    |      1|none  |     5|acc   ||0.7786|±  |0.0364|
|  - professional_psychology            |      1|none  |     5|acc   ||0.7026|±  |0.0185|
|  - public_relations                   |      1|none  |     5|acc   ||0.6545|±  |0.0455|
|  - security_studies                   |      1|none  |     5|acc   ||0.7143|±  |0.0289|
|  - sociology                          |      1|none  |     5|acc   ||0.8507|±  |0.0252|
|  - us_foreign_policy                  |      1|none  |     5|acc   ||0.8400|±  |0.0368|
| - stem                                |      2|none  |      |acc   ||0.5734|±  |0.0084|
|  - abstract_algebra                   |      1|none  |     5|acc   ||0.3700|±  |0.0485|
|  - anatomy                            |      1|none  |     5|acc   ||0.7037|±  |0.0394|
|  - astronomy                          |      1|none  |     5|acc   ||0.7566|±  |0.0349|
|  - college_biology                    |      1|none  |     5|acc   ||0.8264|±  |0.0317|
|  - college_chemistry                  |      1|none  |     5|acc   ||0.4800|±  |0.0502|
|  - college_computer_science           |      1|none  |     5|acc   ||0.5300|±  |0.0502|
|  - college_mathematics                |      1|none  |     5|acc   ||0.3500|±  |0.0479|
|  - college_physics                    |      1|none  |     5|acc   ||0.4902|±  |0.0497|
|  - computer_security                  |      1|none  |     5|acc   ||0.7600|±  |0.0429|
|  - conceptual_physics                 |      1|none  |     5|acc   ||0.5957|±  |0.0321|
|  - electrical_engineering             |      1|none  |     5|acc   ||0.6069|±  |0.0407|
|  - elementary_mathematics             |      1|none  |     5|acc   ||0.4656|±  |0.0257|
|  - high_school_biology                |      1|none  |     5|acc   ||0.8032|±  |0.0226|
|  - high_school_chemistry              |      1|none  |     5|acc   ||0.5567|±  |0.0350|
|  - high_school_computer_science       |      1|none  |     5|acc   ||0.6900|±  |0.0465|
|  - high_school_mathematics            |      1|none  |     5|acc   ||0.3704|±  |0.0294|
|  - high_school_physics                |      1|none  |     5|acc   ||0.4702|±  |0.0408|
|  - high_school_statistics             |      1|none  |     5|acc   ||0.5694|±  |0.0338|
|  - machine_learning                   |      1|none  |     5|acc   ||0.4554|±  |0.0473|

|      Groups      |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu              |      2|none  |      |acc   ||0.6634|±  |0.0038|
| - humanities     |      2|none  |      |acc   ||0.6077|±  |0.0068|
| - other          |      2|none  |      |acc   ||0.7441|±  |0.0076|
| - social sciences|      2|none  |      |acc   ||0.7595|±  |0.0075|
| - stem           |      2|none  |      |acc   ||0.5734|±  |0.0084|
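
The Stderr column can be read as a rough uncertainty band around each accuracy. The sketch below is a minimal illustration, not the harness's own code: it assumes accuracy is a mean of independent 0/1 outcomes, uses the sample standard error (n - 1 in the denominator), and assumes the `abstract_algebra` test split has 100 questions, a count that does not appear in the tables above. The helper names `binomial_stderr` and `confidence_interval` are hypothetical.

```python
import math


def binomial_stderr(acc: float, n: int) -> float:
    """Sample standard error of the mean of n 0/1 outcomes with mean acc."""
    return math.sqrt(acc * (1.0 - acc) / (n - 1))


def confidence_interval(acc: float, stderr: float, z: float = 1.96) -> tuple[float, float]:
    """Approximate two-sided interval acc +/- z * stderr (z = 1.96 is roughly 95%)."""
    return acc - z * stderr, acc + z * stderr


# abstract_algebra row: acc = 0.3700. The question count (100) is an assumption.
se = binomial_stderr(0.3700, 100)
print(f"abstract_algebra stderr ~ {se:.4f}")

# Overall mmlu row: acc = 0.6634, stderr = 0.0038 (both taken from the table).
low, high = confidence_interval(0.6634, 0.0038)
print(f"mmlu approx. 95% interval: {low:.4f} to {high:.4f}")
```

Under these assumptions the recomputed `abstract_algebra` value agrees with the 0.0485 reported in the table, which suggests the per-task Stderr values are plain binomial standard errors. Subtasks with few questions therefore carry wide intervals, and small differences between them should not be over-read.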