hf (pretrained=../,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 16
|  Tasks  |Version|  Filter   |n-shot|Metric|   |Value |   |Stderr|
|---------|------:|-----------|-----:|------|---|-----:|---|-----:|
|humaneval|      1|create_test|     0|pass@1|   |0.3232|±  |0.0366|
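
As a sanity check on the row above, here is a minimal sketch of how the reported pass@1 and its standard error could be reproduced by hand. It rests on assumptions not stated in the log: that the run covered all 164 HumanEval problems, that 53 of them passed (the count that rounds to 0.3232), and that the standard error is the sample standard deviation of the per-problem 0/1 outcomes divided by the square root of the number of problems. The function name `pass_at_1_with_stderr` is hypothetical and not part of any harness.

```python
import math


def pass_at_1_with_stderr(num_passed: int, num_problems: int) -> tuple[float, float]:
    """Return (pass@1, standard error) for one-sample-per-problem results.

    Assumes each problem contributes a single 0/1 outcome, so pass@1 is
    simply the mean of those outcomes.
    """
    p = num_passed / num_problems
    # Sample variance of 0/1 outcomes with Bessel's correction (n - 1).
    sample_var = num_problems * p * (1 - p) / (num_problems - 1)
    # Standard error of the mean: sample std divided by sqrt(n).
    stderr = math.sqrt(sample_var / num_problems)
    return p, stderr


# Assumed counts: 53 of 164 problems solved (inferred, not taken from the log).
score, stderr = pass_at_1_with_stderr(num_passed=53, num_problems=164)
print(f"pass@1 = {score:.4f} ± {stderr:.4f}")  # pass@1 = 0.3232 ± 0.0366
```

Under these assumptions the numbers agree with the table to four decimal places, which suggests the interval reflects sampling noise over a benchmark of that size rather than run-to-run variation.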