distilled-qwen3-0.6b-full-mmlu / checkpoint-7

1 contributor

History: 2 commits

CarlOwOs

Upload distilled Qwen model (Full) - α=0.7, T=4.0

46ab699 verified 3 months ago

added_tokens.json

707 Bytes

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
chat_template.jinja

4.12 kB

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
config.json

726 Bytes

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
generation_config.json

117 Bytes

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
merges.txt

1.67 MB

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
model.safetensors

3.01 GB
LFS

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
model.safetensors.index.json

20.9 kB

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
optimizer.pt
Detected Pickle imports (3)
- "torch.FloatStorage",
- "collections.OrderedDict",
- "torch._utils._rebuild_tensor_v2"
4.77 GB
LFS

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
pytorch_model.bin
Detected Pickle imports (3)
- "torch._utils._rebuild_tensor_v2",
- "torch.FloatStorage",
- "collections.OrderedDict"
2.38 GB
LFS

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
rng_state.pth
Detected Pickle imports (7)
- "collections.OrderedDict",
- "numpy.dtype",
- "_codecs.encode",
- "numpy.core.multiarray._reconstruct",
- "torch._utils._rebuild_tensor_v2",
- "numpy.ndarray",
- "torch.ByteStorage"
14.2 kB
LFS

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
scaler.pt
Pickle imports
- No problematic imports detected
988 Bytes
LFS

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
scheduler.pt
Pickle imports
- No problematic imports detected
1.06 kB
LFS

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
special_tokens_map.json

616 Bytes

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
tokenizer.json

11.4 MB
LFS

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
tokenizer_config.json

5.41 kB

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
trainer_state.json

762 Bytes

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
vocab.json

2.78 MB

Upload distilled Qwen model (Full) - α=0.7, T=4.0 3 months ago
