---
license: mit
base_model:
- meta-llama/Llama-3.1-8B
---

The model is derived from Llama-3.1-8B through pruning with LLM-Streamline **(Streamlining Redundant Layers to Compress Large Language Models, ICLR 2025 Spotlight)**. The entire training process required only 1.3B tokens.

Below are the evaluation results obtained with lm-eval:

|                | arc_c | arc_e | boolq | hellaswag | openbookqa | rte  | winogrande | Avg  |
|----------------|-------|-------|-------|-----------|------------|------|------------|------|
| Llama-3.1-8B   | 50.4  | 80.3  | 81.2  | 60.2      | 34.8       | 67.9 | 73.0       | 64.0 |
| Llama-3.1-5.4B | 42.1  | 72.2  | 78.0  | 54.3      | 27.2       | 62.8 | 71.0       | 58.2 |
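
The table above can in principle be reproduced with the lm-evaluation-harness Python API. The snippet below is a minimal sketch, not the authors' exact evaluation setup: the repo id is a placeholder, and the dtype, batch size, and few-shot settings are assumptions.

```python
# Minimal sketch for re-running the benchmarks above with lm-evaluation-harness.
# "your-org/Llama-3.1-5.4B" is a placeholder; replace it with the actual
# checkpoint path or Hub repo id. Settings such as dtype and batch size are
# illustrative and may differ from those used to produce the reported numbers.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-org/Llama-3.1-5.4B,dtype=bfloat16",
    tasks=[
        "arc_challenge",
        "arc_easy",
        "boolq",
        "hellaswag",
        "openbookqa",
        "rte",
        "winogrande",
    ],
    batch_size=8,
)

# Print per-task metrics (accuracy keys vary by task).
for task, metrics in results["results"].items():
    print(task, metrics)
```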