
Eagle-3 Speculator for Llama-3.1-8B-Instruct

This is an Eagle-3 speculator checkpoint converted to the speculators format.

Model Details

  • Base Model: meta-llama/Meta-Llama-3.1-8B-Instruct
  • Speculator Type: Eagle-3
  • Draft Vocabulary Size: 32,000
  • Target Vocabulary Size: 128,256
  • Architecture: Single-layer transformer with vocabulary mapping

Key Features

  • Vocabulary Mapping: Maps between the draft (32K) and target (128K) vocabularies (see the sketch after this list)
  • Custom Attention: Modified attention layer that accepts a 2×hidden_size input
  • Fusion Layer: Fuses hidden states from 3 verifier layers (3×4096 → 4096)
  • Layer Normalization: Applied before the residual connection (HF checkpoint style)
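
To make the fusion and vocabulary-mapping steps concrete, here is a minimal PyTorch sketch. This is not the speculators implementation; the module structure and buffer name (d2t) are assumptions for illustration only.

import torch
import torch.nn as nn

# Sizes taken from this card: hidden size 4096, draft vocab 32,000,
# target vocab 128,256
HIDDEN_SIZE = 4096
DRAFT_VOCAB = 32_000

class Eagle3HeadSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Fusion layer: concatenated hidden states from 3 verifier layers
        # (3 x 4096) are projected back down to 4096
        self.fc = nn.Linear(3 * HIDDEN_SIZE, HIDDEN_SIZE, bias=False)
        # The draft head predicts over the small 32K draft vocabulary
        self.lm_head = nn.Linear(HIDDEN_SIZE, DRAFT_VOCAB, bias=False)
        # Hypothetical index buffer mapping each draft token id to a
        # target token id in [0, 128256)
        self.register_buffer("d2t", torch.zeros(DRAFT_VOCAB, dtype=torch.long))

    def forward(self, h_low, h_mid, h_high):
        fused = self.fc(torch.cat([h_low, h_mid, h_high], dim=-1))
        draft_logits = self.lm_head(fused)       # (..., 32000)
        draft_ids = draft_logits.argmax(dim=-1)  # ids in the draft vocabulary
        return self.d2t[draft_ids]               # remapped to target vocab ids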

Usage

from speculators.models.eagle3 import Eagle3Speculator, Eagle3SpeculatorConfig
from transformers import AutoModelForCausalLM

# Load verifier model
verifier = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")

# Load Eagle-3 speculator
speculator = Eagle3Speculator.from_pretrained(
    "nm-testing/eagle3-llama3.1-8b-instruct-speculators",
    verifier=verifier
)
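
If running on GPU, the verifier can be loaded in reduced precision with standard transformers keyword arguments. Whether Eagle3Speculator.from_pretrained forwards the same arguments is an assumption, so only the verifier load is shown:

import torch
from transformers import AutoModelForCausalLM

# Standard transformers options; device_map="auto" requires accelerate
verifier = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)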

Configuration

This model uses the Eagle-3 architecture with the following dimensions; a short sketch of the attention geometry they imply follows the list:

  • Hidden size: 4096
  • Attention heads: 32
  • Key-value heads: 8
  • Intermediate size: 14336
  • RMS norm epsilon: 1e-05
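
These values follow the standard grouped-query attention layout of Llama-3.1-8B. A quick sketch of the derived dimensions (variable names are illustrative):

hidden_size = 4096
num_attention_heads = 32
num_key_value_heads = 8

head_dim = hidden_size // num_attention_heads           # 128
kv_groups = num_attention_heads // num_key_value_heads  # 4 query heads per KV head
q_proj_out = num_attention_heads * head_dim             # 4096
kv_proj_out = num_key_value_heads * head_dim            # 1024 each for K and V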

Citation

Based on the Eagle-3 paper: https://arxiv.org/abs/2503.01840

License

Please refer to the base Llama-3.1 model license.
