Residual stream SAEs for Llama-3.1-8B-Instruct.
These SAEs were trained on a blend of chat data (lmsys/lmsys-chat-1m) and pretraining data (monology/pile-uncopyrighted), along with a small amount of emergent misalignment data.
Each SAE is trained using BatchTopK. For each layer, we train four SAEs, with k = 32, 64, 128, and 256.
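For reference, here is a minimal PyTorch sketch of a BatchTopK SAE. This is illustrative only, not the training code from the repo linked below; class, parameter, and dimension choices (e.g. the dictionary size) are assumptions. The key idea is that instead of keeping the top k latents per sample, BatchTopK keeps the top k × batch_size activations across the whole batch, so sparsity averages to k per sample rather than being enforced per token.

```python
import torch
import torch.nn as nn


class BatchTopKSAE(nn.Module):
    """Sketch of a BatchTopK sparse autoencoder (illustrative names)."""

    def __init__(self, d_model: int, d_sae: int, k: int):
        super().__init__()
        self.k = k
        self.W_enc = nn.Parameter(torch.randn(d_model, d_sae) * 0.01)
        self.W_dec = nn.Parameter(torch.randn(d_sae, d_model) * 0.01)
        self.b_enc = nn.Parameter(torch.zeros(d_sae))
        self.b_dec = nn.Parameter(torch.zeros(d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Encode residual-stream activations into latent pre-activations.
        acts = torch.relu((x - self.b_dec) @ self.W_enc + self.b_enc)

        # BatchTopK: keep the top (k * batch_size) activations across the
        # *entire batch*, so per-sample latent counts can vary around k.
        flat = acts.flatten()
        _, idx = flat.topk(self.k * x.shape[0])
        mask = torch.zeros_like(flat)
        mask.scatter_(0, idx, 1.0)
        sparse = (flat * mask).view_as(acts)

        # Decode back to the model dimension (trained with reconstruction loss).
        return sparse @ self.W_dec + self.b_dec


# Example: Llama-3.1-8B has d_model = 4096; the dictionary size is a guess.
sae = BatchTopKSAE(d_model=4096, d_sae=4096 * 8, k=32)
recon = sae(torch.randn(16, 4096))
```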
For more training details, see https://github.com/andyrdt/dictionary_learning/tree/andyrdt/llama_saes.