arxiv:2504.13146

Antidistillation Sampling

Published on Apr 17
· Submitted by schwarzschild on Apr 18
#2 Paper of the day
Abstract

Frontier models that generate extended reasoning traces inadvertently produce rich token sequences that can facilitate model distillation. Recognizing this vulnerability, model owners may seek sampling strategies that limit the effectiveness of distillation without compromising model performance. Antidistillation sampling provides exactly this capability. By strategically modifying a model's next-token probability distribution, antidistillation sampling poisons reasoning traces, rendering them significantly less effective for distillation while preserving the model's practical utility. For further details, see https://antidistillation.com.

Community

Really nice paper!!
After reading, I have one question.
What is the underlying intuition for the definition of Eq. 6? In my view, antidistillation sampling aims to maximize the delta term in Eq. 5, or its final simplified form in Eq. 11, but I do not understand the relation between that delta term and Eq. 6. I would greatly appreciate a more in-depth explanation. Thank you! :D

Paper author:

Thanks! In Eq. 6, we simply take the teacher's next-token log-probabilities and nudge them in the direction that most poisons the student. The vanilla term (1/τ) · log p_T keeps teacher-preferred tokens likely, and the added λ·Δ term up-weights exactly those tokens whose fine-tuning update would increase the student's loss.
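The adjusted distribution the author describes can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the per-token penalty vector Δ (the estimate of how much fine-tuning on each token would increase the student's loss, per Eq. 5/11) is already computed and simply combines it with the tempered teacher log-probabilities as in Eq. 6.

```python
import numpy as np

def antidistillation_sample(teacher_logprobs, delta, tau=1.0, lam=0.5, rng=None):
    """Sample one token from softmax((1/tau) * log p_T + lam * delta).

    teacher_logprobs: teacher's next-token log-probabilities, log p_T (vocab-sized).
    delta: per-token poisoning term from Eq. 5/11 (assumed precomputed here;
           estimating it efficiently is the core contribution of the paper).
    tau, lam: temperature and poisoning strength from Eq. 6.
    """
    rng = rng or np.random.default_rng()
    # Eq. 6: tempered teacher term plus the poisoning term.
    adjusted = teacher_logprobs / tau + lam * delta
    # Numerically stable softmax over the adjusted logits.
    probs = np.exp(adjusted - adjusted.max())
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))
```

With lam = 0 this reduces to ordinary temperature sampling from the teacher; increasing lam shifts probability mass toward tokens with large Δ, i.e. tokens whose inclusion in a trace most degrades a distilled student.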


