etemiz posted an update 5 days ago
I've tested many fine-tunes. They all scored lower than the base model on AHA.

Yesterday I found one fine-tune (an abliteration) that raised the model's score from 28 to 46: huihui-ai/Huihui-gpt-oss-120b-BF16-abliterated

Is there a correlation between censorship and not being human-aligned?

This is a really good question, and for a long time I have suspected that this is the case. I would be curious as to how Jinx-Qwen3-32B scores on AHA vs the base.

Interesting, I could test that as well.

Do you know their method of uncensoring? Are they fine tuning or doing vector operations?

I may upload a Qwen3 fine-tune for AHA soon (would you like to merge others with it?).

Hi @etemiz,

Nice to see you here again (we just spoke about the medical dataset).
Someone was abliterating models back when the technique was still fairly new, and he had similar findings.
His repo is: https://huggingface.co/byroneverson

Check out the README for this model in particular:

https://huggingface.co/byroneverson/Yi-1.5-9B-Chat-16K-abliterated

Hi Doctor Chad, nice to see you too.

Thanks for sharing, I will test that. Yi 1.5 holds second place on my leaderboard!