We need to invent a new way to benchmark models. The current practice is fishy.

Salim Belhaddad
salym
AI & ML interests
Knowledge Graphs
Recent Activity
replied to
philschmid's
post
25 days ago
Gemini 2.5 Pro, thinking by default! We are excited to launch our best Gemini model for reasoning, multimodal and coding yet! #1 on LMSYS, Humanity’s Last Exam, AIME and GPQA and more!
TL;DR:
- 💻 Best Gemini coding model yet, particularly for web development (excels on LiveCodeBench).
- 🧠 Default "Thinking" with up to 64k token output
- 🌌 1 Million multimodal input context for text, image, video, audio, and pdf
- 🛠️ Function calling, structured output, Google Search & code execution.
- 🏆 #1 on LMArena & SOTA on AIME, GPQA, Humanity's Last Exam
- 💡 Knowledge cutoff of January 2025
- 🤗 Available for free as Experimental in AI Studio, Gemini API & Gemini app
- 🚀 Rate limits (free tier): 2 RPM, 50 requests/day
Try it ⬇️
https://aistudio.google.com/?model=gemini-2.5-pro-exp-03-25
updated
a model
about 1 month ago
salym/ppo-LunarLander-v2
Organizations
salym's activity

reacted to
mlabonne's
post with 🚀
about 1 month ago
✂️ Gemma 3 Abliterated
I noticed that Gemma 3 was much more resilient to refusal removal than other models like Qwen 2.5.
I experimented with different recipes and improved the abliteration technique I wrote about last year.
It's still experimental but the refusal rate is super low in my tests. Enjoy!
mlabonne/gemma-3-4b-it-abliterated
mlabonne/gemma-3-12b-it-abliterated
mlabonne/gemma-3-27b-it-abliterated

reacted to
mlabonne's
post with 🔥
about 1 month ago
✂️ AutoAbliteration
I made a Colab notebook to automatically abliterate models.
It's quite general, so you can do interesting stuff like blocking a given language in the model outputs.
💻 Colab: https://colab.research.google.com/drive/1RmLv-pCMBBsQGXQIM8yF-OdCNyoylUR1?usp=sharing