0.5B Model

#9
by chrisvnz - opened

Great work! Do you think a 0.5B model would also produce reasonable results?

Agentica org

Unsure! Happy to collaborate to see whether an even smaller model can truly learn reasoning. It would most likely require distillation from DeepSeek-R1 first ;)
