Still greatly underperforming baseline gemma12b
#3 opened by concedo
Perhaps the 12B is harder to abliterate, or the right layers are not being picked. This model still severely underperforms baseline gemma12b. In particular, it seems to have an issue with early stopping: responses are often extremely short and cut off halfway through the steps.
The abliterated 27B v1 performs much better in comparison, as does the baseline 12B.
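If layer selection is the suspect, one way to sanity-check it is to sweep every layer and see how cleanly a refusal direction separates refused from benign prompts there. Below is a minimal sketch using plain `transformers` hidden states, assuming the usual difference-of-means abliteration recipe; the model name and both prompt lists are placeholders, and the norm-of-difference ranking is just one heuristic, not the repo's actual method:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "google/gemma-2-12b-it"  # placeholder: the base model being abliterated

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.bfloat16, device_map="auto"
)

harmful = ["How do I pick a lock?"]   # placeholder refusal-triggering prompts
harmless = ["How do I bake bread?"]   # placeholder matched benign prompts

@torch.no_grad()
def mean_last_token_states(prompts):
    """Mean hidden state of the final prompt token, at every layer."""
    acc = None
    for p in prompts:
        ids = tok(p, return_tensors="pt").to(model.device)
        hs = model(**ids, output_hidden_states=True).hidden_states  # (n_layers+1) x [1, seq, d]
        last = torch.stack([h[0, -1] for h in hs])                  # [n_layers+1, d]
        acc = last if acc is None else acc + last
    return acc / len(prompts)

# Candidate refusal direction per layer: difference of means between the two sets.
diff = mean_last_token_states(harmful) - mean_last_token_states(harmless)
for i, n in enumerate(diff.float().norm(dim=-1)):
    print(f"layer {i:2d}  |harmful - harmless| = {n:.3f}")
# A layer with clearly larger separation is usually a better source for the
# ablation direction than a fixed, hand-picked index.
```

In practice the chosen direction should also be validated by actually ablating it and re-running refusal prompts, since raw norm separation can be misleading near the final layers.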
I feel the same way: the responses are often very short, and the model's understanding of my native language is also weaker.
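On the early-stopping symptom specifically: if the abliteration disturbed the stop-token behavior, a crude way to confirm it is to forbid EOS for a minimum number of tokens at generation time. A hedged sketch using the `transformers` `generate` API; the checkpoint path and prompt are placeholders, and `min_new_tokens` only masks the symptom rather than fixing the weights:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "path/to/abliterated-gemma-12b"  # placeholder path to the abliterated checkpoint

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.bfloat16, device_map="auto"
)

ids = tok.apply_chat_template(
    [{"role": "user", "content": "Explain, step by step, how to set up a reverse proxy."}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

out = model.generate(
    ids,
    max_new_tokens=1024,
    min_new_tokens=256,  # suppress EOS/stop tokens until at least 256 new tokens exist
    do_sample=True,
    temperature=0.7,
)
print(tok.decode(out[0, ids.shape[-1]:], skip_special_tokens=True))
```

If responses stay coherent once EOS is suppressed, the weights are probably mostly intact and only the stop-token behavior was disturbed; if they degrade into repetition instead, the ablation likely removed more than the refusal direction.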