increased context length (at least 1k?) / multilingual gte / open source dataset?
#10 opened about 6 hours ago by aari1995

GGUF version?
#8 opened about 2 months ago by kalle07
Continue Pretraining
#7 opened 7 months ago by HuggySSO
Embedding from transformers
#6 opened 7 months ago by tillwenke

"[...] mixture of full fine-tuning and LoRA was used to provide better generalization."
#5 opened 8 months ago by bobox