Zoran (zokica)

AI & ML interests: None yet
Recent Activity
- New activity 11 days ago on unsloth/gemma-3-4b-it-unsloth-bnb-4bit: Does not work at all
- New activity 12 days ago on ISTA-DASLab/gemma-3-4b-it-GPTQ-4b-128g: Does not work
- New activity 9 months ago on google/gemma-2-9b: Gemma 2's Flash attention 2 implementation is strange...
Organizations: None yet
zokica's activity
- Does not work at all (9) - #1 opened about 1 month ago by zokica
- Does not work (2) - #1 opened 12 days ago by zokica
- Gemma 2's Flash attention 2 implementation is strange... (61) - #23 opened 10 months ago by GPT007
- Problem with Lora finetuning, Out of memory (3) - #13 opened 9 months ago by zokica
- OOM when finetuning with lora. (5) - #1 opened 9 months ago by zokica
- Model repeating information and "spitting out" random characters (8) - #14 opened 10 months ago by brazilianslib
- Gemma2FlashAttention2 missing sliding_window variable (7, 2) - #8 opened 10 months ago by emozilla
- why UMT5 (6) - #1 opened about 1 year ago by pszemraj
- Something broken on last update (7, 7) - #85 opened about 1 year ago by Nayjest
- Can't get it to generate the EOS token and beam search is not supported (2) - #3 opened about 1 year ago by miguelcarv
- How to fine-tune this? + Training code (13, 44) - #19 opened over 1 year ago by cekal
- Generation after finetuning does not ends at EOS token (1) - #123 opened about 1 year ago by zokica
- Attention mask for generation function in the future? (21) - #7 opened over 1 year ago by rchan26
- guanaco-65b (6) - #1 opened almost 2 years ago by bodaay
- Speed on CPU (13) - #8 opened almost 2 years ago by zokica
- Will you make a 3B model as well? (4) - #7 opened almost 2 years ago by zokica
- How do you run this? (3) - #2 opened almost 2 years ago by zokica
- How to run this? (3) - #13 opened almost 2 years ago by zokica
- Does not work at all, i tried to calculate cola (11) - #2 opened about 2 years ago by zokica
- This works, but training does not work at all (6) - #4 opened about 2 years ago by zokica