Zoran (zokica)

AI & ML interests: None yet
Recent Activity
- New activity 11 days ago on unsloth/gemma-3-4b-it-unsloth-bnb-4bit: Does not work at all
- New activity 12 days ago on ISTA-DASLab/gemma-3-4b-it-GPTQ-4b-128g: Does not work
- New activity 9 months ago on google/gemma-2-9b: Gemma 2's Flash attention 2 implementation is strange...
Organizations: None yet
zokica's activity
- Does not work at all (9) - #1 opened about 1 month ago by zokica
- Does not work (2) - #1 opened 12 days ago by zokica
- Gemma 2's Flash attention 2 implementation is strange... (61) - #23 opened 10 months ago by GPT007
- Problem with Lora finetuning, Out of memory (3) - #13 opened 9 months ago by zokica
- OOM when finetuning with lora. (5) - #1 opened 9 months ago by zokica
- Model repeating information and "spitting out" random characters (8) - #14 opened 10 months ago by brazilianslib
- Gemma2FlashAttention2 missing sliding_window variable (7, 2) - #8 opened 10 months ago by emozilla
- why UMT5 (6) - #1 opened about 1 year ago by pszemraj
- Something broken on last update (7, 7) - #85 opened about 1 year ago by Nayjest
- Can't get it to generate the EOS token and beam search is not supported (2) - #3 opened about 1 year ago by miguelcarv
- How to fine-tune this? + Training code (13, 44) - #19 opened over 1 year ago by cekal
- Generation after finetuning does not ends at EOS token (1) - #123 opened about 1 year ago by zokica
- Attention mask for generation function in the future? (21) - #7 opened over 1 year ago by rchan26
- guanaco-65b (6) - #1 opened almost 2 years ago by bodaay
- Speed on CPU (13) - #8 opened almost 2 years ago by zokica
- Will you make a 3B model as well? (4) - #7 opened almost 2 years ago by zokica
- How do you run this? (3) - #2 opened almost 2 years ago by zokica
- How to run this? (3) - #13 opened almost 2 years ago by zokica
- Does not work at all, i tried to calculate cola (11) - #2 opened about 2 years ago by zokica
- This works, but training does not work at all (6) - #4 opened about 2 years ago by zokica