DeepSeek-V2.5?
#6 opened 10 months ago by goodasdgood
Repetitive generation without additional EOS token
2
#5 opened about 1 year ago by amrothemich
Biomed Foundation Model
1
#4 opened about 1 year ago by amrothemich
Yi-34B AQLM?
#3 opened about 1 year ago by llama-anon

~8 tok/sec with ~5k context on vLLM with Flash Attention and `kv_cache_dtype="fp8"` on 3090TI 24GB VRAM
2
#2 opened about 1 year ago by ubergarm
