Deploy model A800
#4 opened 2 months ago by czqqq
Inference error: The current context does not support K-shift
#3 opened 3 months ago by lollmaolol
Tested Q6, uses 567 GB RAM
#2 opened 3 months ago by krustik
Using -ctk q4_0 -ctv q4_0 with llama.cpp server throws flash_attn error
#1 opened 3 months ago by softwareweaver
