Problem with CUDA or FlashAttention
#2
by
Mezigore
- opened
The UI gives this error:

```
Failed to generate text: FlashAttention2 has been toggled on, but it cannot be used due to the following error: Flash Attention 2 is not available on CPU. Please make sure torch can access a CUDA device.
```
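As the error says, FlashAttention 2 requires a CUDA device, so a Space running on CPU must fall back to another attention backend. A minimal sketch of that check (the helper name `pick_attn_implementation` is hypothetical; the returned strings match the `attn_implementation` values accepted by `transformers.AutoModel.from_pretrained`):

```python
import importlib.util


def pick_attn_implementation() -> str:
    """Choose an attention backend: FlashAttention 2 only works on CUDA."""
    if importlib.util.find_spec("torch") is not None:
        import torch

        if torch.cuda.is_available():
            # A CUDA device is visible, so FlashAttention 2 can be used.
            return "flash_attention_2"
    # No torch or no GPU: fall back to PyTorch's scaled-dot-product attention.
    return "sdpa"


if __name__ == "__main__":
    print(pick_attn_implementation())
```

Passing the result as `attn_implementation=...` when loading the model avoids toggling FlashAttention 2 on unconditionally.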
Sorry about the downtime; the Space is live again with ZeroGPU:
https://huggingface.co/spaces/moonshotai/Kimi-VL-A3B-Thinking