Problem with CUDA or FlashAttention

#2
by Mezigore - opened

The UI gives this error:

```
Failed to generate text: FlashAttention2 has been toggled on, but it cannot be used due to the following error: Flash Attention 2 is not available on CPU. Please make sure torch can access a CUDA device.
```
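As a workaround on the caller's side, a minimal sketch (assuming the transformers-style `attn_implementation` naming, with `"sdpa"` as the CPU fallback) is to only request FlashAttention 2 when a CUDA device is actually visible:

```python
import torch

# FlashAttention 2 kernels only run on a CUDA GPU, so select the attention
# implementation based on the hardware that torch can see. "sdpa" (PyTorch's
# scaled-dot-product attention) is assumed here as the CPU-safe fallback.
attn_impl = "flash_attention_2" if torch.cuda.is_available() else "sdpa"
print(attn_impl)
```

The resulting string can then be passed as the model's attention-implementation setting, so the same code runs on both GPU and CPU hosts.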
Moonshot AI org

Sorry about the downtime, it is back up again with ZeroGPU:
https://huggingface.co/spaces/moonshotai/Kimi-VL-A3B-Thinking
