[bugfix] Initialize attention bias on the same device as Query/Key/Value

by kenneth-doh - opened 6 days ago

base: refs/heads/main ← from: refs/pr/1


kenneth-doh

6 days ago

xformers currently initializes the attention bias on the default device rather than on the device that holds the Q/K/V tensors.
As a result, in a multi-GPU environment where the Q/K/V tensors are on a non-default GPU, the following error occurs:

Error: Attention bias and Query/Key/Value should be on the same device
            query.device: cuda:6
            attn_bias   : cuda:0

This PR resolves the above error by initializing the attention bias on the same device as the Q/K/V tensors.
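For illustration, here is a minimal sketch of the general idea in plain PyTorch. It is not the code changed in this PR and does not use the xformers bias classes: `make_causal_bias` is a hypothetical helper, and `torch.nn.functional.scaled_dot_product_attention` stands in for the xformers attention call. The point is only that the bias takes its `device` from the query tensor instead of relying on the default device.

```python
import torch
import torch.nn.functional as F


def make_causal_bias(query: torch.Tensor, seq_len: int) -> torch.Tensor:
    """Hypothetical helper: build an additive causal bias on query's device."""
    bias = torch.full(
        (seq_len, seq_len),
        float("-inf"),
        dtype=query.dtype,
        device=query.device,  # follow Q/K/V instead of the default device
    )
    # Keep -inf strictly above the diagonal; everything else becomes 0.
    return torch.triu(bias, diagonal=1)


# Falls back to CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Shapes are (batch, heads, seq_len, head_dim).
q = torch.randn(1, 2, 4, 8, device=device)
k = torch.randn(1, 2, 4, 8, device=device)
v = torch.randn(1, 2, 4, 8, device=device)

bias = make_causal_bias(q, seq_len=q.shape[-2])
out = F.scaled_dot_product_attention(q, k, v, attn_mask=bias)
```

On a multi-GPU machine, the same helper would place the bias on, for example, `cuda:6` whenever the query lives there, which is the situation the error message above describes.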

Note: The same error occurred in vLLM and was resolved by the following PR:
https://github.com/vllm-project/vllm/pull/13468

[bugfix] Initialize attention bias on the same device as Query/Key/Value (8bdecf30)

Ready to merge

This branch is ready to get merged automatically.
