The idea is relatively simple: one defines outputs of an arbitrary size, then applies cross-attention with the
last hidden states of the latents, using the outputs as queries and the latents as keys and values.
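
The decoding step described above can be sketched in a few lines of plain Python. This is a minimal, single-head illustration under stated assumptions, not the library's implementation: the names `cross_attention`, `w_q`, `w_k`, `w_v` and all sizes are hypothetical, the weights are random rather than learned, and layer normalization, the MLP, and any final projection are left out.

```python
import math
import random


def matmul(a, b):
    # (n x k) times (k x m) gives (n x m), using nested lists as matrices.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    # Numerically stable softmax over one row of scores.
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(output_queries, latents, w_q, w_k, w_v):
    # Queries come from the outputs; keys and values come from the latents.
    q = matmul(output_queries, w_q)
    k = matmul(latents, w_k)
    v = matmul(latents, w_v)
    scale = 1.0 / math.sqrt(len(q[0]))
    # One score per (output position, latent) pair.
    scores = [[scale * sum(a * b for a, b in zip(qi, kj)) for kj in k] for qi in q]
    weights = [softmax(row) for row in scores]
    # Each output position is a weighted average of the latent values.
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)

# Hypothetical sizes chosen only for illustration.
num_latents, d_latents = 4, 8   # last hidden states of the latents
num_outputs, d_queries = 3, 6   # outputs of an arbitrary size
d_attn = 5                      # shared attention width

latents = random_matrix(num_latents, d_latents, rng)
output_queries = random_matrix(num_outputs, d_queries, rng)
w_q = random_matrix(d_queries, d_attn, rng)
w_k = random_matrix(d_latents, d_attn, rng)
w_v = random_matrix(d_latents, d_attn, rng)

decoded, attn_weights = cross_attention(output_queries, latents, w_q, w_k, w_v)
print(len(decoded), len(decoded[0]))  # number of output positions, attention width
```

The number of rows in the result follows the number of output queries, not the number of latents, which is why the output size can be chosen freely.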