These input embeddings are learnt positional encodings that the authors refer to as object queries and, as in
the encoder, they are added to the input of each attention layer.
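
To make "added to the input of each attention layer" concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: attention is single-head with no learned projections, layer normalisation and feed-forward blocks are omitted, the object queries are random stand-ins for learnt embeddings, and all names (`decoder_layer`, `memory_pos`, `run_decoder`, and so on) are hypothetical. It also assumes the common convention of adding the positional term to queries and keys but not to values.

```python
import math
import random


def add(a, b):
    """Element-wise sum of two equally shaped lists of vectors."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def attention(queries, keys, values):
    """Single-head scaled dot-product attention, no learned projections."""
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w / total * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


def decoder_layer(tgt, memory, object_queries, memory_pos):
    """One simplified decoder layer (assumption: residual attention only)."""
    # Self-attention: object queries are added to the queries and keys.
    q = add(tgt, object_queries)
    tgt = add(tgt, attention(q, q, tgt))
    # Cross-attention: object queries are added again on the query side,
    # and the encoder's positional encodings on the key side.
    q = add(tgt, object_queries)
    k = add(memory, memory_pos)
    return add(tgt, attention(q, k, memory))


def run_decoder(object_queries, memory, memory_pos, num_layers):
    """Start from zeros and re-inject the object queries at every layer."""
    tgt = [[0.0] * len(object_queries[0]) for _ in object_queries]
    for _ in range(num_layers):
        tgt = decoder_layer(tgt, memory, object_queries, memory_pos)
    return tgt


def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
d_model, num_queries, num_tokens, num_layers = 8, 4, 6, 3

object_queries = rand_matrix(num_queries, d_model)  # stand-in for learnt embeddings
memory = rand_matrix(num_tokens, d_model)           # stand-in for encoder output
memory_pos = rand_matrix(num_tokens, d_model)       # stand-in for encoder positions

output = run_decoder(object_queries, memory, memory_pos, num_layers)
print(len(output), len(output[0]))
```

In this sketch the decoder input starts at zero, so the object queries are the only thing that distinguishes one output slot from another: replacing them with all-zero vectors makes every output row identical.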