Next, the
feature map is flattened and transposed to obtain a tensor of shape (batch_size, seq_len, d_model) =
(batch_size, width/32*height/32, 256).
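
The flatten-and-transpose step can be sketched roughly as follows. This is a minimal illustration and not the model's actual code: it uses NumPy in place of a deep learning framework, the function name `flatten_feature_map` is hypothetical, and it assumes the feature map is stored channels-first as (batch_size, d_model, height/32, width/32) with image dimensions divisible by 32.

```python
import numpy as np


def flatten_feature_map(feature_map: np.ndarray) -> np.ndarray:
    """Turn (batch_size, d_model, h, w) into (batch_size, h * w, d_model).

    Hypothetical helper; assumes a channels-first feature map.
    """
    batch_size, d_model, h, w = feature_map.shape
    # Merge the two spatial axes into one sequence axis: (batch_size, d_model, h * w)
    flat = feature_map.reshape(batch_size, d_model, h * w)
    # Swap the channel and sequence axes: (batch_size, h * w, d_model)
    return flat.transpose(0, 2, 1)


# Assumed example: a batch of 2 images of height 256 and width 320,
# so the downsampled feature map is 8 x 10 with d_model = 256 channels.
batch_size, d_model, height, width = 2, 256, 256, 320
h, w = height // 32, width // 32
feature_map = np.arange(batch_size * d_model * h * w, dtype=np.float32).reshape(
    batch_size, d_model, h, w
)

tokens = flatten_feature_map(feature_map)
print(tokens.shape)  # (2, 80, 256), i.e. seq_len = (256 / 32) * (320 / 32) = 80
```

Each position along the sequence axis corresponds to one spatial location of the feature map, read row by row, and carries that location's 256-dimensional feature vector.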