In practice, the parameter config.axial_pos_embds_dim is set to a tuple \((d^1, d^2)\) whose entries must sum to
config.hidden_size, and config.axial_pos_shape is set to a tuple \((n_s^1, n_s^2)\) whose entries must multiply to
config.max_embedding_size. During training, config.max_embedding_size must in turn equal the sequence
length of the input_ids.
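
The two constraints can be checked with a few lines of plain Python. The sketch below is only an illustration: the helper name check_axial_config and the example values are assumptions, not part of any library, and it takes the four quantities as ordinary arguments instead of reading them from a config object.

```python
from math import prod


def check_axial_config(axial_pos_embds_dim, axial_pos_shape,
                       hidden_size, max_embedding_size):
    """Hypothetical helper: return True if both constraints hold.

    axial_pos_embds_dim -- tuple (d^1, d^2); entries must sum to hidden_size
    axial_pos_shape     -- tuple (n_s^1, n_s^2); entries must multiply to
                           max_embedding_size
    """
    dims_ok = sum(axial_pos_embds_dim) == hidden_size
    shape_ok = prod(axial_pos_shape) == max_embedding_size
    return dims_ok and shape_ok


# Illustrative values only (assumed, not taken from a real model):
# 64 + 192 == 256 and 64 * 64 == 4096, so both constraints are satisfied.
print(check_axial_config((64, 192), (64, 64), 256, 4096))
```

During training, the same max_embedding_size value would also have to match the sequence length of the input_ids, so a batch padded to 4096 tokens would fit the example values above.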