Dynamically padding batches to the longest example is not recommended on TPU, as it triggers a recompilation for every
new batch shape encountered during training, which significantly slows training down.
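
As a minimal sketch of the difference, the snippet below compares the batch shapes produced by padding to the longest example with those produced by padding to a fixed length. The helper names (`pad_batch`, `pad_to_longest`, `batch_shape`), the token ids, and `max_length=8` are hypothetical and chosen only for illustration; they are not taken from any library. Each distinct shape stands for one compilation that a shape-specialized compiler would have to perform.

```python
from typing import List, Sequence, Tuple


def pad_batch(batch: Sequence[Sequence[int]], max_length: int, pad_id: int = 0) -> List[List[int]]:
    """Truncate or pad every example to exactly `max_length` tokens."""
    padded = []
    for example in batch:
        tokens = list(example[:max_length])
        tokens += [pad_id] * (max_length - len(tokens))
        padded.append(tokens)
    return padded


def pad_to_longest(batch: Sequence[Sequence[int]], pad_id: int = 0) -> List[List[int]]:
    """Dynamic padding: the target length depends on the batch contents."""
    longest = max(len(example) for example in batch)
    return pad_batch(batch, longest, pad_id)


def batch_shape(batch: Sequence[Sequence[int]]) -> Tuple[int, int]:
    """Return (batch_size, sequence_length) for a rectangular batch."""
    return (len(batch), len(batch[0]))


# Three toy batches whose longest examples have different lengths.
batches = [
    [[1, 2, 3], [4, 5]],
    [[6, 7, 8, 9, 10], [11]],
    [[12, 13], [14, 15, 16, 17]],
]

# Dynamic padding yields a new shape for each batch here.
dynamic_shapes = {batch_shape(pad_to_longest(b)) for b in batches}

# Fixed-length padding yields a single shape for all batches.
fixed_shapes = {batch_shape(pad_batch(b, max_length=8)) for b in batches}

print(len(dynamic_shapes), len(fixed_shapes))
```

With a fixed length, every batch has the same shape, so the training step only needs to be compiled once. The cost is extra computation on padding tokens, so the fixed length is usually set close to the typical example length.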