Ahmadzei's picture
added 3 more tables for large emb model
5fa1a76
Once the
columns have been added, you can stream batches from the dataset and add padding to each batch, which greatly
reduces the number of padding tokens compared to padding the entire dataset.