To fine-tune LED on the full input length of 16384 tokens, gradient checkpointing can be enabled if training runs
into out-of-memory (OOM) errors.
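
As a minimal sketch of how this might look with the `transformers` library, the snippet below turns on gradient checkpointing via `gradient_checkpointing_enable()` and also disables the decoding cache, which is not needed during training. The tiny `LEDConfig` values are hypothetical and chosen only so the example runs without downloading weights; a real fine-tuning run would presumably load a pretrained checkpoint with `from_pretrained` instead.

```python
from transformers import LEDConfig, LEDForConditionalGeneration

# Hypothetical tiny configuration so the sketch runs without downloading weights.
# For actual fine-tuning one would typically load a pretrained checkpoint, e.g.
# LEDForConditionalGeneration.from_pretrained("allenai/led-base-16384").
config = LEDConfig(
    vocab_size=64,
    d_model=16,
    encoder_layers=1,
    decoder_layers=1,
    encoder_attention_heads=2,
    decoder_attention_heads=2,
    encoder_ffn_dim=32,
    decoder_ffn_dim=32,
    attention_window=4,
    max_encoder_position_embeddings=64,
    max_decoder_position_embeddings=64,
)
model = LEDForConditionalGeneration(config)

# Recompute activations during the backward pass instead of storing them,
# trading extra compute for lower peak memory.
model.gradient_checkpointing_enable()

# The key/value cache only helps at generation time, so switch it off for training.
model.config.use_cache = False
```

Gradient checkpointing usually makes each training step slower, so it is worth enabling only when the full-length inputs do not otherwise fit in memory.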