This is especially helpful when activation checkpointing is enabled and you want to keep the parameter from the forward recompute available until the backward pass.