Next, we want to make sure that a model with a specific head layer, such as
BrandNewBertForMaskedLM, does not inherit from BrandNewBertModel, but rather uses BrandNewBertModel
as a component that can be called in its forward pass to keep the level of abstraction low.
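
The composition pattern described above can be sketched roughly as follows. This is a framework-free illustration, not the library's actual implementation: plain Python classes stand in for what would normally be torch.nn.Module subclasses, and the constructor arguments, the toy "encoder", and the lm_head method are all assumptions made up for the example.

```python
# Framework-free sketch: plain classes stand in for torch.nn.Module subclasses.
# All internals below are hypothetical and only illustrate the class layout.


class BrandNewBertModel:
    """Stand-in for the base model: maps token ids to hidden states."""

    def __init__(self, vocab_size, hidden_size):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size

    def forward(self, input_ids):
        # Toy "encoder": one hidden vector per token (illustrative only).
        return [[float(token_id)] * self.hidden_size for token_id in input_ids]


class BrandNewBertForMaskedLM:  # note: does NOT subclass BrandNewBertModel
    """Stand-in for the model with a masked-LM head on top."""

    def __init__(self, vocab_size, hidden_size):
        # The base model is held as a component (composition, not inheritance).
        self.model = BrandNewBertModel(vocab_size, hidden_size)
        self.vocab_size = vocab_size

    def lm_head(self, hidden_state):
        # Toy head: turn one hidden vector into one score per vocabulary entry.
        total = sum(hidden_state)
        return [total * (i + 1) for i in range(self.vocab_size)]

    def forward(self, input_ids):
        # Call the base model inside the forward pass, then apply the head.
        hidden_states = self.model.forward(input_ids)
        return [self.lm_head(h) for h in hidden_states]


mlm = BrandNewBertForMaskedLM(vocab_size=5, hidden_size=4)
logits = mlm.forward([1, 2, 3])  # one list of vocabulary scores per input token
```

Because the head model only holds a reference to the base model, a reader can follow its forward pass without tracing an inheritance chain, and the base model stays usable on its own.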