[PreTrainedModel] and [TFPreTrainedModel] also implement a few methods that are
common to all models. These methods:

- resize the input token embeddings when new tokens are added to the vocabulary
- prune the attention heads of the model.
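
As a rough illustration of the two operations above, the sketch below builds a tiny, randomly initialized BERT model so that nothing is downloaded, then resizes its embeddings and prunes a few heads. The config sizes, the number of added tokens, and the choice of heads are arbitrary assumptions made for the example. It also assumes an installed `transformers` release (with PyTorch) in which `prune_heads` is still available.

```python
from transformers import BertConfig, BertModel

# Tiny, randomly initialized model; all sizes here are illustrative assumptions.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=64,
)
model = BertModel(config)

# 1. Resize the input token embeddings, e.g. after adding 3 tokens to a tokenizer.
old_vocab = model.get_input_embeddings().weight.shape[0]
model.resize_token_embeddings(old_vocab + 3)
new_vocab = model.get_input_embeddings().weight.shape[0]

# 2. Prune attention heads: a dict mapping layer index -> list of head indices.
model.prune_heads({0: [0, 1], 1: [2]})
pruned = {layer: sorted(heads) for layer, heads in model.config.pruned_heads.items()}

print(old_vocab, new_vocab, pruned)
```

With a pretrained checkpoint, the same calls would typically follow `from_pretrained(...)`, and the new size passed to `resize_token_embeddings` would usually be `len(tokenizer)` after the tokens have been added.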