To finetune a model in TensorFlow, start by setting up an optimizer function, learning rate schedule, and some training hyperparameters:

```py
from transformers import create_optimizer

batch_size = 16
num_train_epochs = 3
num_train_steps = (len(tokenized_wnut["train"]) // batch_size) * num_train_epochs
optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,
    num_train_steps=num_train_steps,
    weight_decay_rate=0.01,
    num_warmup_steps=0,
)
```

Then you can load DistilBERT with [`TFAutoModelForTokenClassification`] along with the number of expected labels, and the label mappings:

```py
from transformers import TFAutoModelForTokenClassification

model = TFAutoModelForTokenClassification.from_pretrained(
    "distilbert/distilbert-base-uncased", num_labels=13, id2label=id2label, label2id=label2id
)
```

Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:

```py
tf_train_set = model.prepare_tf_dataset(
    tokenized_wnut["train"],
    shuffle=True,
    batch_size=16,
    collate_fn=data_collator,
)

tf_validation_set = model.prepare_tf_dataset(
    tokenized_wnut["validation"],
    shuffle=False,
    batch_size=16,
    collate_fn=data_collator,
)
```

Configure the model for training with `compile`.
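Before compiling, it can help to sanity-check the step count handed to `create_optimizer` earlier in this section. The sketch below is plain Python and does not call Transformers or TensorFlow. The dataset size `num_train_examples` is an assumed, illustrative value (substitute `len(tokenized_wnut["train"])`), and `linear_decay_lr` is a hypothetical helper that assumes the schedule decays linearly to zero with no warmup; check the `create_optimizer` documentation for the exact behavior in your version.

```python
# Hyperparameters copied from the section above.
batch_size = 16
num_train_epochs = 3
init_lr = 2e-5

# ASSUMPTION: illustrative dataset size; use len(tokenized_wnut["train"]) in practice.
num_train_examples = 3394

# Floor division counts only full batches per epoch.
steps_per_epoch = num_train_examples // batch_size
num_train_steps = steps_per_epoch * num_train_epochs


def linear_decay_lr(step: int, total_steps: int, base_lr: float) -> float:
    """Hypothetical helper: learning rate under an assumed linear decay to zero."""
    step = min(max(step, 0), total_steps)  # clamp to the valid range
    return base_lr * (1.0 - step / total_steps)


print(f"steps per epoch: {steps_per_epoch}")
print(f"total training steps: {num_train_steps}")
print(f"lr at midpoint: {linear_decay_lr(num_train_steps // 2, num_train_steps, init_lr):.2e}")
```

If the total comes out far smaller or larger than expected, the usual culprits are a mismatched `batch_size` between this calculation and `prepare_tf_dataset`, or counting the wrong dataset split.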