To finetune a model in TensorFlow, start by setting up an optimizer function, a learning rate schedule, and some training hyperparameters:

```py
from transformers import create_optimizer

batch_size = 16
num_train_epochs = 3
num_train_steps = (len(tokenized_wnut["train"]) // batch_size) * num_train_epochs
optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,
    num_train_steps=num_train_steps,
    weight_decay_rate=0.01,
    num_warmup_steps=0,
)
```

Then you can load DistilBERT with [`TFAutoModelForTokenClassification`] along with the number of expected labels and the label mappings:

```py
from transformers import TFAutoModelForTokenClassification

model = TFAutoModelForTokenClassification.from_pretrained(
    "distilbert/distilbert-base-uncased", num_labels=13, id2label=id2label, label2id=label2id
)
```
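The `id2label` and `label2id` mappings are created earlier in the guide. As a minimal sketch, assuming the raw `wnut` dataset loaded at the start of the tutorial, they can be derived from the dataset's label names:

```py
# Assumption: `wnut` is the raw WNUT 17 dataset loaded earlier in the tutorial.
label_list = wnut["train"].features["ner_tags"].feature.names
id2label = {i: label for i, label in enumerate(label_list)}
label2id = {label: i for i, label in enumerate(label_list)}
```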

Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:

```py
tf_train_set = model.prepare_tf_dataset(
    tokenized_wnut["train"],
    shuffle=True,
    batch_size=16,
    collate_fn=data_collator,
)
tf_validation_set = model.prepare_tf_dataset(
    tokenized_wnut["validation"],
    shuffle=False,
    batch_size=16,
    collate_fn=data_collator,
)
```
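Both calls assume the `data_collator` created earlier in the guide. A minimal sketch uses [`DataCollatorForTokenClassification`] configured to return TensorFlow tensors:

```py
from transformers import DataCollatorForTokenClassification

# Assumption: `tokenizer` is the DistilBERT tokenizer loaded earlier in the tutorial.
data_collator = DataCollatorForTokenClassification(tokenizer=tokenizer, return_tensors="tf")
```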

Configure the model for training with `compile`. Note that Transformers models all have a default task-relevant loss function, so you don't need to specify one unless you want to:
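```py
import tensorflow as tf

model.compile(optimizer=optimizer)  # no `loss` argument needed; the default task loss is used
```

From here, training starts with a standard Keras `fit` call, for example `model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=3)`.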