remove_idx = [590, 821, 822, 875, 876, 878, 879]
keep = [i for i in range(len(cppe5["train"])) if i not in remove_idx]
cppe5["train"] = cppe5["train"].select(keep)
Preprocess the data
To fine-tune a model, you must preprocess your data in exactly the same way the data was preprocessed when the model was pre-trained.
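To make this requirement concrete, here is a minimal sketch in plain Python of the two steps that most often have to match for vision models: rescaling 8-bit pixel values to the range [0, 1] and normalizing each channel with a mean and standard deviation. The constants below are the commonly used ImageNet statistics and are an assumption, not values confirmed for any particular checkpoint; the helper name `preprocess_pixel` is hypothetical. In practice you would read these settings from the checkpoint itself, for example via the `AutoImageProcessor.from_pretrained` method in Transformers, rather than hard-coding them.

```python
# Assumed values: ImageNet channel statistics, a common default for vision
# checkpoints. The real values should come from the checkpoint's own config.
ASSUMED_MEAN = (0.485, 0.456, 0.406)
ASSUMED_STD = (0.229, 0.224, 0.225)


def preprocess_pixel(rgb, mean=ASSUMED_MEAN, std=ASSUMED_STD, rescale_factor=1 / 255):
    """Rescale an 8-bit RGB pixel to [0, 1], then normalize each channel.

    Hypothetical helper for illustration only; it is not part of any library.
    """
    return tuple(
        (value * rescale_factor - m) / s
        for value, m, s in zip(rgb, mean, std)
    )


pixel = (128, 128, 128)

# Preprocessing that matches the assumed pre-training recipe.
matched = preprocess_pixel(pixel)

# A mismatched recipe: rescaling only, with no per-channel normalization.
mismatched = preprocess_pixel(pixel, mean=(0.0, 0.0, 0.0), std=(1.0, 1.0, 1.0))

print("matched:   ", matched)
print("mismatched:", mismatched)
```

The same input pixel produces noticeably different values under the two recipes, which is why a model fed data from a mismatched pipeline sees inputs outside the distribution it was pre-trained on.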