from datasets import load_dataset
food = load_dataset("food101", split="train[:5000]")
Split the dataset's train split into a train and test set with the [`~datasets.Dataset.train_test_split`] method:
food = food.train_test_split(test_size=0.2)
Then take a look at an example:
food["train"][0]
{'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB ...>,
 'label': 79}
Each example in the dataset has two fields:
image: a PIL image of the food item
label: the label class of the food item
To make it easier for the model to get the label name from the label id, create a dictionary that maps the label name
to an integer and vice versa:
labels = food["train"].features["label"].names
label2id, id2label = dict(), dict()
for i, label in enumerate(labels):
    label2id[label] = str(i)
    id2label[str(i)] = label
Now you can convert the label id to a label name:
id2label[str(79)]
'prime_rib'
Preprocess
The next step is to load a ViT image processor to process the image into a tensor:
from transformers import AutoImageProcessor
checkpoint = "google/vit-base-patch16-224-in21k"
image_processor = AutoImageProcessor.from_pretrained(checkpoint)
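To see what the processor returns, you can run it on a single image. This is only an illustrative check, not part of the training pipeline; it uses the standard image-processor call, which returns a pixel_values tensor:
example = food["train"][0]["image"]
inputs = image_processor(example, return_tensors="pt")
print(inputs["pixel_values"].shape)  # [1, 3, 224, 224] for this ViT checkpoint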
Apply some image transformations to the images to make the model more robust against overfitting.
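One common way to do this is with torchvision, reusing the processor's normalization statistics. The following is a minimal sketch, assuming the processor exposes image_mean, image_std, and size (as ViT image processors do); the preprocess helper and train_transforms name are illustrative, not part of the library:
from torchvision.transforms import Compose, Normalize, RandomResizedCrop, ToTensor

# Build an augmentation pipeline from the processor's own statistics.
normalize = Normalize(mean=image_processor.image_mean, std=image_processor.image_std)
size = (image_processor.size["height"], image_processor.size["width"])
train_transforms = Compose([RandomResizedCrop(size), ToTensor(), normalize])

def preprocess(examples):
    # Convert each PIL image to an augmented, normalized tensor and store it
    # under "pixel_values", the input name the model expects.
    examples["pixel_values"] = [train_transforms(img.convert("RGB")) for img in examples["image"]]
    del examples["image"]
    return examples

# Apply the transforms on the fly whenever a batch of examples is accessed.
food = food.with_transform(preprocess)
Using with_transform keeps the raw images on disk and only applies the random augmentations when a batch is loaded, so each epoch sees a slightly different view of every image.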