Start by loading the dataset:
from datasets import load_dataset
cppe5 = load_dataset("cppe-5")
cppe5
DatasetDict({
    train: Dataset({
        features: ['image_id', 'image', 'width', 'height', 'objects'],
        num_rows: 1000
    })
    test: Dataset({
        features: ['image_id', 'image', 'width', 'height', 'objects'],
        num_rows: 29
    })
})
The output shows that the dataset is already split into a training set of 1000 images and a test set of 29 images. Both splits share the same five features: image_id, image, width, height, and objects.
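To check the split sizes programmatically rather than reading them off the printed output, something like the following sketch may help. The helpers `split_sizes` and `split_fractions` are hypothetical names introduced here, not part of the `datasets` library. The sketch relies only on the object behaving like a mapping from split names to sized collections, so it runs on a plain-dict stand-in with the same row counts as above and needs no download.

```python
def split_sizes(dataset_dict):
    """Return {split_name: number_of_rows} for a DatasetDict-like mapping."""
    return {name: len(split) for name, split in dataset_dict.items()}


def split_fractions(sizes):
    """Return each split's share of the total number of rows."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


# Stand-in with the same row counts as the printed output above.
# Assumption: passing the loaded `cppe5` object instead works the same way,
# since it maps split names to objects that support len().
stand_in = {"train": range(1000), "test": range(29)}

sizes = split_sizes(stand_in)
fractions = split_fractions(sizes)

print(sizes)
print(f"test share: {fractions['test']:.2%}")
```

With only 29 test images against 1000 training images, the test split is under 3% of the data, which is worth keeping in mind when interpreting evaluation metrics later.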