The zero-index is replaced by 255 so it's ignored by SegFormer's loss function:

from transformers import AutoImageProcessor
checkpoint = "nvidia/mit-b0"
image_processor = AutoImageProcessor.from_pretrained(checkpoint, reduce_labels=True)

It is common to apply some data augmentations to an image dataset to make a model more robust against overfitting.