File size: 485 Bytes
5fa1a76
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
def collate_fn(batch):
     pixel_values = [item["pixel_values"] for item in batch]
     encoding = image_processor.pad(pixel_values, return_tensors="pt")
     labels = [item["labels"] for item in batch]
     batch = {}
     batch["pixel_values"] = encoding["pixel_values"]
     batch["pixel_mask"] = encoding["pixel_mask"]
     batch["labels"] = labels
     return batch

Multimodal
For tasks involving multimodal inputs, you'll need a processor to prepare your dataset for the model.