',
 'question_type': 'none of the above',
 'question_id': 262148000,
 'image_id': '/root/.cache/huggingface/datasets/downloads/extracted/ca733e0e000fb2d7a09fbcc94dbfe7b5a30750681d0e965f8e0a23b1c2f98c75/val2014/COCO_val2014_000000262148.jpg',
 'answer_type': 'other',
 'label': {'ids': ['at table', 'down', 'skateboard', 'table'],
  'weights': [0.30000001192092896,
   1.0,
   0.30000001192092896,
   0.30000001192092896]}}

The features relevant to the task include: 
* question: the question to be answered from the image
* image_id: the path to the image the question refers to
* label: the annotations

We can remove the rest of the features, as they won't be necessary:

dataset = dataset.remove_columns(['question_type', 'question_id', 'answer_type'])

As you can see, the label feature contains several answers to the same question (called ids here) collected by different human annotators.
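
To make the structure of the label feature concrete, here is a minimal sketch that pairs each answer in ids with its entry in weights and ranks them. It uses the values from the sample shown above. The helper names `rank_answers` and `top_answer` are hypothetical and not part of the datasets library, and treating a higher weight as stronger annotator agreement is an assumption based on the sample, not something this section states.

```python
# Label copied from the sample record above.
label = {
    "ids": ["at table", "down", "skateboard", "table"],
    "weights": [0.30000001192092896, 1.0, 0.30000001192092896, 0.30000001192092896],
}


def rank_answers(label):
    """Return (answer, weight) pairs sorted from highest to lowest weight.

    Hypothetical helper: assumes `ids` and `weights` are parallel lists.
    """
    if len(label["ids"]) != len(label["weights"]):
        raise ValueError("ids and weights must have the same length")
    pairs = zip(label["ids"], label["weights"])
    # sorted() is stable, so answers with equal weights keep their original order.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def top_answer(label):
    """Return the answer with the highest weight (assumed to be the consensus answer)."""
    return rank_answers(label)[0][0]


for answer, weight in rank_answers(label):
    print(f"{answer}: {weight:.1f}")

print(top_answer(label))  # down
```

Keeping the answers and weights as parallel lists matches the layout of the sample, so the same helper could be applied to each row of the dataset if every label follows that layout.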