To later instantiate the model with an appropriate classification head, let's create two dictionaries: one that maps 
each label name to an integer id, and one that maps each id back to its label name:

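import itertools

# Collect the answer strings of every example, then flatten them into one list.
labels = [item['ids'] for item in dataset['label']]
flattened_labels = list(itertools.chain(*labels))
# Sort the unique labels so the label-to-id assignment is the same on every run.
unique_labels = sorted(set(flattened_labels))
label2id = {label: idx for idx, label in enumerate(unique_labels)}
id2label = {idx: label for label, idx in label2id.items()}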

Now that we have the mappings, we can replace the string answers with their ids and flatten the dataset to make further preprocessing more convenient.
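
The replacement and flattening steps can be sketched on plain Python dictionaries. This is a minimal illustration, not the tutorial's own code: `toy_rows`, `replace_ids`, and `flatten_row` are hypothetical names, and the row layout (a `label` dict holding string `ids` and numeric `weights`) is an assumption about the dataset schema.

```python
import itertools

# Hypothetical toy rows mimicking the assumed schema: each row carries a
# "label" dict with string answer "ids" and one weight per answer.
toy_rows = [
    {"question": "Where is he looking?",
     "label": {"ids": ["down", "at table"], "weights": [1.0, 0.3]}},
    {"question": "What color is the cat?",
     "label": {"ids": ["black"], "weights": [1.0]}},
]

# Build the label-to-id mapping from the sorted set of all answer strings.
all_answers = itertools.chain.from_iterable(row["label"]["ids"] for row in toy_rows)
unique_labels = sorted(set(all_answers))
label2id = {label: idx for idx, label in enumerate(unique_labels)}


def replace_ids(row):
    """Return a copy of the row with string answers replaced by integer ids."""
    label = dict(row["label"])
    label["ids"] = [label2id[answer] for answer in label["ids"]]
    return {**row, "label": label}


def flatten_row(row):
    """Lift the nested "label" fields to top-level "label.<field>" keys."""
    flat = {key: value for key, value in row.items() if key != "label"}
    for key, value in row["label"].items():
        flat[f"label.{key}"] = value
    return flat


flat_rows = [flatten_row(replace_ids(row)) for row in toy_rows]
print(flat_rows[0])
```

If `dataset` is a Hugging Face `datasets.Dataset`, the same idea would probably be expressed as `dataset.map(replace_ids)` followed by `dataset.flatten()`, which produces column names of the form `label.ids` and `label.weights`.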