File size: 475 Bytes
5fa1a76 |
1 2 3 4 5 6 7 8 9 10 |
).convert("RGB") encoding = processor( image, return_tensors="pt" ) # you can also add all tokenizer parameters here such as padding, truncation print(encoding.keys()) dict_keys(['input_ids', 'token_type_ids', 'attention_mask', 'bbox', 'image']) Use case 2: document image classification (training, inference) + token classification (inference), apply_ocr=False In case one wants to do OCR themselves, one can initialize the image processor with apply_ocr set to False. |