dataset["train"].features

Here's what the individual fields represent:

* id: the example's id
* image: a PIL.Image.Image object containing the document image
* query: the question, asked in natural language and provided in several languages
* answers: a list of correct answers provided by human annotators
* words and bounding_boxes: the results of OCR, which we will not use here
* answer: an answer matched by a different model, which we will not use here

Let's keep only the English questions and drop the answer feature, which appears to contain predictions by another model.
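The cleanup described above can be sketched on a single record as follows. This is a minimal illustration, not the tutorial's own code: the helper name `keep_english_question`, the new `question` key, and the toy record are hypothetical, and it assumes that `query` is a dict keyed by language code (for example `"en"`).

```python
def keep_english_question(example):
    """Return a copy of one record with an English-only question and no `answer` field."""
    # Assumption: `query` maps language codes to question strings,
    # e.g. {"en": "...", "de": "..."}.
    cleaned = {key: value for key, value in example.items()
               if key not in ("query", "answer")}
    cleaned["question"] = example["query"]["en"]
    return cleaned


# Toy record with made-up values that mimics the fields listed above.
sample = {
    "id": 0,
    "image": None,  # placeholder for the PIL.Image.Image object
    "query": {"en": "What is the invoice date?", "de": "Wie lautet das Rechnungsdatum?"},
    "answers": ["2021-03-04"],
    "words": ["Invoice", "date"],
    "bounding_boxes": [[0, 0, 10, 10], [12, 0, 30, 10]],
    "answer": {"text": "2021-03-04"},
}

cleaned_sample = keep_english_question(sample)
print(sorted(cleaned_sample))
```

With the `datasets` library, the same function could probably be applied to every example through `Dataset.map`, passing `remove_columns=["query", "answer"]` so that the original columns are dropped from the result.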