If you want to use Pix2Struct for image captioning, you should use the model fine tuned on the natural images captioning dataset and so on. |
If you want to use Pix2Struct for image captioning, you should use the model fine tuned on the natural images captioning dataset and so on. |