", return_tensors="pt").to(0)
predictions = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(predictions[0], skip_special_tokens=True))

Fine-tuning
MatCha uses the Pix2Struct architecture, so to fine-tune it, refer to the Pix2Struct fine-tuning notebook.
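As a rough illustration of what an optimizer setup for such a fine-tuning run might look like, the sketch below pairs Adafactor with a cosine learning-rate schedule, a combination often used for Pix2Struct-style models. It is not taken from the notebook: the helper name `build_optimizer_and_scheduler` is hypothetical, the hyperparameter defaults are illustrative assumptions, and a tiny `nn.Linear` stands in for the real model so the snippet runs without downloading a checkpoint.

```python
import torch
from torch import nn
from transformers.optimization import Adafactor, get_cosine_schedule_with_warmup


def build_optimizer_and_scheduler(model, lr=0.01, warmup_steps=1000, total_steps=40000):
    """Hypothetical helper: Adafactor with linear warmup and cosine decay.

    The default values are illustrative assumptions, not tuned settings.
    """
    # Adafactor raises an error if an explicit lr is combined with
    # relative_step=True, so relative_step is turned off here.
    optimizer = Adafactor(
        model.parameters(),
        lr=lr,
        scale_parameter=False,
        relative_step=False,
        weight_decay=1e-5,
    )
    scheduler = get_cosine_schedule_with_warmup(
        optimizer,
        num_warmup_steps=warmup_steps,
        num_training_steps=total_steps,
    )
    return optimizer, scheduler


# Stand-in for the MatCha model (assumption: any torch.nn.Module works here).
model = nn.Linear(4, 2)
optimizer, scheduler = build_optimizer_and_scheduler(model, warmup_steps=2, total_steps=10)

# One training step on random data, in the usual order:
# backward pass, optimizer update, scheduler update, gradient reset.
loss = model(torch.randn(3, 4)).pow(2).mean()
loss.backward()
optimizer.step()
scheduler.step()
optimizer.zero_grad()
```

In an actual run, the loaded MatCha model would replace the stand-in, and the loss would come from a forward pass that includes labels rather than from random data.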