You can check whether you require the uroman package for your language by inspecting the is_uroman attribute of 
the pre-trained tokenizer:
thon
from transformers import VitsTokenizer
tokenizer = VitsTokenizer.from_pretrained("facebook/mms-tts-eng")
print(tokenizer.is_uroman)

If required, you should apply the uroman package to your text inputs prior to passing them to the VitsTokenizer, 
since currently the tokenizer does not support performing the pre-processing itself.