Tokenizer is lowercase
#9
by
nshmyrevgmail
- opened
By default BertTokenizer applies lowercase unless you add corresponding tokenizer_config.json. The vocab is mixed case in this model, so it doesn't seem the model is lowercase only.