5fa1a76
1
2
This means, for example, Perceiver IO can do BERT-style masked language modeling directly using bytes instead of tokenized inputs.