We carefully characterize the trade-offs in terms of parameter count,
training FLOPs, and inference speed, and show that byte-level models are competitive with their token-level
counterparts.