These models also tend to have a large number of learnable parameters (e.g.