The abstract from the Phi-1.5 paper is the following:

We continue the investigation into the power of smaller Transformer-based language models as initiated by TinyStories – a 10 million parameter model that can produce coherent English – and the follow-up work on phi-1, a 1.3 billion parameter model with Python coding performance close to the state-of-the-art.