Checkpoint names follow the format Salesforce/codegen-{size}-{data}, where
size: 350M, 2B, 6B, or 16B (the parameter count)
data: one of
  nl: Pre-trained on the Pile
  multi: Initialized from nl, then further pre-trained on data from multiple programming languages
  mono: Initialized from multi, then further pre-trained on Python data
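
The naming scheme above can be sketched as a small helper that assembles and validates a checkpoint name. This is a minimal illustration, not part of any library: the function name codegen_checkpoint_name and the constants VALID_SIZES and VALID_DATA are hypothetical, and the sketch assumes every size/data combination listed above is published.

```python
# Hypothetical helper: these names are illustrative and not part of any library.
VALID_SIZES = ("350M", "2B", "6B", "16B")
VALID_DATA = ("nl", "multi", "mono")


def codegen_checkpoint_name(size: str, data: str) -> str:
    """Build a name of the form Salesforce/codegen-{size}-{data}."""
    if size not in VALID_SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {VALID_SIZES}")
    if data not in VALID_DATA:
        raise ValueError(f"unknown data {data!r}; expected one of {VALID_DATA}")
    return f"Salesforce/codegen-{size}-{data}"


# Enumerate every name the scheme allows (assumption: all combinations exist).
ALL_CHECKPOINTS = [
    codegen_checkpoint_name(size, data)
    for size in VALID_SIZES
    for data in VALID_DATA
]
```

The resulting string is the identifier one would typically hand to a model loader; that step is left out here because it downloads the weights.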

For example, Salesforce/codegen-350M-mono is a 350-million-parameter checkpoint pre-trained sequentially on the Pile, then on multiple programming languages, and finally on Python.
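
Because each data variant is initialized from the previous one, the training history of a checkpoint can be read off its name. The sketch below shows one way to do that; the function training_stages and the table STAGES are hypothetical names introduced only for illustration, and the stage descriptions are taken from the list above.

```python
from typing import List

# Hypothetical lookup: each data suffix maps to its cumulative pre-training stages.
STAGES = {
    "nl": ["the Pile"],
    "multi": ["the Pile", "multiple programming languages"],
    "mono": ["the Pile", "multiple programming languages", "Python"],
}

PREFIX = "Salesforce/codegen-"


def training_stages(checkpoint: str) -> List[str]:
    """Return the ordered pre-training stages implied by a checkpoint name."""
    if not checkpoint.startswith(PREFIX):
        raise ValueError(f"not a CodeGen checkpoint name: {checkpoint!r}")
    # The remainder has the shape "{size}-{data}", e.g. "350M-mono".
    _size, sep, data = checkpoint[len(PREFIX):].partition("-")
    if not sep or data not in STAGES:
        raise ValueError(f"unrecognized data suffix in {checkpoint!r}")
    return list(STAGES[data])
```

Calling training_stages("Salesforce/codegen-350M-mono") yields the three stages named in the example sentence, in order.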