Usage tips

To load GPT-J in float32 you need at least 2x the model size in RAM: 1x for the initial weights and another 1x to load the checkpoint. For the 6B-parameter model that comes to roughly 48 GB of RAM (about 24 GB for the float32 weights, twice over) just to load the model.
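
One way to cut this overhead is to load the weights in half precision. The snippet below is a minimal sketch, assuming the `EleutherAI/gpt-j-6B` checkpoint on the Hugging Face Hub and a recent version of `transformers`; exact argument support may vary by version.

```python
# Minimal sketch: load GPT-J in fp16 to roughly halve the memory footprint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    torch_dtype=torch.float16,   # initialize weights in half precision instead of float32
    low_cpu_mem_usage=True,      # avoid materializing a second full copy while loading
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
```

If the checkpoint repository provides a dedicated half-precision branch, passing `revision="float16"` to `from_pretrained` downloads those weights directly, which also reduces the download size.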