To overcome this limitation, you can either explicitly pass the desired dtype using torch_dtype argument: python model = T5ForConditionalGeneration.from_pretrained("t5", torch_dtype=torch.float16) or, if you want the model to always load in the most optimal memory pattern, you can use the special value "auto", and then dtype will be automatically derived from the model's weights: python model = T5ForConditionalGeneration.from_pretrained("t5", torch_dtype="auto") Models instantiated from scratch can also be told which dtype to use with: python config = T5Config.from_pretrained("t5") model = AutoModel.from_config(config) Due to Pytorch design, this functionality is only available for floating dtypes.