Run inference with decoder-only models using the `text-generation` pipeline:

```python
from transformers import pipeline
import torch

torch.manual_seed(0)  # doctest: +IGNORE_RESULT

generator = pipeline('text-generation', model='openai-community/gpt2')
prompt = "Hello, I'm a language model"

generator(prompt, max_length=30)
[{'generated_text': "Hello, I'm a language model expert, so I'm a big believer in the concept that I know very well and then I try to look into"}]
```

To run inference with an encoder-decoder model, use the `text2text-generation` pipeline:

```python
text2text_generator = pipeline("text2text-generation", model='google/flan-t5-base')
prompt = "Translate from English to French: I'm very happy to see you"

text2text_generator(prompt)
[{'generated_text': 'Je suis très heureuse de vous rencontrer.'}]
```
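Both pipelines also accept a list of prompts and forward generation keyword arguments such as `max_new_tokens` and `do_sample` to the underlying model. A minimal sketch, reusing the `openai-community/gpt2` checkpoint from above; the prompts and parameter values here are illustrative:

```python
from transformers import pipeline
import torch

torch.manual_seed(0)

generator = pipeline('text-generation', model='openai-community/gpt2')

# Passing a list of prompts returns one list of candidate
# generations per prompt; generation kwargs are forwarded
# to the model's generate() call.
prompts = ["Hello, I'm a language model", "The weather today is"]
outputs = generator(prompts, max_new_tokens=20, do_sample=True)

for result in outputs:
    print(result[0]['generated_text'])
```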