tokens = tokenizer(prompt, return_tensors="pt").to("cuda") use the model to generate new tokens.