To preprocess several sentences at once, pass them to the tokenizer as a list:

batch_sentences = [
    "But what about second breakfast?
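
To illustrate only the calling convention described above (a list in, one encoding per sentence out), here is a minimal sketch using a toy stand-in. `ToyTokenizer`, its whitespace splitting, its ID assignment, and the second sentence in `toy_batch` are all hypothetical and are not the library's tokenizer or the original example data; a real tokenizer uses a trained vocabulary and typically returns additional fields.

```python
from typing import Dict, List, Union


class ToyTokenizer:
    """Hypothetical whitespace tokenizer for illustration; NOT the library's tokenizer."""

    def __init__(self) -> None:
        # Maps each token string to an integer ID, built up on the fly.
        self.vocab: Dict[str, int] = {}

    def _encode(self, sentence: str) -> List[int]:
        # A token seen for the first time gets the next free ID.
        return [self.vocab.setdefault(tok, len(self.vocab)) for tok in sentence.split()]

    def __call__(self, text: Union[str, List[str]]) -> Dict[str, list]:
        # A single string yields one list of IDs.
        if isinstance(text, str):
            return {"input_ids": self._encode(text)}
        # A list of strings yields one list of IDs per sentence, in order.
        return {"input_ids": [self._encode(s) for s in text]}


toy_tokenizer = ToyTokenizer()
toy_batch = [
    "But what about second breakfast?",
    "what about lunch?",  # placeholder sentence, not from the original example
]
encoded = toy_tokenizer(toy_batch)
print(encoded["input_ids"])
```

The point of the sketch is the shape of the result: passing a list returns a batch whose entries line up one-to-one with the input sentences, whereas passing a single string returns a single encoding.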