Pretraining methods use a self-supervised objective, such as reading a text and predicting the next word (see causal language modeling) or masking some words and predicting them (see masked language modeling).
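
The sketch below illustrates the two objectives at inference time using the `transformers` `pipeline` API; the `gpt2` (causal) and `bert-base-uncased` (masked) checkpoints are illustrative choices, not the only options.

```python
from transformers import pipeline

# Causal language modeling: the model reads left-to-right and predicts
# the next word given everything before it.
causal_lm = pipeline("text-generation", model="gpt2")
print(causal_lm("The capital of France is", max_new_tokens=5)[0]["generated_text"])

# Masked language modeling: a word is hidden and the model predicts it
# from the surrounding context (both left and right).
masked_lm = pipeline("fill-mask", model="bert-base-uncased")
for prediction in masked_lm("The capital of France is [MASK]."):
    print(prediction["token_str"], prediction["score"])
```

During pretraining, the same predictions are scored with a cross-entropy loss against the true tokens, so no manually labeled data is needed.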