Here we have the loss since we passed along labels, but we don't have hidden_states and attentions because we didn't pass output_hidden_states=True or output_attentions=True.