Remove any columns you don't need:
tokenized_eli5 = eli5.map(
    preprocess_function,
    batched=True,
    num_proc=4,
    remove_columns=eli5["train"].column_names,
)
The dataset now contains token sequences, but some of them are longer than the model's maximum input length.
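A common way to handle this is to concatenate all the tokenized sequences and re-split them into fixed-size chunks that fit within the model's context window. The sketch below illustrates this under stated assumptions: the block_size value and the group_texts name are illustrative and not part of the original example, and block_size should be no larger than your model's maximum input length.

# Minimal sketch of a grouping step; block_size is an assumed value for illustration.
block_size = 128

def group_texts(examples):
    # Concatenate every field (input_ids, attention_mask, ...) across the batch.
    concatenated = {k: sum(examples[k], []) for k in examples.keys()}
    total_length = len(concatenated[list(examples.keys())[0]])
    # Drop the trailing remainder so every chunk is exactly block_size tokens long.
    total_length = (total_length // block_size) * block_size
    # Re-split the concatenated sequences into chunks of block_size.
    return {
        k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
        for k, t in concatenated.items()
    }

lm_dataset = tokenized_eli5.map(group_texts, batched=True, num_proc=4)

After this step, every example is exactly block_size tokens long, so no sequence exceeds the model's input limit.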