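	prefix = "summarize: "

	# `tokenizer` is assumed to have been loaded earlier.
	def preprocess_function(examples):
	    # Prepend the task prefix to every document in the batch.
	    inputs = [prefix + doc for doc in examples["text"]]
	    model_inputs = tokenizer(inputs, max_length=1024, truncation=True)

	    # Tokenize the target summaries and attach their ids as labels.
	    labels = tokenizer(text_target=examples["summary"], max_length=128, truncation=True)
	    model_inputs["labels"] = labels["input_ids"]
	    return model_inputs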

	To apply the preprocessing function over the entire dataset, use the 🤗 Datasets Dataset.map method.
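Because preprocess_function treats examples["text"] as a list, it expects a batch of rows rather than a single row, which is what Dataset.map passes when called with batched=True. The following is a minimal plain-Python sketch of that calling convention, not the library itself: toy_tokenizer and map_batched are hypothetical stand-ins for the real tokenizer and for Dataset.map, and the two sample rows are made up for illustration.

```python
# Plain-Python sketch of the batched calling convention. toy_tokenizer and
# map_batched are hypothetical stand-ins, not part of transformers or datasets.

prefix = "summarize: "


def toy_tokenizer(texts=None, text_target=None, max_length=None, truncation=False):
    """Whitespace 'tokenizer': uses each word's length as a fake token id."""
    batch = texts if texts is not None else text_target
    ids = [[len(word) for word in text.split()] for text in batch]
    if truncation and max_length is not None:
        ids = [seq[:max_length] for seq in ids]
    return {"input_ids": ids}


def preprocess_function(examples):
    # Same shape as the function above, with the stand-in tokenizer swapped in.
    inputs = [prefix + doc for doc in examples["text"]]
    model_inputs = toy_tokenizer(inputs, max_length=1024, truncation=True)
    labels = toy_tokenizer(text_target=examples["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs


def map_batched(columns, function, batch_size=1000):
    """Call `function` on dict-of-lists slices and collect the columns it returns."""
    num_rows = len(next(iter(columns.values())))
    output = {}
    for start in range(0, num_rows, batch_size):
        batch = {name: values[start:start + batch_size] for name, values in columns.items()}
        for name, values in function(batch).items():
            output.setdefault(name, []).extend(values)
    return output


# Made-up sample rows, stored column-wise as a batch would be.
columns = {
    "text": ["the bill amends the tax code", "a short document"],
    "summary": ["tax bill", "short"],
}
tokenized = map_batched(columns, preprocess_function, batch_size=1)
print(tokenized["input_ids"])
print(tokenized["labels"])
```

With the real library, and assuming a loaded dataset object named dataset, the equivalent call would be dataset.map(preprocess_function, batched=True). Unlike this stand-in, the real method also keeps the existing columns unless told to remove them.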