	Preprocess

	For masked language modeling, the next step is to load a DistilRoBERTa tokenizer to process the text subfield:

	from transformers import AutoTokenizer
	tokenizer = AutoTokenizer.from_pretrained("distilbert/distilroberta-base")

	As the example above shows, the text field is actually nested inside answers.
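
Because the text is nested, it usually has to be pulled out into its own column before tokenizing. In the `datasets` library, `Dataset.flatten()` does this and names the resulting column `answers.text`. The sketch below mimics that behavior in plain Python on a single record, so it runs without downloading anything. The record contents and the `flatten_record` helper are hypothetical and only serve as illustration.

```python
# Hypothetical record shaped like the nested example: "text" sits under "answers".
record = {
    "title": "Example question",
    "answers": {"text": ["First answer.", "Second answer."], "score": [3, 1]},
}


def flatten_record(rec, prefix=""):
    """Flatten nested dicts into dotted keys, e.g. answers -> answers.text."""
    flat = {}
    for key, value in rec.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into the sub-dict, extending the dotted prefix.
            flat.update(flatten_record(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


flat = flatten_record(record)

# The subfield is a list of strings, so join it into one string
# before handing it to a tokenizer.
joined_text = " ".join(flat["answers.text"])
print(sorted(flat))
print(joined_text)
```

With the tokenizer loaded earlier, `tokenizer(joined_text)` would then encode the joined string. That call is left out here because loading the tokenizer requires fetching the model files.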