updated_dataset = updated_dataset.filter(lambda x: len(x["words"]) + len(x["question"].split()) < 512)
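To make the condition in that filter easier to inspect, here is a minimal sketch that pulls the lambda out into a named predicate and runs it on toy records in plain Python. The function name `fits_length_budget`, the `max_length` parameter, and the toy examples are illustrative assumptions, not part of the original dataset or code. Note that the check counts OCR words and whitespace-split question words, not tokenizer tokens, so it is only a rough proxy for the final sequence length.

```python
def fits_length_budget(example, max_length=512):
    """Return True when the OCR word count plus the question word count is below max_length.

    Hypothetical helper: same condition as the lambda passed to `filter`,
    with the limit exposed as a parameter.
    """
    return len(example["words"]) + len(example["question"].split()) < max_length


# Toy records (made up for illustration) with the two fields the filter reads.
toy_examples = [
    {"question": "What is the total?", "words": ["Total", "42"]},
    {"question": "Who signed it?", "words": ["word"] * 600},
]

# Keep only the records that pass the length check.
kept = [ex for ex in toy_examples if fits_length_budget(ex)]
print(len(kept))
```

A predicate written this way could be passed directly to the call above, as in `updated_dataset.filter(fits_length_budget)`, since `filter` accepts any callable that takes an example and returns a boolean.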
At this point let's also remove the OCR features from this dataset.