Corrupts the inputs by using random masking: during pretraining, a given percentage of tokens (usually 15%) is selected and replaced by one of the following (see the sketch after this list):
- a special mask token with probability 0.8
- a random token different from the masked one with probability 0.1
- the same token, left unchanged, with probability 0.1
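A minimal sketch of this corruption step, assuming plain Python lists of string tokens and a hypothetical `MASK_TOKEN` placeholder (real implementations work on token-ID tensors in batches):

```python
import random

MASK_TOKEN = "[MASK]"  # hypothetical stand-in for the tokenizer's mask token

def corrupt_tokens(tokens, vocab, mask_prob=0.15):
    """Apply BERT-style random masking to a token sequence.

    Each token is selected with probability `mask_prob` (15% here). A
    selected token is replaced by the mask token with probability 0.8,
    by a random different token with probability 0.1, and kept as-is
    with probability 0.1. Returns the corrupted sequence and the
    positions whose original token the model must predict.
    """
    corrupted = list(tokens)
    targets = []
    for i, token in enumerate(tokens):
        if random.random() < mask_prob:
            targets.append(i)
            roll = random.random()
            if roll < 0.8:
                corrupted[i] = MASK_TOKEN
            elif roll < 0.9:
                # random token different from the one being masked
                # (assumes the vocabulary has more than one token)
                corrupted[i] = random.choice([t for t in vocab if t != token])
            # else: keep the original token (probability 0.1)
    return corrupted, targets
```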
The model must then predict the original tokens at the masked positions, but it also has a second objective: the inputs are two sentences A and B (with a separation token in between), and in 50% of cases B actually follows A in the corpus; the model has to predict whether the two sentences are consecutive (next sentence prediction).
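A similarly hedged sketch of how such sentence pairs could be built, assuming `corpus` is an in-order list of sentence strings and using the conventional `[CLS]`/`[SEP]` markers:

```python
import random

def make_nsp_example(corpus, index):
    """Build one next-sentence-prediction example.

    With probability 0.5, sentence B is the sentence that actually
    follows sentence A in the corpus (label 1); otherwise B is drawn
    at random from elsewhere in the corpus (label 0).
    """
    sentence_a = corpus[index]
    is_next = random.random() < 0.5 and index + 1 < len(corpus)
    if is_next:
        sentence_b = corpus[index + 1]
    else:
        # draw a random sentence, avoiding the true next sentence
        other = random.randrange(len(corpus))
        while other == index + 1:
            other = random.randrange(len(corpus))
        sentence_b = corpus[other]
    return f"[CLS] {sentence_a} [SEP] {sentence_b} [SEP]", int(is_next)
```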