# mT5

## Overview

The mT5 model was presented in [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.

The abstract from the paper is the following:

*The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We detail the design and modified training of mT5 and demonstrate its state-of-the-art performance on many multilingual benchmarks. We also describe a simple technique to prevent "accidental translation" in the zero-shot setting, where a generative model chooses to (partially) translate its prediction into the wrong language. All of the code and model checkpoints used in this work are publicly available.*

Note: mT5 was only pre-trained on mC4, with no supervised training. As a result, this model has to be fine-tuned before it is usable on a downstream task, unlike the original T5 model. Since mT5 was pre-trained without supervision, there is no real advantage to using a task prefix during single-task fine-tuning. If you are doing multi-task fine-tuning, you should use a prefix, as in the sketch below.

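A minimal fine-tuning sketch; the `"summarize: "` prefix is an illustrative choice for a multi-task setup, not a string mT5 was trained on:

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

# In a multi-task setup, prepend a task prefix of your choosing to each input;
# for single-task fine-tuning the prefix can simply be omitted.
inputs = tokenizer(
    "summarize: Studies have shown that owning a dog is good for you.",
    return_tensors="pt",
)
labels = tokenizer(
    text_target="Owning a dog is good for you.", return_tensors="pt"
).input_ids

# One forward pass of a training step; backpropagate outputs.loss in a loop.
outputs = model(**inputs, labels=labels)
print(outputs.loss)
```
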
Google has released the following variants:

- [google/mt5-small](https://huggingface.co/google/mt5-small)
- [google/mt5-base](https://huggingface.co/google/mt5-base)
- [google/mt5-large](https://huggingface.co/google/mt5-large)
- [google/mt5-xl](https://huggingface.co/google/mt5-xl)
- [google/mt5-xxl](https://huggingface.co/google/mt5-xxl)

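Because these checkpoints only saw the unsupervised span-corruption objective, the raw models can only fill in sentinel-marked spans. A minimal sketch of that objective, using the same sentinel tokens (`<extra_id_0>`, `<extra_id_1>`, ...) as T5:

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

# The input hides spans behind sentinel tokens; the target spells out the
# hidden spans, each introduced by its sentinel.
input_ids = tokenizer(
    "The <extra_id_0> walks in <extra_id_1> park", return_tensors="pt"
).input_ids
labels = tokenizer(
    "<extra_id_0> cute dog <extra_id_1> the <extra_id_2>", return_tensors="pt"
).input_ids

loss = model(input_ids=input_ids, labels=labels).loss
print(loss)
```
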
This model was contributed by [patrickvonplaten](https://huggingface.co/patrickvonplaten). The original code can be found [here](https://github.com/google-research/multilingual-t5).

## Resources

- [Translation task guide](../tasks/translation)
- [Summarization task guide](../tasks/summarization)

## MT5Config

[[autodoc]] MT5Config

## MT5Tokenizer

[[autodoc]] MT5Tokenizer

See [`T5Tokenizer`] for all details.

## MT5TokenizerFast

[[autodoc]] MT5TokenizerFast

See [`T5TokenizerFast`] for all details.

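Both tokenizers share a single large SentencePiece vocabulary (roughly 250k entries) covering all 101 pre-training languages, so the same tokenizer handles any input language. A short sketch:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")

# The same vocabulary segments text in any of the covered languages.
print(tokenizer.tokenize("The quick brown fox"))
print(tokenizer.tokenize("Der schnelle braune Fuchs"))
```
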
## MT5Model

[[autodoc]] MT5Model

## MT5ForConditionalGeneration

[[autodoc]] MT5ForConditionalGeneration

## MT5EncoderModel

[[autodoc]] MT5EncoderModel

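`MT5EncoderModel` exposes only the encoder stack, which is useful for extracting token-level representations. A minimal sketch:

```python
import torch
from transformers import AutoTokenizer, MT5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5EncoderModel.from_pretrained("google/mt5-small")

inputs = tokenizer("Hello, world!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Encoder hidden states of shape (batch_size, sequence_length, d_model).
print(outputs.last_hidden_state.shape)
```
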
## MT5ForSequenceClassification

[[autodoc]] MT5ForSequenceClassification

## MT5ForTokenClassification

[[autodoc]] MT5ForTokenClassification

## MT5ForQuestionAnswering

[[autodoc]] MT5ForQuestionAnswering

## TFMT5Model

[[autodoc]] TFMT5Model

## TFMT5ForConditionalGeneration

[[autodoc]] TFMT5ForConditionalGeneration

## TFMT5EncoderModel

[[autodoc]] TFMT5EncoderModel

## FlaxMT5Model

[[autodoc]] FlaxMT5Model

## FlaxMT5ForConditionalGeneration

[[autodoc]] FlaxMT5ForConditionalGeneration

## FlaxMT5EncoderModel

[[autodoc]] FlaxMT5EncoderModel