	CamemBERT
	Overview
	The CamemBERT model was proposed in CamemBERT: a Tasty French Language Model by
	Louis Martin, Benjamin Muller, Pedro Javier Ortiz Suárez, Yoann Dupont, Laurent Romary, Éric Villemonte de la
	Clergerie, Djamé Seddah, and Benoît Sagot. It is based on Facebook's RoBERTa model, released in 2019, and was
	trained on 138GB of French text.
	The abstract from the paper is the following:
	Pretrained language models are now ubiquitous in Natural Language Processing. Despite their success, most available
	models have either been trained on English data or on the concatenation of data in multiple languages. This makes
	practical use of such models --in all languages except English-- very limited. Aiming to address this issue for French,
	we release CamemBERT, a French version of the Bi-directional Encoders for Transformers (BERT). We measure the
	performance of CamemBERT compared to multilingual models in multiple downstream tasks, namely part-of-speech tagging,
	dependency parsing, named-entity recognition, and natural language inference. CamemBERT improves the state of the art
	for most of the tasks considered. We release the pretrained model for CamemBERT hoping to foster research and
	downstream applications for French NLP.
	This model was contributed by the ALMAnaCH team (Inria). The original code can be found here.

	This implementation is the same as RoBERTa's. Refer to the RoBERTa documentation for usage examples and for
	details on the inputs and outputs.
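
Because the implementation mirrors RoBERTa, a forward pass follows the same pattern as for a RoBERTa model. The sketch below is only an illustration under stated assumptions: it builds a tiny, randomly initialized CamembertForMaskedLM from a CamembertConfig so that nothing has to be downloaded. The hyperparameters and token ids are made up for the example and are not those of any released checkpoint. In practice one would more likely load pretrained weights with from_pretrained.

```python
import torch
from transformers import CamembertConfig, CamembertForMaskedLM

# Assumption: deliberately tiny, hypothetical hyperparameters so the model
# builds quickly. These are NOT the values of any released CamemBERT checkpoint.
config = CamembertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
)

# Randomly initialized weights: the predictions are meaningless, only the
# input/output interface is being illustrated.
model = CamembertForMaskedLM(config)
model.eval()

# Made-up token ids standing in for a tokenized sentence (batch of 1, 5 tokens).
input_ids = torch.tensor([[5, 10, 11, 12, 6]])
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    outputs = model(input_ids=input_ids, attention_mask=attention_mask)

# Logits have shape (batch size, sequence length, vocab_size).
logits_shape = tuple(outputs.logits.shape)
print(logits_shape)
```

The same call signature (input_ids, attention_mask) applies to the other task heads listed further down; only the shape and meaning of the returned logits change.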

	Resources

	Text classification task guide
	Token classification task guide
	Question answering task guide
	Causal language modeling task guide
	Masked language modeling task guide
	Multiple choice task guide

	CamembertConfig
	[[autodoc]] CamembertConfig
	CamembertTokenizer
	[[autodoc]] CamembertTokenizer
	- build_inputs_with_special_tokens
	- get_special_tokens_mask
	- create_token_type_ids_from_sequences
	- save_vocabulary
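
The tokenizer methods listed above follow the RoBERTa-style input layout: a single sequence is wrapped as `<s> X </s>`, a pair as `<s> A </s></s> B </s>`, and token type ids are all zeros because CamemBERT does not use them. The following pure-Python sketch mimics that layout for illustration only. The function names and the ids chosen for `<s>` and `</s>` are hypothetical placeholders, not the library's code or guaranteed vocabulary values.

```python
from typing import List, Optional

# Assumption: placeholder ids for "<s>" and "</s>". The real ids come from
# the tokenizer's vocabulary and should be read from the tokenizer itself.
CLS_ID = 5
SEP_ID = 6


def add_special_tokens(ids_a: List[int], ids_b: Optional[List[int]] = None) -> List[int]:
    """Wrap one sequence as <s> A </s>, or a pair as <s> A </s></s> B </s>."""
    if ids_b is None:
        return [CLS_ID] + ids_a + [SEP_ID]
    return [CLS_ID] + ids_a + [SEP_ID, SEP_ID] + ids_b + [SEP_ID]


def special_tokens_mask(ids_a: List[int], ids_b: Optional[List[int]] = None) -> List[int]:
    """Return 1 at special-token positions and 0 at sequence-token positions."""
    if ids_b is None:
        return [1] + [0] * len(ids_a) + [1]
    return [1] + [0] * len(ids_a) + [1, 1] + [0] * len(ids_b) + [1]


def token_type_ids(ids_a: List[int], ids_b: Optional[List[int]] = None) -> List[int]:
    """Token type ids are unused, so every position is 0."""
    return [0] * len(add_special_tokens(ids_a, ids_b))


single = add_special_tokens([10, 11])
pair = add_special_tokens([10, 11], [20])
pair_mask = special_tokens_mask([10, 11], [20])
pair_types = token_type_ids([10, 11], [20])
print(single, pair, pair_mask, pair_types)
```

Note the doubled `</s>` between the two segments of a pair, which is what distinguishes this layout from BERT's single `[SEP]` separator.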
	CamembertTokenizerFast
	[[autodoc]] CamembertTokenizerFast

	CamembertModel
	[[autodoc]] CamembertModel
	CamembertForCausalLM
	[[autodoc]] CamembertForCausalLM
	CamembertForMaskedLM
	[[autodoc]] CamembertForMaskedLM
	CamembertForSequenceClassification
	[[autodoc]] CamembertForSequenceClassification
	CamembertForMultipleChoice
	[[autodoc]] CamembertForMultipleChoice
	CamembertForTokenClassification
	[[autodoc]] CamembertForTokenClassification
	CamembertForQuestionAnswering
	[[autodoc]] CamembertForQuestionAnswering

	TFCamembertModel
	[[autodoc]] TFCamembertModel
	TFCamembertForCausalLM
	[[autodoc]] TFCamembertForCausalLM
	TFCamembertForMaskedLM
	[[autodoc]] TFCamembertForMaskedLM
	TFCamembertForSequenceClassification
	[[autodoc]] TFCamembertForSequenceClassification
	TFCamembertForMultipleChoice
	[[autodoc]] TFCamembertForMultipleChoice
	TFCamembertForTokenClassification
	[[autodoc]] TFCamembertForTokenClassification
	TFCamembertForQuestionAnswering
	[[autodoc]] TFCamembertForQuestionAnswering