
	CPM
	Overview
	The CPM model was proposed in "CPM: A Large-scale Generative Chinese Pre-trained Language Model" by Zhengyan Zhang,
	Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang,
	Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han,
	Jie Tang, Juanzi Li, Xiaoyan Zhu, and Maosong Sun.
	The abstract from the paper is the following:
	Pre-trained Language Models (PLMs) have proven to be beneficial for various downstream NLP tasks. Recently, GPT-3,
	with 175 billion parameters and 570GB training data, drew a lot of attention due to the capacity of few-shot (even
	zero-shot) learning. However, applying GPT-3 to address Chinese NLP tasks is still challenging, as the training corpus
	of GPT-3 is primarily English, and the parameters are not publicly available. In this technical report, we release the
	Chinese Pre-trained Language Model (CPM) with generative pre-training on large-scale Chinese training data. To the best
	of our knowledge, CPM, with 2.6 billion parameters and 100GB Chinese training data, is the largest Chinese pre-trained
	language model, which could facilitate several downstream Chinese NLP tasks, such as conversation, essay generation,
	cloze test, and language understanding. Extensive experiments demonstrate that CPM achieves strong performance on many
	NLP tasks in the settings of few-shot (even zero-shot) learning.
	This model was contributed by canwenxu. The original implementation is available at
	https://github.com/TsinghuaAI/CPM-Generate

	CPM's architecture is the same as GPT-2's; only the tokenization method differs. Refer to the GPT-2 documentation
	for API reference information on the model classes.
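Since tokenization is the only part that differs from GPT-2, a rough sketch of that step may help. The code below is an illustration under stated assumptions, not the library's code: it assumes the tokenizer first segments text into words, swaps spaces and newlines for the placeholder characters U+2582 and U+2583, and joins the words with spaces before the subword model runs, then reverses this after decoding. The names `pre_tokenize`, `post_decode`, and `char_segment` are hypothetical, and `char_segment` is a toy stand-in for a real Chinese word segmenter such as jieba.

```python
# Assumed placeholders: space -> U+2582, newline -> U+2583.
_TRANSLATOR = str.maketrans(" \n", "\u2582\u2583")


def char_segment(text):
    """Toy segmenter (hypothetical): treat every character as one word."""
    return list(text)


def pre_tokenize(text, segment=char_segment):
    """Segment text into words, protect whitespace, join words with spaces."""
    return " ".join(word.translate(_TRANSLATOR) for word in segment(text))


def post_decode(text):
    """Undo pre_tokenize: drop joiner spaces, then restore real whitespace."""
    return text.replace(" ", "").replace("\u2582", " ").replace("\u2583", "\n")


if __name__ == "__main__":
    original = "你好 世界\n"
    prepared = pre_tokenize(original)
    print(repr(prepared))
    print(post_decode(prepared) == original)
```

The placeholders let the joiner spaces be removed after decoding without losing the spaces and newlines that were in the input. With the real tokenizer classes, this step happens internally and the model itself is loaded through the GPT-2 classes.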

	CpmTokenizer
	[[autodoc]] CpmTokenizer
	CpmTokenizerFast
	[[autodoc]] CpmTokenizerFast