pietrolesci/tokenisers
pietrolesci committed on Feb 27 · Commit a4b467c · verified · 1 parent: cacd0e5
Create README.md
Files changed (1)
README.md (added, +1 -0)
@@ -0,0 +1 @@
+Tokenisers trained on the MiniPile dataset. The `_raw_tokenisers` folder contains the original tokenisers, trained with a vocabulary size of 320k. Each of the other folders contains a `transformers`-compatible tokeniser with a smaller vocabulary.
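
The layout described in the README suggests loading one of the smaller tokenisers by pointing `transformers` at its folder. The following is a minimal sketch, not the author's documented workflow. It assumes the smaller tokenisers sit in top-level folders next to `_raw_tokenisers`; the helper names (`tokeniser_folders`, `list_available`, `load_tokeniser`) are hypothetical, and the actual folder names are not given in the README, so they are discovered at run time rather than hard-coded.

```python
from typing import Iterable

REPO_ID = "pietrolesci/tokenisers"  # repository name as shown on the commit page
RAW_FOLDER = "_raw_tokenisers"      # holds the original 320k-vocabulary tokenisers


def tokeniser_folders(repo_files: Iterable[str]) -> list[str]:
    """Return the sorted top-level folders other than RAW_FOLDER.

    Assumption: every remaining top-level folder is one of the smaller,
    `transformers`-compatible tokenisers mentioned in the README.
    """
    folders = set()
    for path in repo_files:
        head, sep, _ = path.partition("/")
        # Files at the repository root (e.g. README.md) contain no "/".
        if sep and head != RAW_FOLDER:
            folders.add(head)
    return sorted(folders)


def list_available() -> list[str]:
    """List candidate tokeniser folders in the repository (needs network access)."""
    from huggingface_hub import list_repo_files  # lazy import: optional dependency
    return tokeniser_folders(list_repo_files(REPO_ID))


def load_tokeniser(folder: str):
    """Load the tokeniser stored in one sub-folder of the repository."""
    from transformers import AutoTokenizer  # lazy import: optional dependency
    return AutoTokenizer.from_pretrained(REPO_ID, subfolder=folder)
```

With network access, `load_tokeniser(list_available()[0])` would fetch the first such folder, provided the repository really follows the assumed layout.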