hamishivi/tess2-v0.1-symbolic

hamishivi committed on Jan 8

Commit e2d48e4 · verified · 1 Parent(s): 1274943

Create README.md

Files changed (1)

README.md +28 -0

README.md ADDED

	@@ -0,0 +1,28 @@

+---
+license: apache-2.0
+datasets:
+- hamishivi/gsm8k-symbolic
+language:
+- en
+base_model:
+- hamishivi/tess2_base
+---
+# TESS 2 - A Generalist Instruction Tuned Diffusion LM
+This is the TESS 2 model trained on GSM8k symbolic data, available [here](https://huggingface.co/datasets/hamishivi/gsm8k-symbolic) and adapted from [here](https://github.com/HKUNLP/diffusion-of-thoughts). It is a simplex-based diffusion model adapted from Mistral v0.1 7B and further trained on Dolma 1.7 and Tulu 2 SFT data.
+For more details, please check out our paper [TESS-2: A Large-Scale, Generalist Diffusion Language Model](https://todo).
+This model only works with our custom codebase, found [here](https://github.com/armancohan/simplex-diffusion). Please see that repository for details on how to run training and inference.
+## Using this model
+To run this model, first clone https://github.com/armancohan/simplex-diffusion.
+Then, after creating a Python environment with the required packages, you can run inference through a UI with:
+```sh
+./shell_scripts/run_interactive_demo.sh hamishivi/tess2
+```
+This lets you interact with the model directly and shows the diffusion generation process.
+For training or other evaluations, please see our main repository.
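
The steps under "Using this model" can be collected into one script. This is a minimal sketch, not the authors' official setup: the repository URL, the demo script path, and the model ID are taken from the README, while the `.venv` location, the variable names, and the `run_demo` helper are hypothetical. The dependency installation step is left as a comment because the README does not say how packages are installed.

```sh
#!/bin/sh
# Sketch only: the README's setup steps gathered into one helper function.
set -e

REPO_URL="https://github.com/armancohan/simplex-diffusion"  # from the README
REPO_DIR="${REPO_DIR:-simplex-diffusion}"                   # default clone directory (assumption)
VENV_DIR="${VENV_DIR:-.venv}"                               # hypothetical virtualenv location
MODEL_ID="${MODEL_ID:-hamishivi/tess2}"                     # model ID exactly as given in the README

run_demo() {
    # Clone the custom codebase the model depends on.
    git clone "$REPO_URL" "$REPO_DIR"
    cd "$REPO_DIR"

    # Create and activate an isolated Python environment.
    python3 -m venv "$VENV_DIR"
    . "$VENV_DIR/bin/activate"

    # Install dependencies here, following the repository's own instructions.

    # Launch the interactive demo UI described in the README.
    ./shell_scripts/run_interactive_demo.sh "$MODEL_ID"
}

# Uncomment to run all steps:
# run_demo
```

Setting `MODEL_ID` in the environment before calling `run_demo` would pass a different model to the demo script; whether the script accepts other model IDs is not confirmed by the README.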