MarxistLeninist/AGILLMMark-2

MarxistLeninist committed 8 days ago

Commit 294a88d (verified) · 1 Parent(s): 1ef05e3

Create README.md

Files changed (1)

README.md +14 -0

	@@ -0,0 +1,14 @@

+Switched to Qwen 3 235B A22B Thinking 2507 (qwen3-235b-a22b-thinking-2507) for the vocab/tokenizer, as it is currently the smartest open-source AGI LLM; this makes this model incompatible with Mark1, which uses the deepseek-r1-0528 tokenizer/vocab.
+Activated conda/uv virtual environment at /venv/main
+(main) root@C.25031464:/workspace$ python3 5p10.py train --preset small --amp --x2 --fresh \
+  --block 1024 \
+  --save_dir /workspace/ckpts_qwen3_small_x2_1024 \
+  --save_every_sec 259200
+tokenizer_config.json: 10.8kB [00:00, 31.5MB/s]
+vocab.json: 2.78MB [00:00, 22.9MB/s]
+merges.txt: 1.67MB [00:00, 26.0MB/s]
+tokenizer.json: 7.03MB [00:00, 39.6MB/s]
+[auto-steps] 3,229,687 training steps (@ 1024 tokens/step)
+Resolving data files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 59166/59166 [00:22<00:00, 2673.56it/s]
+Resolving data files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 31428/31428 [00:00<00:00, 259629.53it/s]
+Resolving data files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 31411/31411 [00:00<00:00, 242792.61it/
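
The numbers in the log above can be sanity-checked with a little arithmetic. The sketch below is not part of `5p10.py` (whose source is not shown here). It only converts the `--save_every_sec` value to days and multiplies the `[auto-steps]` count by the tokens-per-step figure. It assumes that `--save_every_sec` is a wall-clock checkpoint interval (inferred from the flag name) and that every step consumes exactly 1024 tokens, as the log line states. The function names are hypothetical.

```python
# Figures copied from the log above; everything else is derived arithmetic.
SAVE_EVERY_SEC = 259_200   # value passed to --save_every_sec
TRAIN_STEPS = 3_229_687    # from the "[auto-steps]" line
TOKENS_PER_STEP = 1_024    # "@ 1024 tokens/step", same value as --block


def checkpoint_interval_days(seconds: int) -> float:
    """Convert a checkpoint interval in seconds to days (86,400 s per day)."""
    return seconds / 86_400


def total_training_tokens(steps: int, tokens_per_step: int) -> int:
    """Total tokens processed, assuming each step uses exactly tokens_per_step."""
    return steps * tokens_per_step


if __name__ == "__main__":
    days = checkpoint_interval_days(SAVE_EVERY_SEC)
    tokens = total_training_tokens(TRAIN_STEPS, TOKENS_PER_STEP)
    print(f"checkpoint interval: {days:.1f} days")
    print(f"total training tokens: {tokens:,}")
```

Under those assumptions, the run would write a time-based checkpoint every 3 days and process roughly 3.3 billion tokens in total.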