MarxistLeninist/AGILLMMark-2

Switched to Qwen 3 235B A22B Thinking 2507 (qwen3-235b-a22b-thinking-2507) for the vocab/tokenizer, as it is currently the smartest open-source AGI LLM. This makes Mark-2 incompatible with Mark1, which uses the deepseek-r1-0528 tokenizer/vocab.

Activated conda/uv virtual environment at /venv/main
(main) root@C.25031464:/workspace$ python3 5p10.py train --preset small --amp --x2 --fresh \
  --block 1024 \
  --save_dir /workspace/ckpts_qwen3_small_x2_1024 \
  --save_every_sec 259200
tokenizer_config.json: 10.8kB [00:00, 31.5MB/s]
vocab.json: 2.78MB [00:00, 22.9MB/s]
merges.txt: 1.67MB [00:00, 26.0MB/s]
tokenizer.json: 7.03MB [00:00, 39.6MB/s]
[auto-steps] 3,229,687 training steps (@ 1024 tokens/step)
Resolving data files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 59166/59166 [00:22<00:00, 2673.56it/s]
Resolving data files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 31428/31428 [00:00<00:00, 259629.53it/s]
Resolving data files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 31411/31411 [00:00<00:00, 242792.61it/
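
For a rough sense of scale, the numbers in the command and log above can be turned into a token budget and a checkpoint interval. This is only a back-of-the-envelope sketch: it assumes each training step consumes exactly the 1024 tokens reported by the `[auto-steps]` line, and that `--save_every_sec` counts wall-clock seconds. The variable names are illustrative and are not taken from 5p10.py.

```python
# Figures copied from the command and log above.
steps = 3_229_687         # "[auto-steps] 3,229,687 training steps"
tokens_per_step = 1024    # "@ 1024 tokens/step", matching --block 1024
save_every_sec = 259_200  # --save_every_sec 259200

# Assumption: every step consumes exactly tokens_per_step tokens.
total_tokens = steps * tokens_per_step

# Assumption: the save interval is measured in wall-clock seconds.
save_every_hours = save_every_sec / 3600

print(f"total tokens: {total_tokens:,}")
print(f"checkpoint interval: {save_every_hours:.0f} h ({save_every_hours / 24:.0f} days)")
```

Under those assumptions the run covers about 3.3 billion tokens and writes a time-based checkpoint every 72 hours (3 days).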