The AGI-LLM experiments use the a3.py script to run inference with the AR and NAT models. Each model was trained on 100 million tokens over 120 epochs; while that follows Kaplan's scaling laws, it is still under-trained by Chinchilla standards.
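For context, below is a minimal sketch of what an autoregressive decode like the `--mode ar` run typically does: load a checkpoint, encode the prompt, and sample one token at a time. The model class, checkpoint layout, and sampling settings here are assumptions for illustration only, not a3.py's actual internals.

```python
# Sketch of a generic autoregressive sampling loop (assumed, not a3.py's code).
import torch

@torch.no_grad()
def ar_generate(model, tokenizer, prompt, max_new=120, temperature=1.0, device="cpu"):
    # Encode the prompt into a batch of one sequence.
    ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    for _ in range(max_new):
        logits = model(ids)                 # assumed to return [batch, seq, vocab] logits
        next_logits = logits[:, -1, :] / temperature
        probs = torch.softmax(next_logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)   # sample the next token
        ids = torch.cat([ids, next_id], dim=-1)
        if tokenizer.eos_token_id is not None and next_id.item() == tokenizer.eos_token_id:
            break
    return tokenizer.decode(ids[0], skip_special_tokens=True)

# Hypothetical usage mirroring the command line below (checkpoint layout is assumed):
# from transformers import AutoTokenizer
# tok = AutoTokenizer.from_pretrained("gpt2")
# model = torch.load("ar_ep100.pt", map_location="cpu").eval()
# print(ar_generate(model, tok, "The economic basis of society determines", max_new=120))
```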
PS C:\Users\Scott> python C:\Users\Scott\Downloads\a3.py infer --mode ar --ckpt C:\Users\Scott\Downloads\ar_ep100.pt --prompt "The economic basis of society determines" --max_new 120 --preset small
tokenizer_config.json: 3.96kB [00:00, ?B/s]
tokenizer.json: 7.85MB [00:00, 20.7MB/s]
The economic basis of society determines a speed of the ships of the main Italian fleet and was primarily occupied with training exercises . At the start of the Italo @-@ Turkish War in September 1911 , she was assigned to the Red Sea Squadron in Italian Eritrea . She bombarded Ottoman positions in the Arabian Peninsula and took part in a blockade of the coast . Worn out by the end of the war in October 1912 , Aretusa was sold for scrap that December and broken up . = = 20th century = = = = = = = =
[96 new tokens in 20.00s]
Pretty cool.
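The notes also mention a NAT model but the transcript only shows the AR run. As a rough sketch of one common non-autoregressive decoding scheme (mask-predict style), the function below fills every target position in parallel and re-masks low-confidence tokens over a few refinement passes; the mask token id and model interface are assumptions, and none of this reflects a3.py's actual NAT mode.

```python
# Sketch of mask-predict style non-autoregressive decoding (assumed, not a3.py's code).
import torch

@torch.no_grad()
def nat_generate(model, tokenizer, prompt, target_len=120, iterations=4, mask_id=0, device="cpu"):
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
    # Start with the prompt followed by target_len mask tokens (mask_id is hypothetical).
    masks = torch.full((1, target_len), mask_id, dtype=torch.long, device=device)
    ids = torch.cat([prompt_ids, masks], dim=-1)
    gen_slice = slice(prompt_ids.size(1), ids.size(1))
    for step in range(iterations):
        logits = model(ids)                                   # assumed [batch, seq, vocab]
        probs = torch.softmax(logits[:, gen_slice, :], dim=-1)
        conf, pred = probs.max(dim=-1)                        # predict all positions in parallel
        ids[:, gen_slice] = pred
        # Re-mask the least confident positions for the next refinement pass.
        n_remask = int(target_len * (1 - (step + 1) / iterations))
        if n_remask > 0:
            worst = conf[0].argsort()[:n_remask]
            gen = ids[0, gen_slice]                           # view into the generated region
            gen[worst] = mask_id
    return tokenizer.decode(ids[0, gen_slice], skip_special_tokens=True)
```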