🦙📚 LlamaTales
From the paper 'Readability ≠ Learnability: Rethinking the Role of Simplicity in Training Small Language Models' (COLM 2025)
2M rows
Note: Short stories generated by `nvidia/Llama-3.1-Nemotron-70B-Instruct`.

ivnle/llamatales-jr-70b
3.56M rows
Note: Children's stories generated by `nvidia/Llama-3.1-Nemotron-70B-Instruct`.

ivnle/llamatales-gre
2.02M rows
Note: Short stories generated by `meta-llama/Llama-3.1-8B-Instruct`.

ivnle/llamatales-jr
3.59M rows
Note: Children's stories generated by `meta-llama/Llama-3.1-8B-Instruct`.

ivnle/tinystories
4.97M rows
Note: Source: https://huggingface.co/datasets/roneneldan/TinyStories/blob/main/TinyStories_all_data.tar.gz

ivnle/fineweb
2.03M rows
Note: 1B-token sample of FineWeb-Edu: https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu

Model checkpoints below. The naming format is [training data]-[layers]-[hidden size]-[heads]-[non-embedding parameter count], where the prefixes `lay`, `hs`, and `hd` mark the layer count, hidden size, and head count.
ivnle/llamatales_gre_8b-lay8-hs512-hd8-33M • Text Generation • 0.2B
ivnle/llamatales_gre_8b-lay8-hs384-hd6-18M • Text Generation • 0.1B
ivnle/llamatales_gre_8b-lay4-hs384-hd6-9M • Text Generation • 0.1B
ivnle/llamatales_gre_8b-lay4-hs128-hd2-1M • Text Generation • 0.0B
ivnle/llamatales_gre_8b-lay2-hs128-hd2-524K • Text Generation • 0.0B
ivnle/llamatales_gre_8b-lay1-hs128-hd2-262K • Text Generation • 0.0B

ivnle/fineweb-lay8-hs512-hd8-33M • Text Generation • 0.2B
ivnle/fineweb-lay8-hs384-hd6-18M • Text Generation • 0.1B
ivnle/fineweb-lay4-hs384-hd6-9M • Text Generation • 0.1B
ivnle/fineweb-lay4-hs128-hd2-1M • Text Generation • 0.0B
ivnle/fineweb-lay2-hs128-hd2-524K • Text Generation • 0.0B
ivnle/fineweb-lay1-hs128-hd2-262K • Text Generation • 0.0B

ivnle/llamatales_jr_8b-lay8-hs512-hd8-33M • Text Generation • 0.2B
ivnle/llamatales_jr_8b-lay8-hs384-hd6-18M • Text Generation • 0.1B
ivnle/llamatales_jr_8b-lay4-hs384-hd6-9M • Text Generation • 0.1B
ivnle/llamatales_jr_8b-lay4-hs128-hd2-1M • Text Generation • 0.0B
ivnle/llamatales_jr_8b-lay2-hs128-hd2-524K • Text Generation • 0.0B
ivnle/llamatales_jr_8b-lay1-hs128-hd2-262K • Text Generation • 0.0B

ivnle/tinystories-lay8-hs512-hd8-33M • Text Generation • 0.2B
ivnle/tinystories-lay8-hs384-hd6-18M • Text Generation • 0.1B
ivnle/tinystories-lay4-hs384-hd6-9M • Text Generation • 0.1B
ivnle/tinystories-lay4-hs128-hd2-1M • Text Generation • 0.0B
ivnle/tinystories-lay2-hs128-hd2-524K • Text Generation • 0.0B
ivnle/tinystories-lay1-hs128-hd2-262K • Text Generation • 0.0B
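
The checkpoint naming format can be unpacked mechanically, which helps when sweeping over the model sizes listed above. The following is a minimal sketch, not part of the collection itself: `CheckpointSpec` and `parse_checkpoint_name` are hypothetical helper names, and the parser assumes the `lay`/`hs`/`hd` prefixes and `K`/`M` suffixes seen in the listed names. The parameter count in a name is a rounded label, so the integer it yields is approximate.

```python
import re
from typing import NamedTuple

# Hypothetical helper type: one field per component of the naming format.
class CheckpointSpec(NamedTuple):
    training_data: str
    layers: int
    hidden_size: int
    heads: int
    non_embedding_params: int  # approximate: expanded from a rounded label such as "33M"

    @property
    def head_dim(self) -> int:
        # Per-head width implied by hidden size and head count.
        return self.hidden_size // self.heads

# Pattern assumed from the listed names: [owner/]<data>-lay<L>-hs<H>-hd<A>-<count><K|M|B>
_NAME_RE = re.compile(
    r"^(?:[^/]+/)?(?P<data>.+)-lay(?P<layers>\d+)-hs(?P<hidden>\d+)"
    r"-hd(?P<heads>\d+)-(?P<count>\d+)(?P<unit>[KMB])$"
)
_UNITS = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

def parse_checkpoint_name(name: str) -> CheckpointSpec:
    """Split a checkpoint name into its architecture fields."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {name!r}")
    return CheckpointSpec(
        training_data=match["data"],
        layers=int(match["layers"]),
        hidden_size=int(match["hidden"]),
        heads=int(match["heads"]),
        non_embedding_params=int(match["count"]) * _UNITS[match["unit"]],
    )

spec = parse_checkpoint_name("ivnle/llamatales_gre_8b-lay8-hs512-hd8-33M")
print(spec.training_data, spec.layers, spec.hidden_size, spec.heads, spec.head_dim)
```

Under this reading, every listed configuration (512/8, 384/6, 128/2) implies a per-head width of 64.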