BabyLM Challenge

community

https://babylm.github.io/

babyLMchallenge

AI & ML interests

Pretraining data-constrained, cognitively relevant baby LLMs

Recent Activity

momergul

updated a model about 3 hours ago

BabyLM-community/babylm-multimodal-baseline-git

Image-to-Text • 0.2B • Updated about 3 hours ago • 66

momergul

updated a model 3 days ago

BabyLM-community/babylm-multimodal-baseline-flamingo

Text Generation • 0.3B • Updated 3 days ago • 107

juletxara

authored a paper 6 days ago

Instructing Large Language Models for Low-Resource Languages: A Systematic Study for Basque

Paper • 2506.07597 • Published Jun 9

negar-foroutan

authored a paper about 1 month ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26 • 64

suchirsalhan

authored a paper about 1 month ago

ByteSpan: Information-Driven Subword Tokenisation

Paper • 2506.18639 • Published Jun 23 • 3

juletxara

authored 2 papers 2 months ago

Lessons from the Trenches on Reproducible Evaluation of Language Models

Paper • 2405.14782 • Published May 23, 2024

Truth Knows No Language: Evaluating Truthfulness Beyond English

Paper • 2502.09387 • Published Feb 13 • 1

seyoungsong

authored 4 papers 2 months ago

MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any Language

Paper • 2505.14395 • Published May 20 • 6

When Does Classical Chinese Help? Quantifying Cross-Lingual Transfer in Hanja and Kanbun

Paper • 2411.04822 • Published Nov 7, 2024

LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation

Paper • 2503.07237 • Published Mar 10

HERITAGE: An End-to-End Web Platform for Processing Korean Historical Documents in Hanja

Paper • 2501.11951 • Published Jan 21

borgr

authored a paper 4 months ago

Pretraining Language Models for Diachronic Linguistic Change Discovery

Paper • 2504.05523 • Published Apr 7 • 6

suchirsalhan

authored a paper 4 months ago

Less is More: Pre-Training Cross-Lingual Small-Scale Language Models with Cognitively-Plausible Curriculum Learning Strategies

Paper • 2410.22886 • Published Oct 30, 2024 • 2

bbunzeck

authored 2 papers 5 months ago

Do Construction Distributions Shape Formal Language Learning In German BabyLMs?

Paper • 2503.11593 • Published Mar 14 • 1

Subword models struggle with word learning, but surprisal hides it

Paper • 2502.12835 • Published Feb 18

bbunzeck

authored a paper 7 months ago

Small Language Models Also Work With Small Vocabularies: Probing the Linguistic Abilities of Grapheme- and Phoneme-Based Baby Llamas

Paper • 2410.01487 • Published Oct 2, 2024

borgr

authored 4 papers 8 months ago

SemEval 2019 Shared Task: Cross-lingual Semantic Parsing with UCCA - Call for Participation

Paper • 1805.12386 • Published May 31, 2018

tinyBenchmarks: evaluating LLMs with fewer examples

Paper • 2402.14992 • Published Feb 22, 2024 • 15

Asymmetry in Low-Rank Adapters of Foundation Models

Paper • 2402.16842 • Published Feb 26, 2024 • 2

Efficient multi-prompt evaluation of LLMs

Paper • 2405.17202 • Published May 27, 2024 • 3