Omkar Pangarkar

omkarenator

Recent Activity

upvoted a paper 16 days ago

Essential-Web v1.0: 24T tokens of organized web data

upvoted an article about 1 month ago

nanoJAXGPT: A pedagogical introduction to JAX/Equinox

liked a Space 3 months ago

nanotron/predict_memory

liked a Space 3 months ago

Predict Memory

🧮

Analyze and visualize memory usage from model configurations

liked a dataset 4 months ago

WebOrganizer/Corpus-200B

Preview • Updated Feb 19 • 36k • 9

liked a Space 4 months ago

118

TxT360: Trillion Extracted Text

📖

Create a large-scale deduplicated text dataset for LLM training

liked a model 6 months ago

mlfoundations/fasttext-oh-eli5

Updated Aug 1, 2024 • 25

liked a Space 6 months ago

3.1k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

liked a Space 7 months ago

Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

📝

Evaluate multilingual models using FineTasks

liked a dataset 10 months ago

LLM360/TxT360

Updated May 26 • 8.75k • 238

liked a Space 12 months ago

1.04k

FineWeb: decanting the web for the finest text data at scale

🍷

Generate high-quality web text data for LLM training

liked a dataset 12 months ago

Trelis/touch-rugby-rules-memorisation

Viewer • Updated Feb 28, 2024 • 363 • 24 • 2

liked a dataset about 1 year ago

commoncrawl/statistics

Viewer • Updated 4 days ago • 584k • 144 • 22

liked 2 models over 1 year ago

bigcode/starencoder

Updated May 10, 2023 • 285 • 53

microsoft/phi-2

Text Generation • 3B • Updated Apr 29, 2024 • 759k • 3.39k

liked 4 models almost 2 years ago

liked 2 models about 2 years ago

mosaicml/mpt-7b-chat

Text Generation • Updated Mar 5, 2024 • 82.4k • 517

stanfordnlp/backpack-gpt2

Text Generation • Updated Aug 14, 2023 • 12 • 16