Lize Pirenne

Inversta

Pangasius

AI & ML interests

LLMs, RL

Recent Activity

upvoted a paper 4 days ago

BitNet b1.58 2B4T Technical Report

upvoted a paper 4 days ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

upvoted a paper 7 days ago

GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation

View all activity

Organizations

None yet

Inversta's activity

upvoted 2 papers 4 days ago

BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published 6 days ago • 62

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 8 days ago • 239

upvoted 2 papers 7 days ago

GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation

Paper • 2504.08736 • Published 11 days ago • 47

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Paper • 2504.08685 • Published 11 days ago • 120

upvoted a paper 8 days ago

Hogwild! Inference: Parallel LLM Generation via Concurrent Attention

Paper • 2504.06261 • Published 14 days ago • 103

updated a collection 8 days ago

closed_qa

Collection

11 items • Updated 8 days ago

upvoted a paper 14 days ago

Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?

Paper • 2504.00509 • Published 21 days ago • 21

upvoted a paper 27 days ago

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 120

upvoted 6 papers about 1 month ago

RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published Mar 18 • 141

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 158

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12 • 69

EuroBERT: Scaling Multilingual Encoders for European Languages

Paper • 2503.05500 • Published Mar 7 • 77

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6 • 94

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Paper • 2503.01743 • Published Mar 3 • 84

upvoted 2 papers about 2 months ago

Chain of Draft: Thinking Faster by Writing Less

Paper • 2502.18600 • Published Feb 25 • 48

Predictive Data Selection: The Data That Predicts Is the Data That Teaches

Paper • 2503.00808 • Published Mar 2 • 57

liked a Space about 2 months ago

2.5k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters

upvoted 2 papers about 2 months ago

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published Feb 20 • 191

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Paper • 2502.09604 • Published Feb 13 • 36