Baifeng Shi

bfshi

https://bfshi.github.io

AI & ML interests

computer vision

bfshi's activity

upvoted a paper about 2 hours ago

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

Paper • 2504.13169 • Published 1 day ago • 26

upvoted a paper about 18 hours ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published 1 day ago • 67

upvoted 2 papers 10 days ago

One-Minute Video Generation with Test-Time Training

Paper • 2504.05298 • Published 11 days ago • 94

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 11 days ago • 161

authored a paper 23 days ago

Scaling Vision Pre-Training to 4K Resolution

Paper • 2503.19903 • Published 24 days ago • 39

upvoted a paper 23 days ago

Scaling Vision Pre-Training to 4K Resolution

Paper • 2503.19903 • Published 24 days ago • 39

commented on a paper 24 days ago

Scaling Vision Pre-Training to 4K Resolution

Paper • 2503.19903 • Published 24 days ago • 39

published a dataset about 2 months ago

bfshi/vstar_bench_lmms_eval

Viewer • Updated Nov 1, 2024 • 191 • 66

New activity in Efficient-Large-Model/NVILA-8B-Video 2 months ago

What is the difference between the nvila 8b base model and video model?

#1 opened 2 months ago by YoungjaeDev

upvoted 2 papers 3 months ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 120

An Empirical Study of Autoregressive Pre-training from Videos

Paper • 2501.05453 • Published Jan 9 • 42

New activity in Efficient-Large-Model/NVILA-15B 3 months ago

Ask about demo

#1 opened 4 months ago by Lanbai44

liked 2 models 4 months ago

Efficient-Large-Model/NVILA-15B

Text Generation • Updated Jan 6 • 25.3k • 15

Efficient-Large-Model/NVILA-8B

Text Generation • Updated Jan 6 • 20.3k • 4

upvoted a collection 4 months ago

NVILA

Collection • 9 items • Updated 2 days ago • 9

liked a Space 4 months ago

VILA

🏆

VILA Playground.

authored a paper 4 months ago

NVILA: Efficient Frontier Visual Language Models

Paper • 2412.04468 • Published Dec 5, 2024 • 60

upvoted a paper 4 months ago

NVILA: Efficient Frontier Visual Language Models

Paper • 2412.04468 • Published Dec 5, 2024 • 60

updated a dataset 6 months ago

bfshi/vstar_bench_lmms_eval

Viewer • Updated Nov 1, 2024 • 191 • 66

upvoted a paper 6 months ago

Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Dataset

Paper • 2410.22325 • Published Oct 29, 2024 • 10