zwgao

AI & ML interests

None yet

zwgao's activity

upvoted a paper 6 days ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 7 days ago • 232

upvoted a paper 26 days ago

Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy

Paper • 2503.19757 • Published 27 days ago • 50

updated 3 models 26 days ago

OpenGVLab/Mini-InternVL2-2B-DA-BDD

Image-Text-to-Text • Updated 26 days ago • 13 • 1

OpenGVLab/Mini-InternVL2-2B-DA-DriveLM

Image-Text-to-Text • Updated 26 days ago • 29

OpenGVLab/Mini-InternVL2-2B-DA-Medical

Image-Text-to-Text • Updated 26 days ago • 113

upvoted a paper about 1 month ago

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

Paper • 2503.10291 • Published Mar 13 • 34

liked a model 4 months ago

opensourcerelease/DeepSeek-V3-Base-bf16

Updated Dec 30, 2024 • 72 • 4

upvoted a paper 4 months ago

SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding

Paper • 2412.09604 • Published Dec 12, 2024 • 38

updated 6 models 4 months ago

OpenGVLab/Mini-InternVL2-1B-DA-Medical

Image-Text-to-Text • Updated Dec 9, 2024 • 32

OpenGVLab/Mini-InternVL2-4B-DA-Medical

Image-Text-to-Text • Updated Dec 9, 2024 • 150 • 4

OpenGVLab/Mini-InternVL2-1B-DA-DriveLM

Image-Text-to-Text • Updated Dec 9, 2024 • 59 • 1

OpenGVLab/Mini-InternVL2-4B-DA-DriveLM

Image-Text-to-Text • Updated Dec 9, 2024 • 69 • 3

OpenGVLab/Mini-InternVL2-4B-DA-BDD

Image-Text-to-Text • Updated Dec 9, 2024 • 38

OpenGVLab/Mini-InternVL2-1B-DA-BDD

Image-Text-to-Text • Updated Dec 9, 2024 • 2

liked a model 4 months ago

OpenGVLab/InternVL2_5-78B

Image-Text-to-Text • Updated 27 days ago • 9.58k • 191

upvoted a paper 4 months ago

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Paper • 2412.05271 • Published Dec 6, 2024 • 155

updated 4 models 4 months ago

OpenGVLab/InternVL2_5-1B

Image-Text-to-Text • Updated 27 days ago • 27.6k • 57

OpenGVLab/InternVL2_5-2B

Image-Text-to-Text • Updated 27 days ago • 10.1k • 29

OpenGVLab/InternVL2_5-4B

Image-Text-to-Text • Updated 27 days ago • 29.2k • 49

OpenGVLab/InternVL2_5-8B

Image-Text-to-Text • Updated 27 days ago • 34.4k • 88