
Jiapeng Luo

woolpeeker

AI & ML interests

None yet

Organizations

None yet

woolpeeker's activity

authored a paper 7 days ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 8 days ago • 237

upvoted an article 10 days ago

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Article • Feb 11 • 21

commented on Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment 10 days ago

Great tutorial!

About this equation: should the step-wise reward be inside the sum over the gradients, so that the gradient at each step is multiplied by its own reward?

upvoted a paper about 2 months ago

MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction

Paper • 2502.11663 • Published Feb 17 • 39

liked a model 4 months ago

OpenGVLab/InternVL2_5-78B

Image-Text-to-Text • Updated 28 days ago • 9.66k • 191

authored a paper 4 months ago

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Paper • 2412.05271 • Published Dec 6, 2024 • 155

upvoted a paper 12 months ago

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

Paper • 2404.16821 • Published Apr 25, 2024 • 60

authored a paper 12 months ago

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

Paper • 2404.16821 • Published Apr 25, 2024 • 60

updated a Space almost 2 years ago

Test Space

updated a model almost 2 years ago

woolpeeker/test_repo

Updated Jun 6, 2023