Yangyi Chen

YangyiYY

https://yangyi-chen.github.io/

AI & ML interests

Multimodal, Large Language Models

Recent Activity

upvoted a paper about 1 month ago

Perception-Aware Policy Optimization for Multimodal Reasoning

liked a Space 2 months ago

nanotron/ultrascale-playbook

upvoted a paper 3 months ago

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models


Organizations

None yet

upvoted a paper about 1 month ago

Perception-Aware Policy Optimization for Multimodal Reasoning

Paper • 2507.06448 • Published Jul 8 • 45

liked a Space 2 months ago

The Ultra-Scale Playbook 🌌

Space • The ultimate guide to training LLM on large GPU Clusters • 3.1k

upvoted a paper 3 months ago

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 135

upvoted 2 papers 4 months ago

RM-R1: Reward Modeling as Reasoning

Paper • 2505.02387 • Published May 5 • 78

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17 • 93

liked a dataset 5 months ago

xuehang/SyncBench

Viewer • Updated May 24 • 42.5k • 17 • 1

updated a model 5 months ago

YangyiYY/Qwen2.5_sft_tabmwp_textreason_RL

3B • Updated Mar 13 • 4

published 2 models 5 months ago

YangyiYY/Qwen2.5_sft_tabmwp_textreason_RL

3B • Updated Mar 13 • 4

YangyiYY/Qwen2.5-VL_sft_tabmwp_textreason_RL

Updated Mar 11

published 3 models 6 months ago

upvoted an article 6 months ago

Putting RL back in RLHF

Article • Jun 12, 2024 • 100

upvoted a paper 7 months ago

Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks

Paper • 2501.11733 • Published Jan 20 • 29

upvoted a paper 8 months ago

OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis

Paper • 2501.04561 • Published Jan 8 • 16

updated a dataset 9 months ago

YangyiYY/VLM-SFT

Viewer • Updated Dec 3, 2024 • 1.13M • 43 • 1

updated 4 models about 1 year ago

YangyiYY/model_output_sft_llama_preferred_mixed

Text Generation • 8B • Updated Aug 12, 2024 • 3

YangyiYY/model_output_sft_llama_rejected

Text Generation • 8B • Updated Aug 12, 2024 • 3

YangyiYY/model_output_sft_llama_preferred

Text Generation • 8B • Updated Aug 10, 2024 • 4

YangyiYY/model_output_dpo_llama_data

Text Generation • 8B • Updated Aug 10, 2024 • 3