Yifan Zeng

yokey

https://xhmy.github.io/

AI & ML interests

Large Language Model, Agentic AI, Deep Learning

Organizations

None yet

yokey's activity

upvoted a paper 11 days ago

VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks

Paper • 2504.05118 • Published 14 days ago • 24

updated a collection about 1 month ago

LLM

Collection

20 items • Updated Mar 16

upvoted 2 papers about 1 month ago

R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model

Paper • 2503.05132 • Published Mar 7 • 55

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12 • 68

upvoted a paper about 2 months ago

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25 • 74

upvoted a paper 3 months ago

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published Jan 10 • 66

liked a model 4 months ago

sfairXC/FsfairX-LLaMA3-RM-v0.1

Text Classification • Updated Oct 14, 2024 • 4.35k • 56

upvoted a paper 4 months ago

Token-Budget-Aware LLM Reasoning

Paper • 2412.18547 • Published Dec 24, 2024 • 47

upvoted a paper 5 months ago

Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Paper • 2411.14405 • Published Nov 21, 2024 • 62

New activity in google/gemma-2-9b 5 months ago

RuntimeError: Index put requires the source and destination dtypes match, got BFloat16 for the destination and Float for the source.

#24 opened 9 months ago by saireddy

upvoted 2 papers 6 months ago

Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders

Paper • 2410.22366 • Published Oct 28, 2024 • 82

OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization

Paper • 2410.19609 • Published Oct 25, 2024 • 17

authored a paper 6 months ago

TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling

Paper • 2410.16033 • Published Oct 18, 2024

liked a model 6 months ago

nvidia/Llama-3.1-Nemotron-70B-Instruct-HF

Text Generation • Updated 8 days ago • 32.1k • 2.03k

commented a paper 6 months ago

A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement

Paper • 2410.13828 • Published Oct 17, 2024 • 4

authored 2 papers 6 months ago

A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement

Paper • 2410.13828 • Published Oct 17, 2024 • 4

LLM-RankFusion: Mitigating Intrinsic Inconsistency in LLM-based Ranking

Paper • 2406.00231 • Published May 31, 2024

upvoted a paper 6 months ago

A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement

Paper • 2410.13828 • Published Oct 17, 2024 • 4

updated a collection 6 months ago

LLM

Collection

20 items • Updated Mar 16