hcwei

AI & ML interests

Diffusion Model, Image Generation, ML, DL, CV

Organizations

None yet

hcwei's activity

upvoted 4 papers 5 days ago

VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

Paper • 2504.13122 • Published 5 days ago • 21

Antidistillation Sampling

Paper • 2504.13146 • Published 5 days ago • 59

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

Paper • 2504.13169 • Published 5 days ago • 38

Packing Input Frame Context in Next-Frame Prediction Models for Video Generation

Paper • 2504.12626 • Published 6 days ago • 45

liked a model 11 days ago

OpenGVLab/InternVL3-14B

Image-Text-to-Text • Updated 6 days ago • 31.7k • 40

upvoted a paper 12 days ago

Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?

Paper • 2504.06514 • Published 14 days ago • 39

updated a model 28 days ago

hcwei/FRANK-ZERO-38B

Updated 28 days ago • 10 • 2

liked a model about 1 month ago

hcwei/FRANK-ZERO-38B

Updated 28 days ago • 10 • 2

published a model about 1 month ago

hcwei/FRANK-ZERO-38B

Updated 28 days ago • 10 • 2

liked a model about 1 month ago

hy1111/CLIP-RS

Updated Feb 23 • 3

upvoted 2 papers 3 months ago

Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding

Paper • 2501.07888 • Published Jan 14 • 15

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 286

upvoted 2 papers 4 months ago

Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization

Paper • 2412.18525 • Published Dec 24, 2024 • 76

No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published Dec 16, 2024 • 44

upvoted 2 papers 7 months ago

Visual Context Window Extension: A New Perspective for Long Video Understanding

Paper • 2409.20018 • Published Sep 30, 2024 • 11

E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding

Paper • 2409.18111 • Published Sep 26, 2024 • 7

commented on a paper 7 months ago

Visual Context Window Extension: A New Perspective for Long Video Understanding

Paper • 2409.20018 • Published Sep 30, 2024 • 11

authored 2 papers 7 months ago

Improving Generalization of Image Captioning with Unsupervised Prompt Learning

Paper • 2308.02862 • Published Aug 5, 2023

Visual Context Window Extension: A New Perspective for Long Video Understanding

Paper • 2409.20018 • Published Sep 30, 2024 • 11

updated a Space 7 months ago

LongLLaVA

🌖

Long Video Understanding