Haoning Wu, Teo

teowu

https://teowu.github.io

AI & ML interests

Lead of Q-Future (https://github.com/Q-Future). I love MLLMs/LMMs/LVLMs (whatever you call them). Core contributor to two great MoE VLMs: Kimi-VL and Aria.

Recent Activity

new activity 1 day ago

mlx-community/Kimi-VL-A3B-Thinking-4bit:Update README.md

new activity 1 day ago

moonshotai/MoonViT-SO-400M:Add pipeline tag and library name

teowu's activity

upvoted a collection 1 day ago

Kimi-VL Thinking

Collection

3 items • Updated 1 day ago • 1

upvoted 2 papers 3 days ago

VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

Paper • 2504.08837 • Published 8 days ago • 39

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 4 days ago • 222

upvoted a paper 8 days ago

Kimi-VL Technical Report

Paper • 2504.07491 • Published 9 days ago • 113

upvoted a collection 9 days ago

Kimi-VL-A3B

Collection

Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 6 items • Updated 6 days ago • 61

upvoted 2 papers about 2 months ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20 • 142

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 181

upvoted 2 papers 2 months ago

VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation

Paper • 2411.13281 • Published Nov 20, 2024 • 22

Redundancy Principles for MLLMs Benchmarks

Paper • 2501.13953 • Published Jan 20 • 29

upvoted 2 papers 3 months ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 382

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22 • 113

upvoted a paper 4 months ago

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Paper • 2412.05237 • Published Dec 6, 2024 • 48

upvoted 3 papers 5 months ago

VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation

Paper • 2412.00927 • Published Dec 1, 2024 • 28

Data Engineering for Scaling Language Models to 128K Context

Paper • 2402.10171 • Published Feb 15, 2024 • 26

AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark

Paper • 2410.03051 • Published Oct 4, 2024 • 6

upvoted a paper 6 months ago

Aria: An Open Multimodal Native Mixture-of-Experts Model

Paper • 2410.05993 • Published Oct 8, 2024 • 112

upvoted a paper 9 months ago

LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding

Paper • 2407.15754 • Published Jul 22, 2024 • 20

upvoted a paper 10 months ago

CMC-Bench: Towards a New Paradigm of Visual Signal Compression

Paper • 2406.09356 • Published Jun 13, 2024 • 5

upvoted a collection 10 months ago

Visual Evaluation Benchmarks!

Collection

Q-Bench (ICLR'24 Spotlight), Q-Bench-Pair (TPAMI), and A-Bench in Hugging Face format. Supports auto-loading via `dataset = load_dataset("q-future/**-HF")` • 3 items • Updated Aug 27, 2024 • 1

upvoted a paper 10 months ago

A-Bench: Are LMMs Masters at Evaluating AI-generated Images?

Paper • 2406.03070 • Published Jun 5, 2024 • 2