Reza Sayar PRO

Reza2kn

AI & ML interests

None yet

Recent Activity

liked a model 2 days ago

facebook/PE-Spatial-G14-448

liked a model 2 days ago

facebook/PE-Lang-G14-448

liked a model 2 days ago

facebook/PE-Core-G14-448

Reza2kn's activity

upvoted a collection 2 days ago

Perception Encoder

9 items • Updated 3 days ago • 17

upvoted an article 7 days ago

Article

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

Feb 4 • 143

upvoted a collection 9 days ago

InternVL3

34 items • Updated about 6 hours ago • 50

upvoted a paper 9 days ago

OmniCaptioner: One Captioner to Rule Them All

Paper • 2504.07089 • Published 11 days ago • 19

upvoted 2 papers 11 days ago

An Empirical Study of GPT-4o Image Generation Capabilities

Paper • 2504.05979 • Published 12 days ago • 59

OmniSVG: A Unified Scalable Vector Graphics Generation Model

Paper • 2504.06263 • Published 12 days ago • 144

upvoted a paper 12 days ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 13 days ago • 162

upvoted a collection 12 days ago

Black Swan (Abductive and Defeasible Reasoning)

Data for CVPR 2025 paper, "Black Swan: Abductive and Defeasible Video Reasoning in Unpredictable Events" • 3 items • Updated 29 days ago • 2

upvoted a paper 13 days ago

MedSAM2: Segment Anything in 3D Medical Images and Videos

Paper • 2504.03600 • Published 16 days ago • 8

upvoted a collection 13 days ago

MedSAM2

MedSAM2: Segment Anything in 3D Medical Images and Videos • 4 items • Updated 8 days ago • 3

upvoted an article 13 days ago

Article

The NLP Course is becoming the LLM Course!

18 days ago • 79

upvoted a collection 14 days ago

Orpheus TTS

TTS Towards Human-Sounding Speech • 2 items • Updated Mar 18 • 58

upvoted 4 papers 14 days ago

Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages

Paper • 2503.23542 • Published 21 days ago • 10

Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme

Paper • 2504.02587 • Published 17 days ago • 30

GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation

Paper • 2504.02782 • Published 17 days ago • 55

MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis

Paper • 2502.18924 • Published Feb 26 • 12

upvoted a paper 26 days ago

TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting

Paper • 2503.17032 • Published about 1 month ago • 25