86

Mahdi Pourmirzaei

Mahdip72

AI & ML interests

None yet

Recent Activity

updated a model about 1 month ago

Mahdip72/prot2token

Organizations

None yet

Mahdip72's activity

upvoted 2 papers about 1 month ago

Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

Paper • 2501.13926 • Published Jan 23 • 42

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 157

upvoted a paper about 2 months ago

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published Feb 20 • 103

upvoted a paper 2 months ago

Analyze Feature Flow to Enhance Interpretation and Steering in Language Models

Paper • 2502.03032 • Published Feb 5 • 60

upvoted 2 papers 3 months ago

Video Depth Anything: Consistent Depth Estimation for Super-Long Videos

Paper • 2501.12375 • Published Jan 21 • 22

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 384

upvoted 4 papers 4 months ago

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10, 2024 • 71

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 101

Multimodal Latent Language Modeling with Next-Token Diffusion

Paper • 2412.08635 • Published Dec 11, 2024 • 45

SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration

Paper • 2411.10958 • Published Nov 17, 2024 • 56

upvoted 2 papers 5 months ago

The Surprising Effectiveness of Test-Time Training for Abstract Reasoning

Paper • 2411.07279 • Published Nov 11, 2024 • 3

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 49

upvoted 3 papers 6 months ago

Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation

Paper • 2410.13848 • Published Oct 17, 2024 • 35

Quantifying Generalization Complexity for Large Language Models

Paper • 2410.01769 • Published Oct 2, 2024 • 14

Loong: Generating Minute-level Long Videos with Autoregressive Language Models

Paper • 2410.02757 • Published Oct 3, 2024 • 37

upvoted 5 papers 7 months ago

Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1, 2024 • 150

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

Paper • 2410.02073 • Published Oct 2, 2024 • 42

MIO: A Foundation Model on Multimodal Tokens

Paper • 2409.17692 • Published Sep 26, 2024 • 54

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27, 2024 • 95

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 61