
hanzlajavaid

hanzla

AI & ML interests

Direct Preference Optimization, Supervised Finetuning, Stable Diffusion

Recent Activity

updated a model about 12 hours ago
hanzla/Qwen2-0.5B-GRPO-summary_test_v4
updated a model about 14 hours ago
hanzla/Qwen2-0.5B-GRPO-summary_test_v3
published a model about 16 hours ago
hanzla/Qwen2-0.5B-GRPO-summary_test_v4

Organizations

ZeroGPU Explorers, Journalists on Hugging Face, MLX Community, ModularityAI, Social Post Explorers

hanzla's activity

New activity in hanzla/Falcon3-Mamba-R1-v0 22 days ago

Ollama support
#1 opened 23 days ago by ayan4m1

posted an update 27 days ago
Hi community,

A few days back, I posted about my ongoing research on building reasoning Mamba models, and the community shared great insights.

Today, I am announcing an update to the model weights. With the newer checkpoints, the Falcon3 Mamba R1 model now outperforms much larger transformer-based LLMs (including Gemini) on the Formal Logic questions of MMLU. It scores 60% on Formal Logic, which is considered one of the tougher subsets of MMLU.

I would greatly appreciate your insights and suggestions on this new checkpoint.

Model Repo: hanzla/Falcon3-Mamba-R1-v0

Chat space: hanzla/Falcon3MambaReasoner