We’ve reached a point where on-device AI coding that is free, offline, and capable isn’t just a theoretical possibility; it’s sitting on my lap, barely warming my thighs. My local MacBook Air setup: Qwen3 Coder Flash with a 1M-token context window, driven by Cline inside VS Code. No internet, no cloud, no ID verification. This is the forbidden tech.

Current stats:

- All agentic tools work great locally; sandboxing and MCP are OK.
- Model output speed: 17 tokens/sec. Not great, not terrible.
- 65K tokens of context. The model can do 1M, but let’s be real: my MacBook Air would probably achieve fusion before hitting that smoothly.
- Standard backend, cache off for the test.

All inference and function calling happen locally, offline, untethered. The cloud didn’t even get a memo.
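To see why a full 1M-token window is optimistic on laptop hardware, here's a back-of-envelope KV-cache estimate. The layer count, KV-head count, and head dimension below are placeholder values, not Qwen3 Coder Flash's actual architecture (real models also use tricks like grouped-query attention and quantized caches that shift the numbers); the point is that the cache grows linearly with context length.

```python
# Back-of-envelope KV-cache sizing: 2 tensors (K and V) per layer,
# each holding kv_heads * head_dim values per token.
# All architecture numbers here are illustrative placeholders.
def kv_cache_gib(tokens: int,
                 layers: int = 48,
                 kv_heads: int = 4,
                 head_dim: int = 128,
                 bytes_per_value: int = 2) -> float:
    """Estimated KV-cache size in GiB for a dense-attention transformer."""
    bytes_total = 2 * layers * kv_heads * head_dim * bytes_per_value * tokens
    return bytes_total / 2**30

print(f"{kv_cache_gib(65_536):.1f} GiB at 65K tokens")      # 6.0 GiB
print(f"{kv_cache_gib(1_048_576):.1f} GiB at 1M tokens")    # 96.0 GiB
```

With these placeholder numbers, a 65K context costs about 6 GiB of cache, comfortably inside a MacBook Air's unified memory, while 1M balloons to roughly 96 GiB before the weights are even loaded.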