Fidite Nemini PRO

FiditeNemini

FiditeNemini2023

AI & ML interests

Prompt engineering, unalignment, MLX, model merging, diffusion models

Recent Activity

reacted to bartowski's post with 👍 7 days ago

Access requests enabled for latest GLM models

While a fix is being implemented (https://github.com/ggml-org/llama.cpp/pull/12957), I want to leave the models up for visibility and continued discussion, but I also want to prevent accidental downloads of known broken models (even though there are settings that could fix it at runtime for now).

With this goal, I've enabled access requests. I don't really want your data, so I'm sorry, but I don't think there's a way around that. That's what I'm gonna do for now, and I'll remove the gate when a fix is up and verified and I have a chance to re-convert and quantize! Hope you don't mind in the meantime :D

liked a model 15 days ago

HiDream-ai/HiDream-I1-Full

reacted to merterbak's post with 🔥 18 days ago

Meta has unveiled its Llama 4 🦙 family of models, featuring native multimodality and mixture-of-experts architecture. Two model families are available now:

Models🤗: https://huggingface.co/collections/meta-llama/llama-4-67f0c30d9fe03840bc9d0164
Blog Post: https://ai.meta.com/blog/llama-4-multimodal-intelligence/
HF's Blog Post: https://huggingface.co/blog/llama4-release

- 🧠 Native Multimodality - Process text and images in a unified architecture
- 🔍 Mixture-of-Experts - First Llama models using MoE for incredible efficiency
- 📏 Super Long Context - Up to 10M tokens
- 🌐 Multilingual Power - Trained on 200 languages with 10x more multilingual tokens than Llama 3 (including over 100 languages with over 1 billion tokens each)

🔹 Llama 4 Scout
- 17B active parameters (109B total)
- 16 experts architecture
- 10M context window
- Fits on a single H100 GPU
- Beats Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1

🔹 Llama 4 Maverick
- 17B active parameters (400B total)
- 128 experts architecture
- It can fit perfectly on DGX H100(8x H100)
- 1M context window
- Outperforms GPT-4o and Gemini 2.0 Flash
- ELO score of 1417 on LMArena currently second best model on arena

🔹 Llama 4 Behemoth (Coming Soon)
- 288B active parameters (2T total)
- 16 experts architecture
- Teacher model for Scout and Maverick
- Outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM benchmarks

FiditeNemini's activity

New activity in TheDrummer/Fallen-Llama-3.3-R1-70B-v1-GGUF about 2 months ago

Wrong gguf's in repo?

#1 opened about 2 months ago by FiditeNemini

New activity in huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2 3 months ago

1M token context version is out

#2 opened 3 months ago by FiditeNemini

New activity in mkurman/Qwen2.5-14B-DeepSeek-R1-1M 3 months ago

Merge strategy

#1 opened 3 months ago by FiditeNemini

New activity in mlx-community/mlx-my-repo 4 months ago

🚩 Report: Not working

#30 opened 5 months ago by southmost

New activity in ZeusLabs/Chronos-Platinum-72B 7 months ago

Really really good!

#3 opened 7 months ago by FiditeNemini

New activity in black-forest-labs/FLUX.1-schnell 8 months ago

Not running on MacOS ComfyUI

#13 opened 9 months ago by FiditeNemini

New activity in Kijai/flux-fp8 8 months ago

Can this model be used on Apple Silicon?

#14 opened 9 months ago by jsmidt

New activity in qresearch/llama-3.1-8B-vision-378 9 months ago

70B?

#3 opened 9 months ago by FiditeNemini

New activity in qnguyen3/nanoLLaVA-1.5 10 months ago

GGUF conversion assistance?

#3 opened 10 months ago by FiditeNemini

This model is amazing!

#1 opened 10 months ago by nicolollo

New activity in cognitivecomputations/dolphin-vision-72b 10 months ago

Just Reviewed It

#3 opened 10 months ago by fahdmirzac

New activity in gokaygokay/Florence-2-SD3-Captioner 10 months ago

Finetuning code.

#1 opened 10 months ago by ljnlonoljpiljm

New activity in cognitivecomputations/dolphin-2.9-llama3-70b 12 months ago

large context Dolphin Llama70b?

#6 opened 12 months ago by Alias1964

New activity in cognitivecomputations/based-13b 12 months ago

Request: Make a Based LLaMa-3-8B and LLaMa-3-70B

#1 opened 12 months ago by Joseph717171

New activity in cmp-nct/llava-1.6-gguf about 1 year ago

Q8?

#1 opened about 1 year ago by FiditeNemini

New activity in WhiteRabbitNeo/WhiteRabbitNeo-13B-v1 over 1 year ago

Data sources?

#3 opened over 1 year ago by FiditeNemini

New activity in dreamgen/opus-v0.5-70b over 1 year ago

Odd behaviour with '.

#1 opened over 1 year ago by FiditeNemini

New activity in migtissera/Tess-M-Creative-v1.0 over 1 year ago

Great model!

#1 opened over 1 year ago by dillfrescott