Fidite Nemini PRO

FiditeNemini

AI & ML interests

Prompt engineering, unalignment, MLX, model merging, diffusion models

Recent Activity

liked a model 15 days ago
HiDream-ai/HiDream-I1-Full
reacted to merterbak's post with 🔥 18 days ago
Meta has unveiled its Llama 4 🦙 family of models, featuring native multimodality and mixture-of-experts architecture. Two model families are available now:

Models 🤗: https://huggingface.co/collections/meta-llama/llama-4-67f0c30d9fe03840bc9d0164
Blog Post: https://ai.meta.com/blog/llama-4-multimodal-intelligence/
HF's Blog Post: https://huggingface.co/blog/llama4-release

- 🧠 Native Multimodality - Process text and images in a unified architecture
- Mixture-of-Experts - First Llama models using MoE for incredible efficiency
- Super Long Context - Up to 10M tokens
- 🌐 Multilingual Power - Trained on 200 languages with 10x more multilingual tokens than Llama 3 (including over 100 languages with over 1 billion tokens each)

🔹 Llama 4 Scout
- 17B active parameters (109B total)
- 16 experts architecture
- 10M context window
- Fits on a single H100 GPU
- Beats Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1

🔹 Llama 4 Maverick
- 17B active parameters (400B total)
- 128 experts architecture
- It can fit perfectly on DGX H100 (8x H100)
- 1M context window
- Outperforms GPT-4o and Gemini 2.0 Flash
- ELO score of 1417 on LMArena currently second best model on arena

🔹 Llama 4 Behemoth (Coming Soon)
- 288B active parameters (2T total)
- 16 experts architecture
- Teacher model for Scout and Maverick
- Outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM benchmarks

Organizations

Fidite Nemini Open Source, MLX Community, Cognitive Computations

FiditeNemini's activity

New activity in TheDrummer/Fallen-Llama-3.3-R1-70B-v1-GGUF about 2 months ago

Wrong gguf's in repo?

3
#1 opened about 2 months ago by
FiditeNemini
New activity in mkurman/Qwen2.5-14B-DeepSeek-R1-1M 3 months ago

Merge strategy

2
#1 opened 3 months ago by
FiditeNemini
New activity in mlx-community/mlx-my-repo 4 months ago
New activity in ZeusLabs/Chronos-Platinum-72B 7 months ago

Really really good!

2
1
#3 opened 7 months ago by
FiditeNemini
New activity in black-forest-labs/FLUX.1-schnell 8 months ago
New activity in Kijai/flux-fp8 8 months ago
New activity in qresearch/llama-3.1-8B-vision-378 9 months ago

70B?

2
#3 opened 9 months ago by
FiditeNemini
New activity in qnguyen3/nanoLLaVA-1.5 10 months ago

This model is amazing!

1
3
#1 opened 10 months ago by
nicolollo
New activity in cognitivecomputations/dolphin-vision-72b 10 months ago

Just Reviewed It

1
1
#3 opened 10 months ago by
fahdmirzac
New activity in gokaygokay/Florence-2-SD3-Captioner 10 months ago

Finetuning code.

3
#1 opened 10 months ago by
ljnlonoljpiljm
New activity in cmp-nct/llava-1.6-gguf about 1 year ago

Q8?

2
#1 opened about 1 year ago by
FiditeNemini
New activity in WhiteRabbitNeo/WhiteRabbitNeo-13B-v1 over 1 year ago

Data sources?

3
#3 opened over 1 year ago by
FiditeNemini
New activity in dreamgen/opus-v0.5-70b over 1 year ago

Odd behaviour with '.

1
2
#1 opened over 1 year ago by
FiditeNemini
New activity in migtissera/Tess-M-Creative-v1.0 over 1 year ago

Great model!

7
#1 opened over 1 year ago by
dillfrescott