Gemma 3 QAT Collection Quantization-Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory • 15 items • Updated 3 days ago • 152
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge Paper • 2407.00088 • Published Jun 25, 2024 • 12
1-bit AI Infra: Part 1.1, Fast and Lossless BitNet b1.58 Inference on CPUs Paper • 2410.16144 • Published Oct 21, 2024 • 4
BitNet Collection 🔥BitNet family of large language models (1-bit LLMs). • 6 items • Updated 4 days ago • 28
Cognition Collection Perception and abstraction. Each modality is tokenized and embedded into vectors for the model to comprehend. • 200 items • Updated 7 days ago • 5
Training Software Engineering Agents and Verifiers with SWE-Gym Paper • 2412.21139 • Published Dec 30, 2024 • 23
Mirostat: A Neural Text Decoding Algorithm that Directly Controls Perplexity Paper • 2007.14966 • Published Jul 29, 2020 • 1
MedRAG: Enhancing Retrieval-augmented Generation with Knowledge Graph-Elicited Reasoning for Healthcare Copilot Paper • 2502.04413 • Published Feb 6 • 1
👩‍💻 OlympicCoder Collection Reasoning datasets and models for competitive coding • 4 items • Updated Mar 11 • 16
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM Mar 12 • 392