CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published 6 days ago • 86
The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve? Paper • 2502.17535 • Published Feb 24 • 8
Running • 2.51k • The Ultra-Scale Playbook 🌌 The ultimate guide to training LLMs on large GPU clusters
Perovskite-LLM: Knowledge-Enhanced Large Language Models for Perovskite Solar Cell Research Paper • 2502.12669 • Published Feb 18 • 2
Mediator: Memory-efficient LLM Merging with Less Parameter Conflicts and Uncertainty Based Routing Paper • 2502.04411 • Published Feb 6 • 4
Can LLMs Maintain Fundamental Abilities under KV Cache Compression? Paper • 2502.01941 • Published Feb 4 • 15
ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference Paper • 2502.00299 • Published Feb 1 • 2
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22 • 91