MHA2MLA-refactor

updated 15 days ago

The MHA2MLA models published with the paper "Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-Based LLMs".

  • fnlp/SmolLM-135M-MLA-d_kv_8-refactor

    Text Generation • 0.1B • Updated 28 days ago • 8

  • fnlp/SmolLM-135M-MLA-d_kv_32-refactor

    Text Generation • 0.1B • Updated Jun 17 • 4

  • fnlp/SmolLM-135M-MLA-d_kv_16-refactor

    Text Generation • 0.1B • Updated Jun 17

  • fnlp/SmolLM-360M-MLA-d_kv_8-refactor

    Text Generation • 0.3B • Updated Jun 17 • 1

  • fnlp/SmolLM-360M-MLA-d_kv_16-refactor

    Text Generation • 0.3B • Updated Jun 17

  • fnlp/SmolLM-360M-MLA-d_kv_32-refactor

    Text Generation • 0.4B • Updated Jun 17 • 1

  • fnlp/smollm1-1B7-d_kv_8-refactor

    Text Generation • 2B • Updated Jun 17

  • fnlp/smollm1-1B7-d_kv_16-refactor

    Text Generation • 2B • Updated Jun 17

  • fnlp/smollm1-1B7-d_kv_32-refactor

    Text Generation • 2B • Updated Jun 17

  • fnlp/llama2-7B-d_kv_16-refactor

    Text Generation • 6B • Updated Jun 17

  • fnlp/llama2-7B-d_kv_32-refactor

    Text Generation • 6B • Updated Jun 17

  • fnlp/llama2-7B-d_kv_64-refactor

    Text Generation • 7B • Updated Jun 17