license: mit
tags:
  - LiteRT
base_model:
  - deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

litert-community/DeepSeek-R1-Distill-Qwen-1.5B

This model was converted to LiteRT (aka TFLite) format from deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B using Google AI Edge Torch.

Run the model in Colab

Open In Colab

Run the model on Android

Please follow the instructions.

Benchmarking results

All benchmark figures were measured on a Samsung S24 Ultra.

Model: DeepSeek-R1-Distill-Qwen-1.5B (Int8 quantized)
Params: 1.78 B

| Backend | Prefill 512 tokens (tk/s) | Decode 128 tokens (tk/s) |
| --- | --- | --- |
| LiteRT (XNNPACK, 4 threads) | 260.95 | 23.126 |
| GGML (CPU, 4 threads) | 64.66 | 23.85 |
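
To put the throughput figures in perspective, the sketch below converts them into approximate wall-clock times for the benchmarked workload (512 prefill tokens, then 128 decode tokens). It assumes throughput stays constant across the run and ignores model load time. The helper name `run_time_seconds` is hypothetical and is not part of LiteRT or GGML.

```python
# Rough latency estimate from the benchmark throughput figures above.
# Assumption: tokens/second is constant for the whole run; load time is ignored.

PREFILL_TOKENS = 512
DECODE_TOKENS = 128

# (prefill tk/s, decode tk/s) as reported for the Samsung S24 Ultra.
BENCHMARKS = {
    "LiteRT (XNNPACK, 4 threads)": (260.95, 23.126),
    "GGML (CPU, 4 threads)": (64.66, 23.85),
}


def run_time_seconds(prefill_tps: float, decode_tps: float) -> tuple[float, float]:
    """Return (prefill seconds, decode seconds) for the benchmarked workload."""
    return PREFILL_TOKENS / prefill_tps, DECODE_TOKENS / decode_tps


if __name__ == "__main__":
    for backend, (prefill_tps, decode_tps) in BENCHMARKS.items():
        prefill_s, decode_s = run_time_seconds(prefill_tps, decode_tps)
        print(f"{backend}: prefill {prefill_s:.2f} s, decode {decode_s:.2f} s, "
              f"total {prefill_s + decode_s:.2f} s")

    # Relative prefill throughput of LiteRT versus GGML.
    litert_prefill, _ = BENCHMARKS["LiteRT (XNNPACK, 4 threads)"]
    ggml_prefill, _ = BENCHMARKS["GGML (CPU, 4 threads)"]
    print(f"Prefill speedup (LiteRT / GGML): {litert_prefill / ggml_prefill:.2f}x")
```

Under these assumptions, the two backends differ mainly in prefill: LiteRT processes the 512-token prompt roughly four times faster, while decode throughput is nearly identical.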