litert-community
/

DeepSeek-R1-Distill-Qwen-1.5B


niuchl committed on Feb 26

Commit

88d3e1a

·

verified ·

1 Parent(s): 1ab5b90

Update README.md

Files changed (1)

README.md +6 -0


@@ -56,3 +56,9 @@ Note that all benchmark stats are from a Samsung S24 Ultra with 1280 KV cache si
    <td><p style="text-align: right">1,861</p></td>
   </tr>
 </table>

+*   Model Size: measured by the size of the .tflite flatbuffer (serialization format for LiteRT models)
+*   Memory: indicator of peak RAM usage
+*   CPU inference is accelerated via the LiteRT [XNNPACK](https://github.com/google/XNNPACK) delegate with 4 threads
+*   Benchmarks are run with the XNNPACK cache enabled
+*   dynamic_int8: quantized model with int8 weights and float activations.
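
The `dynamic_int8` note above ("int8 weights and float activations") can be illustrated with a minimal sketch. It assumes symmetric per-row weight quantization and dequantization at compute time. This is an illustrative simplification and not LiteRT's actual kernel. The helper names `quantize_row` and `matvec_dynamic_int8` are hypothetical and do not come from any library.

```python
def quantize_row(row):
    """Symmetric int8 quantization of one weight row (assumed scheme).

    Returns the quantized integers and the float scale needed to recover
    approximate weights as q * scale.
    """
    max_abs = max(abs(w) for w in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in row]
    return q, scale


def matvec_dynamic_int8(q_rows, scales, x):
    """Multiply int8 weight rows by a float activation vector.

    Activations stay in float; each row's integer dot product is rescaled
    by that row's scale.
    """
    return [s * sum(qi * xi for qi, xi in zip(q, x))
            for q, s in zip(q_rows, scales)]


# Toy weights and activations, chosen only for illustration.
weights = [[0.5, -1.0, 0.25], [2.0, 0.1, -0.3]]
x = [1.0, 2.0, 3.0]

quantized = [quantize_row(r) for r in weights]
q_rows = [q for q, _ in quantized]
scales = [s for _, s in quantized]

# Float reference result and the quantized approximation.
y_ref = [sum(w * xi for w, xi in zip(r, x)) for r in weights]
y_q = matvec_dynamic_int8(q_rows, scales, x)
```

Storing each weight as one byte instead of four is what reduces the flatbuffer size relative to a float32 model, at the cost of a small approximation error in each output.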