niuchl committed on
Commit 88d3e1a · verified · 1 Parent(s): 1ab5b90

Update README.md

Files changed (1)
  1. README.md +6 -0
README.md CHANGED
@@ -56,3 +56,9 @@ Note that all benchmark stats are from a Samsung S24 Ultra with 1280 KV cache si
  <td><p style="text-align: right">1,861</p></td>
  </tr>
  </table>
+
+ * Model Size: measured by the size of the .tflite flatbuffer (serialization format for LiteRT models)
+ * Memory: indicator of peak RAM usage
+ * The inference on CPU is accelerated via the LiteRT [XNNPACK](https://github.com/google/XNNPACK) delegate with 4 threads
+ * Benchmark is done assuming XNNPACK cache is enabled
+ * dynamic_int8: quantized model with int8 weights and float activations.
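
The `dynamic_int8` note above describes int8 weights combined with float activations. The following is a minimal pure-Python sketch of that idea, assuming symmetric per-row weight quantization. The function names `quantize_row` and `dynamic_int8_matvec` are hypothetical and chosen for illustration; this is not how LiteRT or XNNPACK implement their kernels, only a way to see why the stored weights shrink while the outputs stay close to the float result.

```python
from typing import List, Tuple


def quantize_row(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric int8 quantization of one weight row (illustrative assumption).

    Returns the quantized values in [-127, 127] and the float scale
    needed to map them back to the original range.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dynamic_int8_matvec(
    q_rows: List[List[int]], scales: List[float], activations: List[float]
) -> List[float]:
    """Multiply int8 weight rows by float activations.

    Each row's accumulated sum is rescaled by that row's scale, so the
    weights never need to be stored in float.
    """
    out = []
    for q, scale in zip(q_rows, scales):
        acc = sum(qi * a for qi, a in zip(q, activations))
        out.append(acc * scale)
    return out


# Small made-up example: a 2x3 weight matrix and a float activation vector.
weights = [[0.5, -1.0, 0.25], [2.0, 0.0, -0.5]]
activations = [1.0, 2.0, 4.0]

quantized = [quantize_row(row) for row in weights]
q_rows = [q for q, _ in quantized]
scales = [s for _, s in quantized]

approx = dynamic_int8_matvec(q_rows, scales, activations)
exact = [sum(w * a for w, a in zip(row, activations)) for row in weights]
```

Comparing `approx` with `exact` shows the rounding error introduced by storing each weight in one byte instead of four; the error depends on the weight distribution, so the values here say nothing about the accuracy of the benchmarked models.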