Update README.md
README.md
@@ -56,3 +56,9 @@ Note that all benchmark stats are from a Samsung S24 Ultra with 1280 KV cache si
<td><p style="text-align: right">1,861</p></td>
</tr>
</table>
+
+* Model Size: the size of the .tflite FlatBuffer file (the serialization format for LiteRT models).
+* Memory: an indicator of peak RAM usage.
+* CPU inference is accelerated via the LiteRT [XNNPACK](https://github.com/google/XNNPACK) delegate with 4 threads.
+* Benchmarks assume the XNNPACK cache is enabled.
+* dynamic_int8: quantized model with int8 weights and float activations.
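To make the `dynamic_int8` note above concrete, here is a minimal sketch of what "int8 weights and float activations" can mean in practice. It is an illustration only, not LiteRT's actual quantization scheme or kernels: the function names `quantize_weights_int8` and `dot_float_activations` are hypothetical, and symmetric per-tensor scaling is an assumption made for simplicity.

```python
from typing import List, Tuple


def quantize_weights_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of float weights to the int8 range.

    Hypothetical helper for illustration; not a LiteRT API.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dot_float_activations(
    activations: List[float], q_weights: List[int], scale: float
) -> float:
    """Dot product that keeps activations in float and dequantizes weights on the fly."""
    return sum(a * (q * scale) for a, q in zip(activations, q_weights))


weights = [0.5, -1.0, 0.25]
activations = [2.0, 1.0, 4.0]

q_weights, scale = quantize_weights_int8(weights)
approx = dot_float_activations(activations, q_weights, scale)
exact = sum(a * w for a, w in zip(activations, weights))

print(f"int8 weights: {q_weights}, scale: {scale:.6f}")
print(f"float result: {exact:.4f}, quantized result: {approx:.4f}")
```

The weights are stored as small integers plus one float scale, which is where the model-size saving comes from; the result differs slightly from the full-float computation because of rounding.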