vlad-m-dev committed · Commit be9ec6f · verified · 1 Parent(s): 342e4aa

Update README.md

Files changed (1)
  1. README.md +102 -3
README.md CHANGED
@@ -1,3 +1,102 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ datasets:
+ - alfredplpl/Japanese-photos
+ - 3sara/colpali_italian_documents
+ pipeline_tag: image-classification
+ tags:
+ - image-classification
+ - mobile
+ - tablet
+ - quantization
+ - onnx
+ - mobilenetv3
+ - mobilenet_v3
+ - mobilenetv3_onnx
+ - document-classification
+ - photo-classification
+ - real-time
+ - lightweight
+ - efficient
+ - document
+ - photo
+ - images
+ - q8
+ - int8
+ - edge-ai
+ - ai-on-device
+ - offline
+ - privacy
+ - fast
+ - android
+ - ios
+ - gallery
+ ---
+
+ # MobileNetV3: ONNX, Quantized
+
+ ### 🔥 Lightweight mobile model for **image classification** into two categories:
+ - **`document`** (scans, receipts, papers, invoices)
+ - **`photo`** (regular phone photos: scenes, people, nature, etc.)
+
+ ---
+
+ ## 🟢 Overview
+
+ - **Designed for mobile devices** (Android/iOS phones and tablets) and suited to real-time on-device inference.
+ - Architecture: **MobileNetV3-Small**
+ - Format: **ONNX** (both float32 and quantized int8 versions included)
+ - Trained on balanced, real-world open-source datasets for both documents and photos.
+ - Ideal for tasks like:
+   - Document detection in gallery/camera rolls
+   - Screenshot, receipt, photo, and PDF preview classification
+   - Image sorting for privacy-first offline AI assistants
+
+ ---
+
+ ## 🏷️ Model Classes
+ - **0**: `document`
+ - **1**: `photo`
+
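The class indices above can be turned into a label and a confidence value. The following is a minimal sketch, assuming the model emits two unnormalized logits per image (the README does not state this); the `LABELS` dictionary and the `decode` helper are hypothetical names introduced for illustration.

```python
import math

# Index-to-label mapping taken from the class list above.
LABELS = {0: "document", 1: "photo"}


def decode(logits):
    """Return (label, confidence) for one pair of raw class scores.

    Assumption: the scores are unnormalized logits. If the model already
    emits probabilities, skip the softmax and use the scores directly.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    idx = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return LABELS[idx], probs[idx]


label, confidence = decode([2.0, -1.0])
print(label, round(confidence, 3))  # document 0.953
```

With ONNX Runtime, the scores for the first image in a batch would typically be passed as `decode(output[0][0])`.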
+ ---
+
+ ## ⚡️ Versions
+
+ - `mobilenet_v3_small.onnx`: Standard float32 for maximum accuracy (best for ARM/CPU)
+ - `mobilenet_v3_small_quant.onnx`: Quantized int8 for even faster inference and smaller file size (best for low-power or edge devices)
+
+ ---
+
+ ## 🚀 Why this model?
+
+ - **Ultra-small size** (~10-15 MB), real-time inference (<100 ms) on most phones
+ - **Runs 100% offline** (privacy, no cloud required)
+ - **Easy integration** through ONNX Runtime, including React Native (`onnxruntime-react-native`), Android, and iOS.
+
+ ---
+
+ ## 🗃️ Datasets
+
+ - **Photos:** [alfredplpl/Japanese-photos](https://huggingface.co/datasets/alfredplpl/Japanese-photos)
+ - **Documents:** [3sara/colpali_italian_documents](https://huggingface.co/datasets/3sara/colpali_italian_documents)
+
+ ---
+
+ ## 🤖 Author
+ @vlad-m-dev
+ Built for edge-ai/phone/tablet offline image classification: document vs photo
+ Telegram: https://t.me/dwight_schrute_engineer
+
+ ---
+
+ ## 🛠️ Usage Example
+
+ ```python
+ import numpy as np
+ import onnxruntime as ort
+
+ MODEL_PATH = "mobilenet_v3_small.onnx"  # or "mobilenet_v3_small_quant.onnx"
+
+ session = ort.InferenceSession(MODEL_PATH)
+ input_name = session.get_inputs()[0].name  # read the input name from the model instead of hardcoding it
+ img = np.random.randn(1, 3, 224, 224).astype(np.float32)  # Placeholder: replace with your preprocessed image
+ output = session.run(None, {input_name: img})
+ pred_class = int(np.argmax(output[0]))
+ print(pred_class)  # 0 = document, 1 = photo
+ ```
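
The usage example feeds random noise and leaves image preprocessing to the reader. Below is a minimal sketch of that step, assuming the model expects a `1x3x224x224` float32 RGB tensor (the shape used above) normalized with the common ImageNet mean and standard deviation. The normalization values, the nearest-neighbour resize, and the `preprocess` helper are assumptions for illustration, not something this README confirms; match whatever preprocessing was used in training.

```python
import numpy as np

# Assumed ImageNet statistics; the README does not state the training normalization.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(rgb, size=224):
    """Convert an HxWx3 uint8 RGB array into a 1x3xSIZExSIZE float32 tensor."""
    h, w, _ = rgb.shape
    # Nearest-neighbour resize via integer index arrays (keeps the sketch dependency-free).
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = rgb[rows[:, None], cols[None, :]]  # shape: (size, size, 3)
    x = resized.astype(np.float32) / 255.0       # scale pixel values to [0, 1]
    x = (x - MEAN) / STD                         # per-channel normalization
    x = np.transpose(x, (2, 0, 1))               # HWC -> CHW
    return x[np.newaxis, ...].astype(np.float32)  # add the batch dimension


# Example with a synthetic 640x480 image; in practice, decode a real photo to an RGB array first.
dummy = np.zeros((480, 640, 3), dtype=np.uint8)
tensor = preprocess(dummy)
print(tensor.shape, tensor.dtype)
```

The resulting tensor can replace the random `img` passed to `session.run` in the usage example. A real pipeline would usually use a higher-quality resize, such as bilinear interpolation from an image library.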