Update README.md
README.md
CHANGED
@@ -50,9 +50,18 @@ print(f"Found {len(points)} person(s)")

### Changelog

-**2025-06-21**
-
-
+**2025-06-21** ([full release notes](https://moondream.ai/blog/moondream-2025-06-21-release))
+
+* **Grounded Reasoning**
+Introduces a new step-by-step reasoning mode that explicitly grounds reasoning in spatial positions within the image before answering, leading to more precise visual interpretation (e.g., chart median calculations, accurate counting). Enable with `reasoning=True` in the `query` skill to trade off speed vs. accuracy.
+* **Sharper Object Detection**
+Uses reinforcement learning on higher-quality bounding-box annotations to reduce object clumping and improve fine-grained detections (e.g., distinguishing “blue bottle” vs. “bottle”).
+* **Faster Text Generation**
+Yields 20–40 % faster response generation via a new “superword” tokenizer and lightweight tokenizer transfer hypernetwork, which reduces the number of tokens emitted without loss in accuracy and eases future multilingual extensions.
+* **Improved UI Understanding**
+Boosts ScreenSpot (UI element localization) performance from an F1\@0.5 of 60.3 to 80.4, making Moondream more effective for UI-focused applications.
+* **Reinforcement Learning Enhancements**
+RL fine-tuning applied across 55 vision-language tasks to reinforce grounded reasoning and detection capabilities, with a roadmap to expand to \~120 tasks in the next update.

**2025-04-15** ([full release notes](https://moondream.ai/blog/moondream-2025-04-14-release))

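The Grounded Reasoning entry above describes `reasoning=True` as a speed vs. accuracy trade-off in the `query` skill. A minimal sketch of how a reader might compare the two modes follows. It assumes that `query` takes an image and a question and returns a dict with an `"answer"` key, which this diff does not confirm. `compare_reasoning` and `StubModel` are hypothetical names introduced here so the sketch runs without downloading weights; swap in a real loaded model to measure anything meaningful.

```python
import time


def compare_reasoning(model, image, question):
    """Run one question with reasoning off and on; collect answers and timings.

    Assumes model.query(image, question, reasoning=...) returns a dict
    containing an "answer" key (an assumption, not confirmed by the diff).
    """
    results = {}
    for flag in (False, True):
        start = time.perf_counter()
        out = model.query(image, question, reasoning=flag)
        elapsed = time.perf_counter() - start
        results[flag] = {"answer": out["answer"], "seconds": elapsed}
    return results


class StubModel:
    """Hypothetical stand-in for a loaded model; returns canned answers."""

    def query(self, image, question, reasoning=False):
        return {"answer": "3" if reasoning else "2"}


results = compare_reasoning(StubModel(), image=None, question="How many bottles?")
for flag, info in results.items():
    print(f"reasoning={flag}: answer={info['answer']} ({info['seconds']:.4f}s)")
```

With a real model, running the same question both ways on a few representative images is a cheap way to decide whether the extra latency is worth it for a given workload.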