---
title: UI Screen Description Generator With Pix2Struct
emoji: 🐨
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 5.28.0
app_file: app.py
pinned: false
license: mit
short_description: Built a vision-language application
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# UI Screen Describer with Pix2Struct

This demo uses Google's `pix2struct-screen2words-large` model to turn UI screenshots into natural-language descriptions.
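
### Use Cases

- Accessibility: text descriptions of screens for people who rely on screen readers
- UI testing: a quick check that a screen shows what it is expected to show
- Auto documentation: draft captions for screenshots in docs
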

### How it works

Upload a screenshot (e.g., of an app, webpage, or dashboard), and the model generates a text description of it.

Built with Hugging Face Transformers and Gradio.
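
The flow above can be sketched roughly as follows. This is a minimal illustration, not necessarily what `app.py` in this Space does: the helper names (`load_model`, `describe_screenshot`, `build_demo`), the `google/` prefix on the model ID, and the `max_new_tokens=50` limit are assumptions made for the example.

```python
# Assumed full Hub ID for the model named above.
MODEL_ID = "google/pix2struct-screen2words-large"


def load_model(model_id=MODEL_ID):
    """Load the Pix2Struct processor and model (downloads weights on first use)."""
    # Imported lazily so describe_screenshot can be used without transformers installed.
    from transformers import Pix2StructForConditionalGeneration, Pix2StructProcessor

    processor = Pix2StructProcessor.from_pretrained(model_id)
    model = Pix2StructForConditionalGeneration.from_pretrained(model_id)
    return processor, model


def describe_screenshot(image, processor, model, max_new_tokens=50):
    """Return a text description for one screenshot (a PIL image)."""
    if image is None:
        # Gradio passes None when the user submits without uploading anything.
        return "Please upload a screenshot."
    # The processor turns the image into the patch tensors the model expects.
    inputs = processor(images=image, return_tensors="pt")
    generated_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(generated_ids[0], skip_special_tokens=True).strip()


def build_demo():
    """Wire the model into a simple image-in, text-out Gradio interface."""
    import gradio as gr

    processor, model = load_model()
    return gr.Interface(
        fn=lambda image: describe_screenshot(image, processor, model),
        inputs=gr.Image(type="pil", label="Screenshot"),
        outputs=gr.Textbox(label="Description"),
        title="UI Screen Describer with Pix2Struct",
    )
```

To try it locally, call `build_demo().launch()`. Keeping `describe_screenshot` separate from model loading means it can be exercised with stand-in objects, without downloading the weights.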