We use a standard Vision Transformer architecture with minimal modifications, contrastive image-text pre-training, and end-to-end detection fine-tuning.
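The contrastive image-text pre-training step can be illustrated with a small sketch. This is an assumption-laden illustration rather than the method used here: the symmetric cross-entropy form, the temperature value of 0.07, and the function names (`l2_normalize`, `cross_entropy_diag`, `contrastive_loss`) are all hypothetical choices, and the embeddings are plain Python lists standing in for encoder outputs.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric image-text contrastive loss (illustrative form, assumed).

    image_embs[i] and text_embs[i] are treated as a matching pair;
    every other combination in the batch is treated as a negative.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    # Similarity matrix: rows are images, columns are texts.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    # Transpose to score each text against all images.
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))


# Toy batch: two image embeddings and two text embeddings.
images = [[1.0, 0.0], [0.0, 1.0]]
texts_matched = [[1.0, 0.0], [0.0, 1.0]]
texts_swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_matched = contrastive_loss(images, texts_matched)
loss_swapped = contrastive_loss(images, texts_swapped)
```

In this toy batch the loss is close to zero when each image lines up with its own text and large when the pairs are swapped, which is the signal that pulls matching image and text embeddings together during pre-training.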