Real-time video captioning powered by FastVLM
Visualize 3D word embeddings in an interactive space
State-of-the-art real-time object detection model
Separate vocals from the background in an audio track