The CLAP model uses a Swin Transformer to extract audio features from a log-Mel spectrogram input, and a RoBERTa model to extract text features.
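To make the "log-Mel spectrogram input" concrete, here is a minimal sketch of how such a representation can be computed from a raw waveform. It uses only the Python standard library so it runs without dependencies. It is not CLAP's actual feature extractor: the function names, the HTK-style mel formula, and the tiny parameter values (`n_fft=64`, `hop=32`, `n_mels=8`) are all assumptions chosen for illustration. In practice a library feature extractor would do this step with an FFT and far larger settings.

```python
import cmath
import math


def hz_to_mel(hz):
    # HTK-style mel scale; one common convention, assumed here.
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular mel filters as a list of n_mels rows of n_fft // 2 + 1 weights."""
    n_bins = n_fft // 2 + 1
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, equally spaced on the mel scale.
    edges = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)
            falling = (hi - f) / (hi - mid)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def power_spectrum(frame):
    """Naive DFT power spectrum of one frame (O(n^2), for demonstration only)."""
    n = len(frame)
    out = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        out.append(abs(s) ** 2)
    return out


def log_mel_spectrogram(signal, sample_rate, n_fft=64, hop=32, n_mels=8):
    """Return a list of frames, each a list of n_mels log mel-band energies."""
    # Hann window to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / n_fft) for t in range(n_fft)]
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(n_fft)]
        power = power_spectrum(frame)
        mel = [sum(w * p for w, p in zip(row, power)) for row in bank]
        # Small offset avoids log(0) on silent bands.
        frames.append([math.log(e + 1e-10) for e in mel])
    return frames


# Usage: a 100 Hz sine tone sampled at 1 kHz (synthetic test signal).
sample_rate = 1000
tone = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(256)]
spec = log_mel_spectrogram(tone, sample_rate)
print(len(spec), len(spec[0]))  # prints: 7 8
```

The result is a time-by-frequency grid (here 7 frames by 8 mel bands), which is the kind of image-like input that a vision-style encoder such as a Swin Transformer can process in patches.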