Nomic Embed Multimodal Collection Multimodal models allowing you to search over interleaved text, PDFs, charts, and images! • 15 items • Updated 16 days ago • 20
Embedding Model Datasets Collection A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 70 items • Updated 17 days ago • 123
FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models Paper • 2402.10986 • Published Feb 16, 2024 • 80
LayoutLM Collection The LayoutLM series are Transformer encoders useful for document AI tasks such as invoice parsing, document image classification and DocVQA. • 6 items • Updated 6 days ago • 18
SpeechT5 Collection The SpeechT5 framework consists of a shared seq2seq and six modal-specific (speech/text) pre/post-nets that can address a few audio-related tasks. • 8 items • Updated 6 days ago • 24