BLIP-2 introduced a new vision-language pre-training paradigm in which any combination of a pre-trained vision encoder and an LLM can be used (learn more in the BLIP-2 blog post).