Vision Encoder Decoder Models
Overview
The [VisionEncoderDecoderModel] can be used to initialize an image-to-text model with any
pretrained Transformer-based vision model as the encoder (e.g.